Parloa builds service agents customers want to talk to
Parloa uses OpenAI models to simulate, evaluate, and run voice-driven customer service systems for the enterprise.

In Parloa's early days, co-founder Stefan Ostwald spent a day inside an insurance call center, where his team had been building early voice experiences. Sitting alongside agents, he listened to the same conversations play out again and again: password resets, policy questions, routine changes. He realized much of that work could be automated.

After that experience, Berlin-based Parloa began building rule-based voice agents to automate high-volume customer interactions. With the emergence of ChatGPT, the company evolved to build what is now its AI Agent Management Platform (AMP), built on a new generation of models including GPT‑5.4.

AMP gives enterprises a way to design, deploy, and manage customer service interactions at scale. Instead of mapping out rigid intents and flows, teams define behavior in natural language, connect to internal systems, and iterate quickly using built-in simulations and evaluations. Parloa runs these interactions end to end, handling everything from simple routing to complex, multi-step requests.

The focus is on consistency in production, where performance, latency, and edge cases all matter. To get there, Parloa continuously tests models against real customer scenarios before deploying them. "The models only matter if they work in production. We work closely with OpenAI on how to make the models fast and reliable enough for real-time conversations."

AMP is designed so business users and subject matter experts can build AI agents without writing code. "With AMP, we can have subject matter experts from different business units actually build the agents and connect the APIs in a much leaner and simpler way," says O'Reilly. At a high level, AMP allows brands to manage the entire AI agent lifecycle.
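A natural-language agent definition of the kind described above can be pictured as a small schema that is flattened into a prompt at runtime. This is a minimal sketch under stated assumptions: the field names (`role`, `instructions`, `tools`, `boundaries`) and the `AgentDefinition` class are illustrative, not Parloa's actual AMP configuration format.

```python
from dataclasses import dataclass, field

# Hypothetical schema for a no-code agent definition; field names are
# illustrative assumptions, not Parloa's actual AMP format.
@dataclass
class AgentDefinition:
    role: str                    # who the agent is, in plain language
    instructions: str            # desired behavior, in natural language
    tools: list[str] = field(default_factory=list)  # backend actions it may call
    boundaries: str = ""         # what the agent must never do

    def to_system_prompt(self) -> str:
        """Flatten the definition into a system prompt for the model."""
        lines = [f"You are {self.role}.", self.instructions]
        if self.tools:
            lines.append("Available tools: " + ", ".join(self.tools))
        if self.boundaries:
            lines.append("Boundaries: " + self.boundaries)
        return "\n".join(lines)

agent = AgentDefinition(
    role="a claims assistant for an insurance hotline",
    instructions="Verify the caller's policy number before discussing any claim.",
    tools=["lookup_policy", "open_claim"],
    boundaries="Never quote premiums; hand off to a human for pricing.",
)
print(agent.to_system_prompt())
```

The point of the sketch is that the subject matter expert only ever edits the plain-language fields; turning them into a model prompt is the platform's job.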
It does that by giving non-technical teams a simpler way to define how an agent should behave before it ever goes live. Instead of writing code or mapping rigid intent trees, subject matter experts set the agent's role, instructions, tools, and boundaries in natural language. That configuration becomes the basis for how the model is prompted and how the system behaves in production.

Once defined, the agent is tested before deployment. Parloa simulates customer conversations using models like GPT‑5.4, with one model acting as the caller and another running the configured agent. Teams can inspect these interactions directly, test changes against realistic scenarios, and iterate before going live.

The same models are then used to evaluate those conversations using a mix of deterministic checks and LLM-as-a-judge scoring. This shows whether the agent followed instructions, used tools correctly, and completed the task as expected.

During a live conversation, AMP's orchestration layer prompts an OpenAI model with the agent configuration and conversation context to generate a response, retrieve information through RAG, or trigger tools to interact with customer backends. Parloa continuously updates this layer with the latest generation of models as they demonstrate clear gains in real…
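The simulation step described above, with one model playing the caller and another running the configured agent, can be sketched as a simple turn-taking loop. This is a hedged sketch, not Parloa's implementation: the `complete()` function below is a stub standing in for a real chat-model call (e.g. via the OpenAI API), and the canned persona lines exist only so the example runs on its own.

```python
def complete(system_prompt: str, transcript: list[str]) -> str:
    """Stub for a chat-model call. A real system would send the system
    prompt plus the transcript so far to an OpenAI model and return its
    reply; here we replay canned lines keyed to the current turn."""
    turn = len(transcript)
    if "caller" in system_prompt:
        return ["Hi, I forgot my password.",
                "Yes, my policy number is 12345.",
                "Thanks, that worked!"][turn // 2]
    return ["Happy to help. Can you confirm your policy number?",
            "Reset link sent. Anything else?",
            "You're welcome. Have a great day!"][turn // 2]

def simulate(caller_prompt: str, agent_prompt: str, max_turns: int = 6) -> list[tuple[str, str]]:
    """Alternate between a simulated caller model and the configured agent,
    recording each (speaker, utterance) pair for later inspection."""
    transcript: list[str] = []
    dialogue: list[tuple[str, str]] = []
    for turn in range(max_turns):
        prompt = caller_prompt if turn % 2 == 0 else agent_prompt
        speaker = "caller" if turn % 2 == 0 else "agent"
        utterance = complete(prompt, transcript)
        dialogue.append((speaker, utterance))
        transcript.append(utterance)
    return dialogue

log = simulate("You are the caller: a customer who forgot their password.",
               "You are the agent: a support assistant for an insurer.")
```

The resulting `log` is exactly the kind of artifact a team can inspect directly before going live: every turn is attributed, replayable, and diffable across agent configurations.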

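The evaluation pass, mixing deterministic checks with LLM-as-a-judge scoring, can be sketched as a small report builder. Again a sketch under assumptions: `judge()` is a stub for an LLM-as-a-judge call (a real system would prompt a model with the transcript and a rubric and parse a score from its reply), and the check names and pass threshold are illustrative.

```python
def judge(transcript: str, rubric: str) -> float:
    """Stub for an LLM-as-a-judge call; a crude keyword heuristic stands
    in for model judgment so the example is self-contained."""
    return 1.0 if "policy number" in transcript.lower() else 0.0

def evaluate(transcript: str,
             expected_tool_calls: list[str],
             actual_tool_calls: list[str]) -> dict:
    """Combine deterministic checks with a judged score, as described above."""
    results = {
        # Deterministic: did the agent invoke exactly the expected tools?
        "tools_correct": actual_tool_calls == expected_tool_calls,
        # Deterministic: did the conversation reach a closing turn?
        "task_completed": "anything else" in transcript.lower(),
        # Judged: did the agent follow its instructions?
        "instruction_score": judge(
            transcript, "Did the agent verify the policy number first?"),
    }
    results["passed"] = (results["tools_correct"]
                         and results["task_completed"]
                         and results["instruction_score"] >= 0.5)
    return results

report = evaluate(
    "Agent: Can you confirm your policy number? Caller: 12345. "
    "Agent: Reset link sent. Anything else?",
    expected_tool_calls=["lookup_policy"],
    actual_tool_calls=["lookup_policy"],
)
```

Splitting the report this way matters: tool usage and task completion can be verified exactly, while instruction-following is the part that genuinely needs a judging model.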
