We are Ojin, The Human AI Company, a small Berlin team. For a long time the same thing held AI agents back: they had a tell. The face lagged, the voice went flat, and some part of your brain knew it was not real.
Our flagship, Human Agents, is an instant end-to-end app for natural, human conversation with an AI that has a real face and a real voice, in real time. Under it sit two developer-tier face models you can use through one API: Oris Portrait (fastest and most scalable) and Oris Presence (the most sophisticated, the first to break the uncanny valley). Latency is sub-200ms, and it works with Pipecat, LiveKit, and any other stack.
You can see it at ojin.ai. I would genuinely love feedback from this community on where it still feels off: the latency, the expressions, the turn-taking, the moments the illusion cracks. That is what we are working on next, and outside eyes catch what we have gone blind to.
Building on the presence-vs-work split above, there's a buyer-side version of that same fork. Right now this reads as dev-tier self-serve (API key, Pipecat, LiveKit, "try it and tell us where it breaks"). But the use cases that need this most (support, scheduling, ops replacing a human agent) get bought by a 200-person team through security review and an SLA conversation, not a signup form. Those two motions want almost opposite front doors: one wants zero friction, the other wants a sales call and a trust page. Worth deciding which buyer you're building the next six months of product for, because "developer API" and "enterprise support infra" pull the roadmap in different directions fast.
The uncanny-valley framing might be aiming at the wrong axis for part of your market. "Remove the tell so users can't clock it as AI" is exactly right if the job is presence — companionship, entertainment, a face people want to feel something toward. There, undetectable is the product. But for AI agents doing actual work (support, scheduling, ops), my hunch is a lot of users don't actually want the mask removed — they'd rather know it's an AI and trust that it's doing the task, and I'd bet "this is an AI, and here's what it just did" often reads as more trustworthy than a flawless human face. Support is the messy middle where you probably want both at once — rapport and legibility. So the failure that would matter to me isn't a gaze or timing artifact — it's a use-case one: where is "passes as human" the actual goal, and where does it quietly cost you trust? Curious which end you're aiming at first, because the feedback you'd want back is completely different for each.
This is one of those products where the real competition isn’t other apps—it’s human perception thresholds. Once you hit low latency, the remaining failures stop being “AI issues” and become subtle timing, gaze, and turn-taking artifacts that users interpret emotionally rather than technically. That makes feedback loops especially valuable at this stage.