A few people have asked how the AI test generation in ObserveOne actually works, so here's the honest breakdown:
It doesn't just hallucinate tests and hope for the best; there's an actual pipeline behind it.
The output is plain Playwright code: you can read it, edit it, export it, and run it locally. No vendor lock-in at all.
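To give a sense of what "normal Playwright code" means here, this is a sketch of the kind of spec file such a tool might emit. The URL, selectors, and test name are made up for illustration; they're not actual ObserveOne output.

```typescript
import { test, expect } from '@playwright/test';

// Illustrative only -- a hand-written sketch of what an exported test
// could look like. Selectors and the URL are hypothetical.
test('user can submit the contact form', async ({ page }) => {
  await page.goto('https://example.com/contact');

  // Role-based locators keep the test resilient to markup changes.
  await page.getByLabel('Email').fill('user@example.com');
  await page.getByLabel('Message').fill('Hello!');
  await page.getByRole('button', { name: 'Send' }).click();

  // Assert on user-visible outcome, not implementation details.
  await expect(page.getByText('Thanks for reaching out')).toBeVisible();
});
```

Since it's a standard `@playwright/test` spec, you'd run it with `npx playwright test` like any test you wrote yourself.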
Honestly, building this was the hardest thing I've worked on. The crawling alone took 3 months to get right. But the end result is tests that actually work in production, not just in a demo environment.
Happy to answer any technical questions. What would make you trust — or not trust — AI-generated test code?