As founders, we can be bad at testing our own products. We already know the flows. We know where to click. We know what each screen means.
New users don’t.
And that’s where most products break. Not bugs — UX.
AI can help you see your product like a stranger again.
Here’s how.
Let’s start with the simplest setup, then work up to more advanced ones.
Open your app in a private window. Pretend you're a brand-new user. Do one thing a first-time user should do, like sign up and send an invoice.
Record your screen while you do it (both macOS and Windows have this built in).
Then drop that video into any AI chat tool you use. Ask it to act like a confused new user and tell you where it would hesitate, what's unclear, and what it would expect each screen to do.
That’s it.
You’ll spot small UX issues that cost you users, and you’ll know exactly what to fix.
Now let’s make it faster and more repeatable.
This version uses test scripts to automate the clicks and lets AI do the review.
You’ll need a browser automation tool like Playwright (Cypress and similar tools work too).
(If you already have automated tests, you’re halfway there.)
The idea: run your existing test flow, capture what each screen shows along the way, and hand that capture to AI.
What to change in your test
After each step, save the text from the screen.
In Playwright, that might look like this:

const fs = require('fs');

// after each step, save the page's visible text to a running log
const body = await page.textContent('body');
fs.appendFileSync('flow-log.txt', `\nStep 3:\n${body}`);
Do that for every screen.
You’ll end up with a plain text log like:
Step 1 - Homepage
Step 2 - Signup form
Step 3 - Dashboard
Step 4 - Invoice form
Step 5 - Confirmation
Ask AI for feedback
Paste that log into ChatGPT, Claude — or any other tool of your choice — with this prompt:
Act like a new user.
Here's what the app shows, step by step.
Tell me where the flow is unclear or confusing.
Say what you would do at each step.
Now the AI reviews the flow — just like Level 1 — but with no manual clicking.
Run it wherever your tests already run, and you’ll get fast, repeatable feedback every time the test runs.
This is where the AI does everything: it drives the app itself, step by step, and reports back what confused it.
It acts like a QA engineer — but faster and tireless.
You’ll find these agents in dedicated QA platforms and in custom setups built on browser automation.
This needs more setup than Levels 1 and 2. But once it’s live, it’s powerful.
It’s worth doing when your core flows are stable and you’re testing them often.
How to set it up
You have two options:
Option 1: Use a QA platform with agents
These tools give you built-in agents.
What to do: point the agent at your app, describe the flow to test, and review what it reports.
No code needed for most of these.
Option 2: Build it yourself (if you have dev resources)
It takes time to set up, but you get full control.
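If you go the custom route, the heart of it is a loop: show the model the page text, let it choose an action, perform it, repeat. A rough sketch under heavy assumptions (Playwright as the browser driver, any chat model for decisions, and a CLICK/TYPE/DONE reply format that you would define yourself):

```javascript
// Turn the model's reply into a structured action. The reply format
// (CLICK / TYPE / DONE) is an assumption; you'd define your own protocol.
function parseAction(reply) {
  const click = reply.match(/^CLICK "(.+)"$/m);
  if (click) return { type: 'click', target: click[1] };
  const typed = reply.match(/^TYPE "(.+?)" "(.+)"$/m);
  if (typed) return { type: 'type', target: typed[1], value: typed[2] };
  if (/^DONE$/m.test(reply)) return { type: 'done' };
  return { type: 'unknown', raw: reply };
}

// The agent loop itself (not run here; page and askModel are placeholders):
//   while (true) {
//     const body = await page.textContent('body');
//     const reply = await askModel(
//       `You are a new user. Page text:\n${body}\n` +
//       'Answer with CLICK "<button text>", TYPE "<field>" "<value>", or DONE.');
//     const action = parseAction(reply);
//     if (action.type === 'done') break;
//     if (action.type === 'click') await page.getByText(action.target).click();
//     if (action.type === 'type') await page.getByLabel(action.target).fill(action.value);
//   }
```

Keeping the model's replies in a strict format like this is what makes the loop debuggable: every decision is a plain string you can log and replay.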
That's it.
Quick note: Don’t jump straight to agents. Only invest in them once your core flows are stable and Levels 1 and 2 are already paying off.
If not, you may end up automating chaos.
Very practical to turn “UX testing” into something that isn’t a research project.
I did something similar recently: I recorded my first-time flow, fed it to an AI, and asked it to act like a confused first-time user. The feedback was very direct:
I’m not sure what this button does.
I’m uncertain whether this worked.
I'm scared to click this.
It said nothing about colors or fonts. It was all about transparency and trust.
Love seeing content like this here.
Next step for me: scripting the clicks to automate it.
It mirrors how products evolve: at first you just need a fresh perspective, then you want repeatability, and sooner or later you need guardrails.
Using this before shipping changes is an underrated point. We often only notice UX problems when users churn or complain. Running this kind of review on onboarding, pricing, and first-action flows would likely catch 80% of the expensive friction.
“What would make you leave?” is a fantastic prompt. It forces you to look at the product as a stranger again.
Have you found that certain flows consistently have more problems (signup vs. onboarding vs. first task)?
Level 1 approach is a game-changer for solo developers. Just recording yourself as a fresh user and asking ChatGPT 'what would confuse you here?' catches so many small friction points that you've gone blind to.
One thing I'd add: the mistake simulation point from the comments is crucial. Most people test the happy path. But the real UX breaks when users do things slightly wrong - fill fields in unexpected order, click before pages fully load, or misunderstand what a button does. When you ask AI to narrate its confusion ("Should I click X or Y?"), that's where you find the real issues.
The Playwright approach at Level 2 is solid too for repeatable testing across feature changes. The plain text log idea is simple but genius - no need for complex reporting tools, just AI reading what users see.
One question though: How well does this catch issues that appear only under specific conditions (slow network, on mobile, etc.)? Seems like something to layer in once the basic flow is smooth.
Several people in this thread hit on the same thing: AI feedback simulates confusion. It doesn't actually experience it. A real person who has never seen your app will do things no model would predict. Click the wrong button. Misread your pricing. Get stuck somewhere you never thought to test.
We built Test by Human for this exact gap. You submit your URL, tell us what flow to test, and a real person goes through your site for the first time while screen-recording with voice narration. You get back a video showing where they hesitated, what confused them, what they skipped entirely.
I'd call it Level 0 in this framework. Before you automate anything, just watch one stranger use your product. That five-minute video will tell you more than a week of staring at analytics.
First test is free if anyone wants to try it. Find us @testByHuman on X.
This is a great breakdown, especially the reminder that founders rarely experience their own product like real users do.
I’ve seen this play out when analyzing funnels: conversion drops often aren’t caused by technical bugs, but by small clarity gaps such as unclear wording, hidden expectations, or cognitive overload during the first interaction.
The Level 1 approach is surprisingly powerful because it forces perspective-shifting before introducing automation complexity. I like the progression you outlined; it prevents teams from over-engineering validation too early.
Curious about your experience here:
Have you noticed AI feedback aligning closely with actual user behavior metrics (drop-offs, time-to-completion, etc.), or do you treat it more as directional insight rather than validation?
Thanks for sharing this, very practical framework.
The Level 1 approach is underrated. Recording yourself and asking AI to narrate confusion catches so many blind spots. But there's a gap: AI simulates how users might behave, not how they actually behave.
We built SaasFeedback (https://saasfeedback.ai/) specifically for this — real user feedback loops that capture actual friction points during onboarding. Combining it with AI simulation gives you both perspectives: what users could struggle with and what they actually do.
Curious if you've tested combining AI simulation with real user session data to validate which issues actually cause churn?
this is super practical honestly. the level 1 approach alone would catch so many issues founders just gloss over because they're too close to the product. we built something similar internally where we record first-time user sessions and it's wild how different the experience looks when you're not the one who designed it.
do you find the ai feedback is more useful on visual stuff like layout/button placement or more on copy/messaging clarity?
Level 2 resonates with me. I've been using Playwright for a tech news aggregator I'm building, and capturing the text content at each step has been eye-opening.
One thing I learned: asking AI to evaluate "what's the first thing a confused user would try to click" often reveals navigation gaps that functional tests miss entirely.
The key insight here is that AI isn't replacing user testing — it's catching the obvious stuff faster so real user conversations can focus on deeper problems.
This is great advice. Everyone talks about UX, but products usually ship ignoring exactly that, right out of the gate. I feel things will get better in general thanks to AI models offering best practices, assuming one asks... but a founder is usually so busy with so much stuff that they might not even ask.
Also an interesting idea to ask an AI for its opinion, though I'm not sure how closely that would match a regular user. And this is (I think) limited to the web, at least for now; not native apps, unless one uses screenshots (a video would likely already taint the outcome). I'll definitely run a few experiments.
That's great. I have a question: what AI tools do you recommend for UX testing?
If the app has lots of pages, how do you test them all? One by one, manually?
I imagine new issues appear as the pages grow. Thanks!
This is a really practical way to “see” your product with fresh eyes. I tried something similar by recording my own flow once, and it was shocking how many small confusions I had missed. Love how you break it into levels — super doable even for small teams.
I kinda agree . . . but also think it makes it all too mechanical versus intuitive . . .
Great breakdown. One thing I've learned from onboarding tests: the confusion usually isn't where you think it is.
Most founders test the "happy path" - when everything works. But new users get confused when something doesn't work, and they can't tell if it's their fault or a bug.
Level 1 tip: When you record yourself, try to intentionally make a small mistake (like filling a field wrong) and see what happens. That's where most users bounce.
The AI feedback works best when you ask it to narrate its internal confusion - not just "this is unclear" but "I'm not sure if I should click here or scroll down first."
putting to use immediately! thanks for the tips!