Hey guys,
We all know the drill: "Talk to your users." But when you're bootstrapping, blocking out 15 hours a week for Zoom calls across 5 time zones is brutal. On the flip side, sending out a Google Form feels like a massive cop-out. You get the "what" but completely miss the "why."
So I spent the last few weeks building an AI that conducts qualitative user interviews via chat. Not to pitch anything here, just wanted to share some of the technical rabbit holes I fell into. Because as it turns out, getting an LLM to actually act like a competent human researcher is much harder than just throwing an API key at it.
Real insights come from probing, and out of the box an LLM doesn't probe: it tends to accept whatever answer it gets and move on. To fix this, I had to stop treating the LLM like a chatbot and wrap it in a state machine. Now, the system evaluates every user response against the core research goal before deciding its next move. It scores whether the user gave a surface-level answer or a deep one, and forces a follow-up (e.g., "What specifically about the UI felt off?") until the answer hits a depth threshold.
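To make that concrete, here's a stripped-down sketch of the loop in Python. It's an illustration, not the production code: `Interview`, `score_depth`, the 0.6 threshold, and the `max_probes` cap (so a terse user never gets stuck in a follow-up loop) are all made-up stand-ins, and the word-count scorer is just a placeholder for what would really be an LLM call.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, List, Optional


class State(Enum):
    ASK = auto()    # ask the next scripted question
    PROBE = auto()  # follow up on a shallow answer
    DONE = auto()   # no questions left


def word_count_scorer(goal: str, answer: str) -> float:
    """Placeholder depth scorer: longer answers score higher (0.0 to 1.0).
    A real scorer would ask an LLM to rate the answer against the goal."""
    return min(len(answer.split()) / 20, 1.0)


@dataclass
class Interview:
    questions: List[str]
    goal: str
    score_depth: Callable[[str, str], float]
    depth_threshold: float = 0.6  # hypothetical cutoff for "deep enough"
    max_probes: int = 2           # assumption: cap follow-ups per question
    index: int = 0
    probes: int = 0
    state: State = State.ASK

    def next_prompt(self) -> Optional[str]:
        """Return what the interviewer should say next, or None when finished."""
        if self.state is State.DONE:
            return None
        if self.state is State.PROBE:
            return "Can you say more? What specifically happened?"
        return self.questions[self.index]

    def handle_answer(self, answer: str) -> None:
        """Score the answer, then either probe again or advance."""
        if self.state is State.DONE:
            return
        score = self.score_depth(self.goal, answer)
        if score < self.depth_threshold and self.probes < self.max_probes:
            self.probes += 1
            self.state = State.PROBE
            return
        # Deep enough (or out of probes): move to the next question.
        self.index += 1
        self.probes = 0
        self.state = State.ASK if self.index < len(self.questions) else State.DONE
```

The point of the structure is that the model never decides on its own when to move on. The state machine does, based on the score.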
Users also wander off-topic, and building dynamic guardrails for that was a headache. I had to tweak the prompt architecture heavily so the AI knows how to be empathetic ("Oh, sorry to hear your dog is sick") while immediately pivoting back to the actual goal ("...but regarding how you handled that data export last Tuesday...").
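As a rough idea of what "acknowledge, then redirect" looks like at the prompt level, here's a minimal sketch. The function name and the exact rule wording are hypothetical. The real prompts went through a lot more iteration than this.

```python
def build_system_prompt(research_goal: str, current_question: str) -> str:
    """Assemble a system prompt with an acknowledge-then-redirect rule.
    The wording below is illustrative, not a tested prompt."""
    rules = [
        "You are conducting a qualitative user interview over chat.",
        f"Research goal: {research_goal}",
        f"Current question: {current_question}",
        "If the user goes off-topic, acknowledge what they said in one short, "
        "empathetic sentence, then steer back to the current question "
        "in the same reply.",
        "Do not start new topics that are unrelated to the research goal.",
    ]
    return "\n".join(rules)
```

Rebuilding the prompt every turn with the current question baked in means the redirect target is always explicit, so the model doesn't have to infer it from the transcript.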
For longer interviews, my workaround was implementing a rolling summary pipeline. It compresses older parts of the conversation into a dense "what we've learned so far" block, keeping only the last few exchanges as raw text. This drastically reduced hallucinations and kept the AI hyper-focused.
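Here's the rolling-summary idea boiled down to a sketch. `RollingContext`, `keep_last`, and `naive_summarize` are illustrative names, and the summarizer is a dumb placeholder that just truncates and joins text. In practice that step would be its own LLM call.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, text)


def naive_summarize(previous: str, turns: List[Turn]) -> str:
    """Placeholder summarizer: appends truncated turns to the old summary.
    A real pipeline would ask an LLM to compress these into key findings."""
    parts = [previous] if previous else []
    parts += [f"{speaker} said: {text[:40]}" for speaker, text in turns]
    return " | ".join(parts)


class RollingContext:
    def __init__(self, summarize: Callable[[str, List[Turn]], str],
                 keep_last: int = 4):
        self.summarize = summarize
        self.keep_last = keep_last  # how many raw turns to keep verbatim
        self.summary = ""
        self.recent: List[Turn] = []

    def add(self, speaker: str, text: str) -> None:
        """Append a turn; fold anything past the window into the summary."""
        self.recent.append((speaker, text))
        overflow = len(self.recent) - self.keep_last
        if overflow > 0:
            old = self.recent[:overflow]
            self.recent = self.recent[overflow:]
            self.summary = self.summarize(self.summary, old)

    def render(self) -> str:
        """Build the context block to send to the model on the next turn."""
        lines = []
        if self.summary:
            lines.append("What we've learned so far: " + self.summary)
        lines += [f"{speaker}: {text}" for speaker, text in self.recent]
        return "\n".join(lines)
```

The prompt size stays roughly constant no matter how long the interview runs: one summary block plus `keep_last` raw turns.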
It’s been a crazy challenging build. The baseline tech is finally good enough to handle the nuance of qualitative research, but I realized the UX layer and the architecture around the LLM are where 90% of the actual work lives.
Curious if anyone else here has tried automating their qualitative research? Did you hit the same walls?