
The workflow test for finding strong AI ideas

Plenty of big companies are moving fast with AI.

But they often don’t ship the strongest versions of certain AI products — not because they’re slow, but because they’re constrained.

They optimize for:

  • "One product for millions of users"

  • Protecting their brand

  • Legal/regulatory exposure

  • Enterprise procurement and security reviews

  • Internal alignment and incentives

This creates gaps.

You can build something small and specific that works well for a single task.

Here’s how to find the gaps.

Step 1 – Pick who you’re building for

Start with a real job title.

Pick:

  • One role (real job title)
  • One “home base” where they already work (a tool or system: Gmail, HubSpot, Salesforce, Zendesk, Notion, Sheets, etc.)
  • One broad area of work (prospecting, reporting, screening, support replies, onboarding, etc.)

For example: an SDR inside HubSpot, working on outbound prospecting.

This gives you a clear place to start looking for gaps.

Step 2 – Make sure AI is actually useful here

AI works best when the work is:

  • Repeated often
  • Text-heavy
  • Rule-based (checklists, rubrics, “if X then Y”)
  • Context-heavy (docs, history, fields)
  • Handed off to someone else (manager/client/other team)

If none of these are true, the “gap” usually won’t be strong enough.

Step 3 – Find the gap

Pick one tool your user already uses.

A general AI can do a lot as a one-off if you paste everything in.

The gap is when it needs to work every week, for a team, inside the real tool, with little cleanup.

Now do this:

A) Write down the real deliverable

Ask: What do they hand to someone else when the work is done?

If you can’t name the deliverable, you don’t have a product idea yet.

B) Check three places in the tool

  • Templates
  • Settings
  • Export / integrations

Ask: “Can this tool produce the deliverable, inside the tool, with almost no cleanup, every time?”

If not, you’ve found a gap. Now name why.

C) Name the reason

Most general tools fail because of one of the following:

  • Rule break: it doesn’t apply the rules the same way each time
  • Company context break: it doesn’t use your docs, terms, policies, or fields correctly
  • Format break: it doesn’t produce the right format
  • Workflow break: it can’t do all the steps across tools
  • Proof or audit break: it doesn’t show why it chose the answer

Pick the biggest one.

D) Write 3–5 lines like this:

“Tool does X, but fails at Y. I’ll build Y for [role] in [tool].”

For the SDR example: “HubSpot can draft outbound emails, but fails at following your messaging rules consistently. I’ll build rule-checked outbound drafts for SDRs in HubSpot.”

E) Score each idea. Add one point for each of these that’s true:

  • It happens every week
  • Inputs are easy to get
  • The output is used for real work
  • It needs little fixing
  • Value is easy to prove

Pick the one with 4–5 points.
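
If it helps to make the rubric concrete, here’s a tiny scoring sketch in Python (the two ideas and their yes/no answers are made up for illustration):

```python
# One point per rubric criterion that's true; build the 4-5 pointers.
criteria = [
    "happens every week",
    "inputs are easy to get",
    "output is used for real work",
    "needs little fixing",
    "value is easy to prove",
]

# Hypothetical answers, one bool per criterion above.
ideas = {
    "Rule-checked outbound drafts for SDRs in HubSpot": [True, True, True, True, False],
    "One-off blog topic generator": [True, False, False, False, False],
}

for name, answers in ideas.items():
    assert len(answers) == len(criteria)
    score = sum(answers)  # True counts as 1 point
    verdict = "worth pursuing" if score >= 4 else "skip"
    print(f"{score}/5  {name} -> {verdict}")
```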

Step 4 – Make sure someone will actually use it

Do this (in order):

  • Try it on your own work (if you’ve done the job)
  • Find proof online (people complaining / doing it manually)
  • Do some lightweight outreach: DM 5 people with one question
  • If someone is interested, offer a small paid pilot: build the smallest version using their real inputs

Stop when 2–3 people say, “Yes, I’ll try it,” and share real inputs. If one of them agrees to a small paid pilot, even better!

Posted on March 11, 2026

3 Comments

  1. The deliverable test is underrated. We killed half our roadmap once we asked what gets handed off at the end. Biggest gap we found was workflow breaks — AI does each step fine alone but can't chain them inside tools without manual glue.

  2. The failure mode you listed first, "rule break: it doesn't follow them the same way each time", is almost always a prompt structure problem. When constraints are buried inside the objective and context as one block of prose, the model treats them as soft preferences and weighs them differently run to run.

    Separating constraints into a dedicated typed block changes that. The model parses them independently. Rules stop drifting across sessions.
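
    Rough shape of what I mean, hand-rolled here rather than flompt's actual output (all content below is made up):

```python
# Sketch: constraints get their own tagged block instead of being
# buried in one prose paragraph with the objective and context.
objective = "Draft an outbound email for the lead below."
context = "Lead: VP of Sales at a 200-person SaaS company."
constraints = [
    "Max 120 words.",
    "No pricing claims.",
    "End with exactly one question as the CTA.",
]

constraint_block = "\n".join(f"- {c}" for c in constraints)

prompt = f"""<objective>
{objective}
</objective>

<context>
{context}
</context>

<constraints>
{constraint_block}
</constraints>"""

print(prompt)
```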

    I've been building flompt for exactly this, a visual prompt builder that decomposes prompts into 12 semantic blocks and compiles to Claude-optimized XML. Open-source: github.com/Nyrok/flompt

  3. starting small is the way, 100% agree
