
The hidden bugs killing your AI product (and how to catch them)

Your AI product was working fine yesterday. Today, users are complaining it’s spitting out garbage.

The model didn’t suddenly get dumber. Something in your data, pipeline, or setup broke — quietly.

Here’s how to figure out what went wrong and fix it fast.

Step 1. Reproduce the problem first

Before changing anything, confirm exactly what’s broken.

What to do:

  • Get 3–5 cases where users reported wrong answers. Note the exact input and the expected output.
  • Run those inputs through your app manually and see what results you get.
  • Compare your app’s outputs with what you expected.
  • Keep a short “Broken Cases” list in Google Sheets or Docs. You’ll reuse these examples later to test your fixes.
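If you’d rather script this check than click through it each time, here’s a minimal sketch in Python; call_app() is a hypothetical stand-in for however your app actually produces an answer:

```python
# Minimal regression harness for your "Broken Cases" list.
# call_app() is a hypothetical placeholder for your app's real entry point.

broken_cases = [
    {"input": "What's your refund policy?", "expected": "30-day refund"},
    {"input": "Do you ship to Canada?", "expected": "Yes"},
]

def call_app(user_input: str) -> str:
    # Replace this stub with your app's real call
    # (an HTTP request, an imported function, etc.).
    return "stub answer"

for case in broken_cases:
    actual = call_app(case["input"])
    status = "OK" if case["expected"].lower() in actual.lower() else "BROKEN"
    print(f"[{status}] {case['input']}")
    print(f"  expected: {case['expected']}")
    print(f"  actual:   {actual}")
```

Rerun this after every fix; when all cases print OK, you know the fix actually held.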

Step 2. Check your inputs before blaming the model

Most AI “bugs” come from bad inputs, not the model itself.

What to check:

  • Are any fields missing? (e.g., empty user IDs, blank text fields)
  • Are timestamps wrong or in the wrong format?
  • Are you feeding the model the right language or labels?
  • If you’re using embeddings:
    • Are you actually fetching the right chunks from your vector database?
    • Is your similarity search threshold set correctly?
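To automate the first two checks, here’s a minimal validation sketch; the REQUIRED_FIELDS list and the assumption that inputs arrive as dicts are placeholders to adapt to your own schema:

```python
# Quick sanity check for incoming inputs before they reach the model.
from datetime import datetime

REQUIRED_FIELDS = ["user_id", "text", "timestamp"]  # adjust to your schema

def validate_input(record: dict) -> list[str]:
    """Return a list of problems found in one input record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing or empty field: {field}")
    # Flag timestamps that aren't valid ISO 8601.
    ts = record.get("timestamp", "")
    try:
        datetime.fromisoformat(ts)
    except (TypeError, ValueError):
        problems.append(f"bad timestamp format: {ts!r}")
    return problems

print(validate_input({"user_id": "", "text": "hello", "timestamp": "13/02/2025"}))
# -> ['missing or empty field: user_id', "bad timestamp format: '13/02/2025'"]
```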

How to spot bad inputs quickly:

  • Log 10 random inputs into a Google Sheet.

  • Include columns for:

    • Where the data came from (API, form, database, etc.)
    • The exact input text
    • Pre-processed version sent to the model
  • Highlight anything that looks weird or empty.

Automation tip:

  • Use Zapier to automatically send every new input into Google Sheets:

    • Create a Zap → Choose your app as the trigger.
    • Select “New Input” or “New Row” depending on your source.
    • Add an action → “Add Row to Google Sheets.”
    • Zapier will log each input automatically so you don’t do it manually.
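If your source doesn’t have a native Zapier trigger, a “Webhooks by Zapier” catch hook works too: POST each input to the hook URL and let the Zap add the row. A minimal sketch, with a hypothetical hook URL:

```python
# Forward each incoming input to a Zapier catch hook for logging.
# ZAPIER_HOOK_URL is a hypothetical placeholder; copy the real URL from your Zap.
import requests

ZAPIER_HOOK_URL = "https://hooks.zapier.com/hooks/catch/123456/abcdef/"

def log_input(source: str, raw_text: str, preprocessed: str) -> None:
    requests.post(
        ZAPIER_HOOK_URL,
        json={
            "source": source,              # API, form, database, etc.
            "raw_input": raw_text,         # the exact input text
            "preprocessed": preprocessed,  # the version sent to the model
        },
        timeout=5,
    )
```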

Step 3. Test your model API calls directly

Sometimes the model is fine; it’s your integration or wrapper code that’s wrong.

What to do:

  1. Take one failing example from Step 1.
  2. Call the model API directly using Postman or curl.
  3. Compare:
    • The raw API response
    • What your app showed the user
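If you prefer a script to Postman, here’s a minimal sketch of step 2, assuming an OpenAI-style chat completions endpoint; swap in your provider’s real URL and model name:

```python
# Call the model API directly, bypassing your app entirely.
import os
import requests

API_URL = "https://api.openai.com/v1/chat/completions"
API_KEY = os.environ["OPENAI_API_KEY"]

failing_input = "What's your refund policy?"  # one broken case from Step 1

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": failing_input}],
    },
    timeout=30,
)
resp.raise_for_status()
# Compare this raw output with what your app showed the user.
print(resp.json()["choices"][0]["message"]["content"])
```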

How to interpret results:

  • If the raw API response looks good → your bug is in your code.
  • If the raw API response is wrong → the issue is upstream (model settings, embeddings, or data).

Step 4. Trace dependencies one by one

AI apps rely on many moving pieces: vector DBs, APIs, caches, spreadsheets, etc. One broken link breaks everything.

What to do:

Work backward from the final output:

User Output → Model → Vector DB → Cache/Storage → External APIs → Input Source

At each step:

  • Send a test request.
  • Log the raw response.
  • Compare expected vs. actual.

For example:

  • If vector DB results are empty → your embeddings may not be updating.
  • If an external API returns a 429 (rate limit) → throttle requests or retry.
  • If cache returns old data → clear cache and retest.
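For the 429 case, a simple retry with exponential backoff usually does it. A sketch using the requests library:

```python
# Retry an external API call with exponential backoff on 429 responses.
import time
import requests

def call_with_retry(url: str, payload: dict, max_retries: int = 5) -> requests.Response:
    delay = 1.0
    for attempt in range(max_retries):
        resp = requests.post(url, json=payload, timeout=30)
        if resp.status_code != 429:
            resp.raise_for_status()  # surface any other error immediately
            return resp
        # Honor Retry-After if the server sends it; otherwise back off exponentially.
        time.sleep(float(resp.headers.get("Retry-After", delay)))
        delay *= 2
    raise RuntimeError(f"Still rate-limited after {max_retries} attempts: {url}")
```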

Automation tip:

  • Use Airflow to schedule a daily dependency check:

    1. Install Airflow: pip install apache-airflow

    2. Create a DAG (Directed Acyclic Graph, Airflow’s term for a workflow) that:

      • Hits your vector DB
      • Calls your model API
      • Checks for missing or empty fields
    3. Set alerts: If any check fails, Airflow emails or Slacks your team.
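Here’s a minimal sketch of such a DAG. The two endpoint URLs, the alert address, and the response fields ("matches", "output") are hypothetical placeholders; Slack alerts need the separate Slack provider package:

```python
# dags/dependency_check.py: a daily health check for your AI pipeline.
from datetime import datetime

import requests
from airflow import DAG
from airflow.operators.python import PythonOperator

VECTOR_DB_URL = "https://your-vector-db.example.com/query"     # hypothetical
MODEL_API_URL = "https://your-model-api.example.com/generate"  # hypothetical

def check_vector_db():
    resp = requests.post(VECTOR_DB_URL, json={"query": "health check", "top_k": 1}, timeout=10)
    resp.raise_for_status()
    if not resp.json().get("matches"):
        raise ValueError("Vector DB returned no matches; embeddings may be stale.")

def check_model_api():
    resp = requests.post(MODEL_API_URL, json={"prompt": "ping"}, timeout=30)
    resp.raise_for_status()
    if not resp.json().get("output"):
        raise ValueError("Model API returned an empty output field.")

with DAG(
    dag_id="daily_dependency_check",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",  # Airflow 2.4+; use schedule_interval on older versions
    catchup=False,
    default_args={
        "email": ["team@example.com"],  # hypothetical address
        "email_on_failure": True,       # Airflow emails when any check fails
    },
) as dag:
    db_check = PythonOperator(task_id="check_vector_db", python_callable=check_vector_db)
    api_check = PythonOperator(task_id="check_model_api", python_callable=check_model_api)
    db_check >> api_check
```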

This turns debugging from a panic into a morning checklist.

Step 5. Build a simple debug dashboard

This step helps you spot problems early, before users start complaining.

We’ll keep it simple and use Google Sheets.

How to build it step by step:

1. Create a new Google Sheet

Open Google Sheets and create a blank sheet.

2. Add four columns

Name them:

  • Input: the data or prompt you send to the model
  • Output: what the model returns
  • API Response Time: how long the request took (in seconds)
  • Error Flag: shows if something failed (e.g., “true” or “false”)

3. Use Zapier to send data automatically

Instead of filling the sheet manually, you can have Zapier do it:

  1. Go to Zapier and create a new Zap.
  2. Set your app or API as the trigger (e.g., “New API Call” or “New Log Entry”).
  3. Add an action → choose “Add Row to Google Sheets.”
  4. Pick the sheet you just created.
  5. Map the fields: input, output, API response time, and error flag.
  6. Test the Zap and turn it on.

From now on, every time your app processes a request, Zapier will automatically add a new row to the sheet.
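If you’d rather skip Zapier and log rows straight from your own code, the gspread library can append them directly. A sketch, with a hypothetical sheet name and service-account file:

```python
# Append one dashboard row per request (pip install gspread).
# "service_account.json" and the sheet name are hypothetical placeholders.
import time
import gspread

gc = gspread.service_account(filename="service_account.json")
sheet = gc.open("AI Debug Dashboard").sheet1

def log_request(user_input: str, model_output: str, started_at: float, error: bool) -> None:
    elapsed = round(time.time() - started_at, 2)  # API response time, in seconds
    sheet.append_row([user_input, model_output, elapsed, str(error).lower()])
```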

4. Set up conditional formatting

This makes problems stand out visually:

  • For API Response Time

    1. Select the entire “API Response Time” column.
    2. Go to Format → Conditional formatting.
    3. Under “Format cells if…”, choose Greater than.
    4. Enter 2 (meaning anything slower than 2 seconds).
    5. Pick a red highlight.
  • For Output

    1. Select the “Output” column.
    2. Go to Format → Conditional formatting.
    3. Under “Format cells if…”, choose Is empty.
    4. Pick an orange highlight.
  • For Error Flag

    1. Select the “Error Flag” column.
    2. Go to Format → Conditional formatting.
    3. Under “Format cells if…”, choose Text is exactly.
    4. Type true or error.
    5. Make it bold red.

5. Review the dashboard daily

Spend 5 minutes each morning checking the sheet:

  • If you see a red cell in the “API Response Time” column → your app is slowing down.
  • If you see orange cells in the “Output” column → some requests returned nothing.
  • If the Error Flag shows red → something failed, and you know exactly where to start debugging.

Optional upgrade

If your app is bigger or has lots of requests, switch to a proper monitoring tool later, like:

  • Metabase (for easy database dashboards)
  • Grafana (for live monitoring and alerts)

But start simple with Google Sheets first.

Comments
  1. Good post — really helpful breakdown. I’m still pretty new to building AI stuff, and I’ve already run into a few of the bugs you mentioned. The one that hit home for me was data drift — I didn’t even realize at first that a small preprocessing change could throw everything off later.

     I’ve also learned the hard way how important versioning and logging are. Just having visibility into what changed between runs makes a huge difference when something suddenly stops working.

     Anyway, I appreciate how you framed it as debugging the whole pipeline, not just blaming the model. That clicked for me. Thanks for writing this.

  2. Great breakdown, Aytekin! Your step-by-step approach to debugging AI issues is super practical—love the focus on checking inputs first, as I’ve seen bad data mess up my MCP tool’s outputs too. Thanks for sharing such a clear and actionable guide!

  3. Thanks for this! This is actually a perfect reminder for me as well.

  4. Excellent breakdown, Aytekin 👏
     Most people blame the “model” when issues pop up, but you’re absolutely right — the real culprits are often data pipelines and integrations.
     I especially liked your tip about using Google Sheets + Zapier for lightweight monitoring — practical, low-friction, and scalable.
     This post is a great reminder that debugging AI systems needs the same discipline as traditional software — just with more moving parts. Thanks for sharing this! 🚀

  5. Really solid breakdown; this is exactly the kind of checklist every AI team needs. Debugging AI issues can be so tricky since it’s rarely “the model’s fault” but something hidden in the pipeline or data flow.

     We ran into something similar while refining Faceseek, our AI that traces digital identities and connections. One small change in how data was fetched completely shifted output accuracy, so your point about checking inputs first really hits home. Appreciate how clearly you outlined this process! 👏
