How to debug AI agents step by step (without losing your mind)

Your AI agent worked yesterday. Today, it’s giving the wrong output.

Maybe it’s hallucinating. Maybe it skipped a step. Maybe it broke the format and crashed something.

Here’s how to figure out what went wrong — and fix it without rebuilding everything.

Step 1: Save everything — exactly as it was when it broke

Before you do anything, stop and capture the failure.

What to save:

  • The input (what the user gave it)
  • The output (what the agent responded with)
  • The prompt or instructions you used in the setup
  • The model (GPT-5? Claude? etc.)
  • Any tools, files, or retrieval docs it used

Put all of this in a folder called something like broken_case_01/.

This gives you a clear and stable snapshot to test and fix.
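If you're comfortable with a little code, here's a minimal sketch of that snapshot. The placeholder strings are just examples; paste in whatever your agent actually received and produced:

from pathlib import Path

# Placeholder values: replace with the real input, output, and prompt.
user_input = "Job description text the user pasted in..."
agent_output = "The agent's exact (wrong) response..."
system_prompt = "The instructions you used in the setup..."
notes = "model: whichever model you called\ntools: PDF reader, retrieval docs, etc."

folder = Path("broken_case_01")
folder.mkdir(exist_ok=True)
(folder / "input.txt").write_text(user_input)
(folder / "output.txt").write_text(agent_output)
(folder / "prompt.txt").write_text(system_prompt)
(folder / "notes.txt").write_text(notes)

A folder of copy-pasted text files works just as well. The point is that nothing about the failure gets lost.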

Step 2: Run the same thing again. See if it breaks the same way.

You’re checking whether this bug is random or whether it happens every time.

To do that, use:

  • Same input
  • Same prompt
  • Same model
  • Same settings

Then run it again.

  • If it fails the same way, good. It’s reproducible.
  • If it gives a different result each time, it’s unstable.

If it’s unstable:

  • Lower the temperature setting to 0.0 (less randomness). You can do this in the OpenAI Playground by adjusting the “temperature” slider, or in code by passing temperature=0.0 when calling the model.
  • Make your prompt more specific
  • Tell it exactly what you want it to return (e.g. “Return only this JSON format. No extra words.”)

Once it breaks consistently, you can start to fix it.
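If you call the model from code, a rerun might look like this. It's a sketch using the OpenAI Python client and the files saved in Step 1; any other provider works the same way, just with its own client:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# Re-run the exact saved case with randomness turned down.
response = client.chat.completions.create(
    model="gpt-4o",   # example model name; use the same model the broken run used
    temperature=0.0,  # less randomness, more repeatable failures
    messages=[
        {"role": "system", "content": open("broken_case_01/prompt.txt").read()},
        {"role": "user", "content": open("broken_case_01/input.txt").read()},
    ],
)
print(response.choices[0].message.content)

Run it a few times and compare the outputs by eye.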

Step 3: Break the task into steps on paper

Don’t try to debug the entire agent.

Write down what your agent is supposed to do — as a list of steps.

For example, say it summarizes job descriptions from a PDF. That might look like this:

  1. Read the PDF
  2. Extract the job title
  3. Extract the salary
  4. Write a summary paragraph
  5. Format everything as JSON

Now, test each of those steps one at a time in a safe, isolated environment — like the OpenAI Playground, Claude Console, or a Jupyter notebook.

Just copy in the raw text and ask:

  • “What’s the job title?”
  • “What’s the salary?”
  • “Write a one-paragraph summary.”
  • “Format all of this as JSON like this: {…}”

Somewhere along the line, one of those steps will probably fail.

That’s the part you need to fix.
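If you'd rather script it than paste into a playground, here's a rough sketch that asks each question separately. It assumes the OpenAI Python client again and the input saved in Step 1; the ask() helper is just something made up for this sketch:

from openai import OpenAI

client = OpenAI()

def ask(question, text):
    # Hypothetical helper: one isolated question against the raw text.
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0.0,
        messages=[{"role": "user", "content": question + "\n\n" + text}],
    )
    return response.choices[0].message.content

raw_text = open("broken_case_01/input.txt").read()

# Test one step at a time and eyeball where things go wrong.
for question in [
    "What's the job title?",
    "What's the salary?",
    "Write a one-paragraph summary.",
]:
    print(question, "->", ask(question, raw_text))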

Step 4: Fix the smallest possible thing that’s broken

Now that you know which step failed, don’t rewrite everything.

Just fix that part.

Here are common things that break — and how to fix them:

Problem: It gives the wrong format

For example, you told it to return JSON, but it gave you a paragraph, Markdown, or added extra explanation.

How to fix it:
Add these lines to your prompt:

Return only this exact JSON format: { "title": string, "summary": string }
Do not include any extra text. Do not explain your answer.
Do not wrap the JSON in Markdown (no backticks).

That’s it.

Be very strict in your wording. Models are literal — they’ll follow instructions better when you’re precise.
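A quick way to check whether the model is actually obeying is to try parsing its reply yourself. A small sketch, using only Python's standard library:

import json

raw = '{"title": "Data Analyst", "summary": "..."}'  # paste the agent's reply here

try:
    data = json.loads(raw)
    missing = [key for key in ("title", "summary") if key not in data]
    if missing:
        print("Parsed, but missing fields:", missing)
    else:
        print("Looks good:", data)
except json.JSONDecodeError:
    print("Not valid JSON. The model probably added extra text or Markdown again.")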

Problem: It makes things up when it doesn’t know

For example, you asked for salary info, but it guessed a number instead of saying it doesn’t know.

How to fix it:
Tell it exactly what to do when it’s unsure:

If you don't know the answer, write "unknown".
Do not guess.
Do not make anything up.

This works surprisingly well — but only if you’re very direct.

Problem: It skips parts of the task

For example, you asked for a title and summary, but it gave you just one.

How to fix it:

Split your prompt into smaller steps.

Instead of one big prompt, do this:

  1. First prompt: “Extract the job title from this text.”
  2. Then take that result and prompt again: “Now write a summary using this title and the original text.”

It’ll work better. Plus, small steps are easier to debug and control.
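In code, that chaining might look like this. It's a sketch that reuses the hypothetical ask() helper and raw_text from the Step 3 sketch above:

# Step 1: extract just the title.
title = ask("Extract the job title from this text. Return only the title.", raw_text)

# Step 2: feed that result back in along with the original text.
summary = ask(
    "Write a one-paragraph summary of this job. The title is: " + title,
    raw_text,
)
print(title)
print(summary)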

Step 5: Test the fix on the same broken input

Go back to the original input that failed.

Run it again, using your fix.

Ask yourself:

  • Did it now give the correct output?
  • Did it give you the right format?
  • Did it stop hallucinating or skipping steps?

If so: You fixed the bug.

If not: Keep tweaking that one step until it passes.

Sometimes one prompt works for 90% of cases but breaks on edge cases. That’s fine.

In those situations:

  • Keep your original prompt for most inputs
  • Create a second version of the prompt for the edge case

Then add simple logic to decide which one to use.

Example: If your agent fails when the input is missing a field (like salary), you might:

  • Use the main prompt by default
  • Use a fallback prompt when salary isn’t mentioned — to avoid hallucination

In code, this might look like:

if "salary" not in input\_text.lower():    
prompt = fallback\_prompt
else:    
prompt = main\_prompt

No-code? Set up manual rules in tools like Zapier, Make, or even Google Sheets.

Step 6: Save this fixed example as a permanent test

This is now a known bug that you fixed.

Don’t delete it. Save it to a folder like test\_cases/.

If you're writing code, turn it into a test.

Example:

def test_job_agent_returns_correct_fields():
    input_text = open("test_cases/job_description_01/input.txt").read()
    output = run_agent(input_text)
    assert "title" in output
    assert "summary" in output

Even if you’re not technical, you can copy-paste the input/output manually and double-check it.

Now every time you change something, run this test again.

If it breaks again — you know instantly.

Step 7: Track bugs as they come up

Make a document or Notion board called: “Agent Bugs”

For each issue, write:

  • What broke
  • What caused it
  • How you fixed it

Example:

Bug: Agent returned Markdown instead of JSON
Fix: Added "return only JSON, no extra text" to prompt

Bug: Salary field was hallucinated
Fix: Added "if you don't know, return null"

This becomes your AI bug tracker. Over time, you’ll start to see patterns.

You’ll get faster at fixing them — or avoiding them completely.

Step 8: Add a fallback in case it fails again

Sometimes, even after fixing it, your agent will still mess up.

So you need to build a fallback system — something that kicks in when your agent fails.

Here's how to do it:

  1. Check the output after the agent responds.

    • Is it in the right format?
    • Are key fields (like “title” or “summary”) missing?
  2. If it fails validation, automatically retry with a clarifying prompt.

    • “You forgot the salary field. Please return all fields in JSON.”
  3. Still broken? Escalate.

    • Show a user-facing message: “We couldn't process this right now. A human will review it.”
    • Or queue it for manual review (via email, Slack, Notion, etc.)

You can implement this with:

  • Simple if-else checks in your code
  • Schema validators (e.g. Pydantic or JSON schema)
  • Retry loops with correction prompts
  • Logging and alerting for manual handoff

That way, when something fails, it fails gracefully and you know what went wrong, instead of breaking the product.
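Here's a rough sketch of that whole flow. It assumes a hypothetical call_model() function (prompt string in, raw reply out) and uses Pydantic v2 for validation; plain if-else checks on a parsed dict work too:

import json
from pydantic import BaseModel, ValidationError

class JobSummary(BaseModel):
    title: str
    summary: str

def safe_run(raw_text, call_model):
    # call_model is a placeholder: plug in however you actually call your agent.
    prompt = "Summarize this job description. Return only JSON with 'title' and 'summary'."
    for attempt in range(2):  # one normal try, one retry with a clarifying prompt
        raw = call_model(prompt + "\n\n" + raw_text)
        try:
            return JobSummary(**json.loads(raw)).model_dump()  # valid: hand it back
        except (json.JSONDecodeError, ValidationError, TypeError):
            prompt = ("Your last answer was missing fields or was not valid JSON. "
                      "Return all fields in this exact JSON format: "
                      '{"title": "...", "summary": "..."} and nothing else.')
    # Still broken after the retry? Escalate instead of crashing the product.
    return {"error": "We couldn't process this right now. A human will review it."}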

Step 9: Add a kill switch (just in case)

If your agent starts doing dangerous things — hallucinating data, leaking private info, or just failing at scale — you need a way to shut it down instantly.

If you’re using code, add this:

if AGENT_DISABLED:
    return "Agent is temporarily offline. Please try again later."

Now you can turn it off by changing one setting, instead of having to panic and fix it live.
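One simple way to wire up that flag is to read it from an environment variable, so you can flip it without touching code. A sketch, assuming run_agent() is the same agent function from Step 6:

import os

# Set the AGENT_DISABLED environment variable to "1" (in your hosting
# dashboard, .env file, etc.) to take the agent offline without a deploy.
AGENT_DISABLED = os.getenv("AGENT_DISABLED", "0") == "1"

def handle_request(user_input):
    if AGENT_DISABLED:
        return "Agent is temporarily offline. Please try again later."
    return run_agent(user_input)  # your normal agent call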
