Why My Claude Costs Kept Rising (Even When Output Didn’t): 3 Token Leakage Patterns in Agent Workflows

I used to assume rising AI cost meant one of two things:

But after debugging real Claude + agent workflows, I kept seeing a third pattern:

Token leakage — spend goes up, useful output doesn’t.

If you’re building solo products or shipping fast with agents, this is probably happening to you too.

This post is a practical breakdown of the 3 leakage paths I’ve seen most, plus what actually worked to stop them.

This was the biggest silent cost driver.

Typical triggers:

What it looked like in logs:

What reduced spend quickly:

I thought “more context = better quality.”
Sometimes true. Often expensive noise.

Typical triggers:

Signals:

Fixes that worked:

When upstream gets unstable, bad retry logic becomes expensive fast.

Typical triggers:

Signals:

Fast mitigation:

I stopped trying to build a huge dashboard first.
This small schema was enough to find most issues:

With this, I can usually answer:

I used to over-engineer too early. That was a mistake.

Now I follow this order:

Stop loss first
cap retries, reduce context window, disable suspicious triggers
Find top contributors
sort by anomaly contribution, fix biggest 20% first
Turn fixes into rules
dedupe, retry boundaries, session segmentation

This shifted cost management from “monthly surprise” to “same-day containment.”

I’m building around a simple operational loop:

detect → attribute → contain

Not “another pretty dashboard,” but a way to make cost control part of everyday workflow.

If you want to try the exact setup I’m using, here’s the install command for macOS/Linux:

curl -fsSL https://aikeylabs.com/zh/i/ih02 | sh

If you’re testing it, I’d love your feedback on where cost leaks show up first in your workflow (duplicate calls, context bloat, or retry storms).