Month 2 of running 10+ AI agents for PM work, my Claude bill was 3x what I expected. No clear reason - every agent was "working."
Turned out I had no budget constraints in any of them. Each one had free rein on tokens and scope. Made sense at the time.
Now every agent gets a constraints doc before I deploy: max tokens per run, and which decisions stay with me. Writing "what this agent can't do" takes 3-4 iterations. It's genuinely harder than "what it does."
Two things improved: bill down ~40%, and I stopped getting silent completions - agents succeeding at tasks that weren't quite their job. Those are the worst to catch.
Next step is making the constraints doc the first required artifact in my deployment flow, not an afterthought.
Do you write anything like this before you hand work to an agent? Curious if there's a cleaner format than what I'm using.