I used to think the expensive moment in AI work was the big prompt.
It usually is not.
The jump often happens at the handoff:
Big context windows make this feel free. It is not free. You pay in tokens, latency, and worse focus.
What changed for me was watching token usage live while I work.
A few things became obvious fast:
My current rule:
Start new tasks with the smallest context that still preserves the decision.
That has been a better cost habit than endlessly tweaking prompts.
I built TokenBar for this because postmortem spend charts tell me what happened after the waste already happened. Live token counting changes behavior in the moment.
If you work with LLMs every day, how are you handling context handoffs between tasks?
tokenbar.site