Stop blaming your prompts. Blame your token budget.

Ever spend years copying text one word at a time — clicking, dragging, missing, trying again — before someone showed you you could just double-click to select the whole word?
That's exactly how I felt after a year of vibe coding with Claude.
I wasn't prompting wrong. I wasn't using the wrong model. I was just... running out of room.
One thing that took me a while to fully internalize: the context window is everything.
Every conversation has a token limit: code, documents, and back-and-forth messages all count toward the same budget. As the conversation grows, earlier material competes with everything that came after it, and once the limit is reached, older context has to be dropped or summarized to fit. You're not imagining it when responses start feeling more generic or forgetful mid-session. That's a real degradation, not a vibe.
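To make the "shared budget" idea concrete, here is a minimal sketch that estimates how much of a context limit a conversation has used. The 4-characters-per-token ratio is only a rough rule of thumb for English text, not a real tokenizer, and `Message`, `estimate_tokens`, `budget_used`, and the `context_limit` value are hypothetical names for illustration.

```python
from dataclasses import dataclass

# Assumption: ~4 characters per token is a rough heuristic for English prose.
# Real tokenizers differ by model, and code often uses more tokens per character.
CHARS_PER_TOKEN = 4


@dataclass
class Message:
    role: str  # e.g. "user" or "assistant"
    text: str


def estimate_tokens(text: str) -> int:
    """Rough token estimate: character count divided by 4, rounded up."""
    if not text:
        return 0
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def budget_used(messages: list[Message], context_limit: int) -> float:
    """Fraction of a (hypothetical) context limit consumed by all messages."""
    total = sum(estimate_tokens(m.text) for m in messages)
    return total / context_limit
```

Everything in the list counts, including pasted code and the model's own replies, which is why long pastes eat the budget so quickly.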
A few signs I've learned to recognize:

Responses get more generic, less tailored to what we've been building
Claude repeats things it already said
Simple code starts having dumb mistakes
It "forgets" something we explicitly covered 20 messages ago

What actually helps:

New conversation for every new topic — no exceptions
Don't paste long code; describe what it does instead
Heavy code sessions: start fresh after ~30–40 messages
Pure text discussions: you can push further
When something feels "off" — just open a new chat. That instinct is usually right.
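The rules of thumb above can be sketched as a tiny helper. This is only an illustration: `should_start_fresh` is a hypothetical function, the `code_limit` default takes the low end of the ~30–40 message range, and the `text_limit` default is a placeholder assumption, since the post gives no number for text-only chats.

```python
def should_start_fresh(
    message_count: int,
    heavy_code: bool,
    feels_off: bool = False,
    code_limit: int = 30,   # low end of the ~30-40 range for heavy code sessions
    text_limit: int = 80,   # assumption: placeholder, no figure given for text chats
) -> bool:
    """Return True when the rules of thumb suggest opening a new chat."""
    # The "feels off" instinct overrides any message count.
    if feels_off:
        return True
    limit = code_limit if heavy_code else text_limit
    return message_count >= limit
```

For example, `should_start_fresh(35, heavy_code=True)` returns `True`, while the same count in a pure text discussion returns `False` under these defaults.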

Been vibe coding for a while? I'd love to hear what's worked for you — and what hasn't.

Posted to AI Tools on May 14, 2026
  1. 1

    This is actually great advice. I switched over to this method not too long ago.

  2. 1

    Smart approach. I do something similar but keep the summary even leaner: just core architecture decisions and non-obvious constraints. I've found that even a dense markdown dump can eat into the new session's budget faster than expected.

  3. 1

    This is exactly why context window degradation is the silent killer of complex builds. You are completely right about needing to start fresh, but just opening a new chat means you lose the global architecture.

    My workaround is 'State Summarization'. Around message 25, before the degradation hits, I prompt the model to generate a dense, compressed markdown summary of the current codebase architecture, established rules, and pending tasks. I then use that summary as the system prompt for the new chat. It bypasses the token bloat while keeping the model tightly grounded. How do you handle transferring the necessary context when you spin up those fresh sessions?
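A rough sketch of the 'State Summarization' handoff described above, using plain string building with no API calls. The function names, the prompt wording, and the `max_chars` cap are assumptions for illustration, not part of any tool.

```python
# Sections taken from the comment: architecture, established rules, pending tasks.
SUMMARY_SECTIONS = ("Codebase architecture", "Established rules", "Pending tasks")


def build_summary_request(sections: tuple[str, ...] = SUMMARY_SECTIONS) -> str:
    """Prompt to send near the end of a session, asking for a compact summary."""
    bullets = "\n".join(f"- {name}" for name in sections)
    return (
        "Before we continue, write a dense, compressed markdown summary of this "
        "session with exactly these sections:\n" + bullets
    )


def build_handoff_prompt(summary: str, max_chars: int = 6000) -> str:
    """Wrap the summary as opening context for a new chat.

    max_chars is a hypothetical cap so the summary itself does not eat
    too much of the new session's budget.
    """
    trimmed = summary.strip()
    if len(trimmed) > max_chars:
        trimmed = trimmed[:max_chars].rstrip() + "\n[summary truncated]"
    return "Context carried over from a previous session:\n\n" + trimmed
```

The cap reflects the concern raised elsewhere in the thread that even a dense summary can use up the fresh session's budget faster than expected.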
