
I Thought AI Made Me Faster. My Metrics Disagreed.

Friday, 4:47 PM.
A PR lands in the repo with a clean summary, tidy diff, and an AI review comment that might as well read:

“Ship it.”

I skim. I nod. I merge.

Monday, 10:12 AM.
A teammate pings: “Why are we making three API calls per page view now?”

It worked. Tests passed. It looked correct.

It also quietly doubled latency and introduced a failure mode that only showed up under real traffic.

That’s when I stopped asking:
“Does AI make me faster?”

…and started asking the only question that matters:
“Does AI reduce time from idea → safely in production?”

Because “faster” is easy to feel.
“Productive” is something you have to measure.


The AI productivity mirage (and the hidden tax)

AI makes code appear instantly, so your brain says: we’re flying.
But in real codebases, the work often shifts from writing → verifying.
That verification tax looks like:
• rereading more carefully because you don’t fully trust the output
• extra prompts to “make it match our patterns”
• more test runs because something feels off
• cleanup commits because the diff is bigger than it needed to be

So yes, you typed less.
But you didn’t necessarily ship sooner.


What I measure now (so I stop lying to myself)

If you only measure “how quickly I produced code,” AI wins every time.
Shipping isn’t typing. Shipping is finishing.
Here’s the tiny metrics set that tells the truth:

  1. Lead time: ticket start → deployed
  2. Rework time: time spent fixing AI output after the “first draft”
  3. Defect escape rate: bugs found after merge (especially within 7 days)
  4. Review burden: how many human minutes it took to verify the change
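
All four numbers fall out of timestamps you already have in your tracker and CI. A minimal sketch of the arithmetic, with made-up field names (your tooling's event names will differ):

```python
from datetime import datetime, timedelta

def lead_time(ticket_started: datetime, deployed: datetime) -> timedelta:
    """Metric 1 — lead time: ticket start -> deployed."""
    return deployed - ticket_started

def rework_time(commits: list[dict]) -> timedelta:
    """Metric 2 — time spent on fix-up commits after the AI 'first draft'.
    Each commit dict carries a 'kind' ('draft' or 'rework') and a 'duration'."""
    return sum((c["duration"] for c in commits if c["kind"] == "rework"),
               timedelta())

def defect_escape_rate(bugs_after_merge: int, merged_prs: int) -> float:
    """Metric 3 — bugs found after merge (e.g. within 7 days), per merged PR."""
    return bugs_after_merge / merged_prs if merged_prs else 0.0

# A change that "felt fast" on Friday but shipped Monday's surprise:
start = datetime(2026, 3, 2, 9, 0)
ship = datetime(2026, 3, 4, 17, 0)
print(lead_time(start, ship))        # 2 days, 8:00:00
print(defect_escape_rate(3, 20))     # 0.15
```

Review burden (metric 4) is the one you can't script: it's the human minutes, and you have to ask people.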

One hard lesson:
AI can make one dev feel faster by making everyone else slower.


AI agents in code review: useful, but don’t give them authority

I used to treat AI review like a senior engineer.
That was my mistake.

Think of a review agent as a junior dev with:
• infinite confidence
• great pattern-matching
• occasional invented assumptions

Where review agents shine
• pointing out missing null checks / edge cases
• spotting inconsistent patterns
• suggesting tests you forgot
• surfacing obvious security foot-guns
• summarizing the diff (this alone saves time)

Where they’re dangerous
• deep domain logic (“is this billing rule correct?”)
• performance reality (N+1s, caching, query behavior)
• security boundaries (authz, tokens, tenant isolation)
• architecture (“does this belong here?”)

My rule now:
Agents don’t approve PRs. Agents do chores.


“Vibe coding” is fine. Shipping vibe code is not.

Vibe coding is great for exploration: “move fast, let the model fill gaps.”
It becomes risky when you treat “looks good” as “is good.”

Here are the guardrails that let me ship fast without shipping chaos:

1) Keep diffs painfully small
If the AI needs 800 lines to solve it, you don’t understand the problem yet.
Small diffs force clarity. Clarity prevents surprise architecture.
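
"Painfully small" is enforceable, not just aspirational. A sketch of a CI gate fed by `git diff --numstat` output; the 400-line cap is an arbitrary example, pick your own pain threshold:

```python
def diff_too_big(added: int, removed: int, cap: int = 400) -> bool:
    """Return True if a diff exceeds the agreed size cap."""
    return (added + removed) > cap

# `git diff --numstat` emits tab-separated added/removed/path per file:
numstat = "310\t95\tsrc/cart.py"
added, removed, _ = numstat.split("\t")
if diff_too_big(int(added), int(removed)):
    print("Diff too large: split the change, or explain why it can't be split.")
```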

2) Require tests that would fail without the change
AI loves happy-path tests that only validate its own assumptions.
Minimum bar:
• at least one test that fails before the change
• at least one edge-case test
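
Here's what that bar looks like on a hypothetical function (the names are made up). The first test is the kind AI writes for you; the second is the one that actually fails before the guard exists:

```python
def parse_call_budget(raw: str) -> int:
    """Parse a positive per-page API call budget.
    The <= 0 guard is the change under review."""
    value = int(raw.strip())
    if value <= 0:
        raise ValueError("call budget must be positive")
    return value

def test_happy_path():
    # What AI tends to write: validates its own assumptions.
    assert parse_call_budget("3") == 3

def test_edge_case_rejects_zero():
    # Fails without the guard -> proves the change exists.
    try:
        parse_call_budget("0")
    except ValueError:
        return
    raise AssertionError("zero budget should be rejected")

test_happy_path()
test_edge_case_rejects_zero()
```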

3) Force invariants into words
Not “what did you do?” — what must always remain true?
Examples:
• “authz must be checked server-side”
• “billing events must be idempotent”
• “cache keys must include tenant id”
If you can’t state invariants clearly, you’re not ready to merge.
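
An invariant you can state in words can usually be enforced in code too. A sketch of the cache-key example (the helper is hypothetical, not from any real codebase):

```python
def cache_key(tenant_id: str, resource: str) -> str:
    """Invariant: cache keys must include the tenant id, so one
    tenant's cached data can never be served to another."""
    if not tenant_id:
        raise ValueError("invariant violated: cache key requires tenant id")
    return f"{tenant_id}:{resource}"

print(cache_key("acme", "orders:list"))   # acme:orders:list
```

A one-line guard like this turns a silent cross-tenant leak into a loud failure in the first test run.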

4) Use feature flags when uncertainty exists
Flags are honesty. They buy you learning time without burning trust.
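
The flag itself can be one function and an environment variable; a minimal sketch, with a made-up flag name, where unset means the old path:

```python
import os

def flag_enabled(name: str) -> bool:
    """Feature flag read from the environment: '1' means on,
    anything else (including unset) means off."""
    return os.environ.get(f"FLAG_{name}", "0") == "1"

# The uncertain AI-generated path ships dark until the metrics earn trust.
if flag_enabled("NEW_CHECKOUT_BATCHING"):
    pass  # new, unproven code path
else:
    pass  # existing behavior
```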


Copy/paste prompts I actually use

Prompt: “Strict review, no approvals”
You are a strict code reviewer. Do NOT approve this PR.
Review the diff and output:

  1. Correctness risks (edge cases, undefined behavior)
  2. Security risks (authz, secrets, injections, data exposure)
  3. Performance risks (N+1, caching, queries)
  4. Maintainability (complexity, naming, structure)
  5. Test gaps (what should exist but doesn’t)

Rules:

  • If unsure, say "UNCERTAIN" and why.
  • Reference specific files/functions.
  • Suggest minimal fixes and minimal tests.

Prompt: “Smallest possible diff”
Make the smallest possible change to implement the requirement.

Output:

  • Unified diff patch only (no commentary).
  • Include/modify tests so the change is covered.

Constraints:

  • Preserve existing architecture.
  • No new dependencies.
  • Prefer existing utilities/patterns.

These two prompts did more for my workflow than any “agent autopilot.”

The merge checklist (my last line of defense)

Before I merge AI-assisted code, I ask:
• Can I explain the change without “reading the code aloud”?
• What invariant does this rely on?
• What happens on bad input / retries / timeouts?
• Is there a test that would fail if the change didn’t exist?
• Did we widen permissions or expose data?
• What’s the rollback story?
• Would I be happy owning this code in 6 months?
If any answer is “not sure,” it’s not ready.


The point isn’t “AI everywhere”

The point is predictable shipping.
AI is incredible at drafts, scaffolds, summaries, and catching the obvious.
But the moment you give it trust by default, you’re not moving fast.
You’re just moving uncertainty into production.
And production has a way of collecting interest.


Your turn

  1. Where has AI genuinely reduced end-to-end shipping time for you?
  2. What’s your “never let AI touch this” zone (auth, billing, infra…)?
  3. If you use review agents: what’s the one prompt/checklist that made them useful?
Posted to Developers on March 4, 2026