Friday, 4:47 PM.
A PR lands in the repo with a clean summary, tidy diff, and an AI review comment that might as well read:
“Ship it.”
I skim. I nod. I merge.
Monday, 10:12 AM.
A teammate pings: “Why are we making three API calls per page view now?”
It worked. Tests passed. It looked correct.
It also quietly doubled latency and introduced a failure mode that only showed up under real traffic.
That’s when I stopped asking:
“Does AI make me faster?”
…and started asking the only question that matters:
“Does AI reduce time from idea → safely in production?”
Because “faster” is easy to feel.
“Productive” is something you have to measure.
The AI productivity mirage (and the hidden tax)
AI makes code appear instantly, so your brain says: we’re flying.
But in real codebases, the work often shifts from writing → verifying.
That verification tax looks like:
• rereading more carefully because you don’t fully trust the output
• extra prompts to “make it match our patterns”
• more test runs because something feels off
• cleanup commits because the diff is bigger than it needed to be
So yes, you typed less.
But you didn’t necessarily ship sooner.
What I measure now (so I stop lying to myself)
If you only measure “how quickly I produced code,” AI wins every time.
Shipping isn’t typing. Shipping is finishing.
Here’s the tiny metrics set that tells the truth:
• lead time: idea → safely in production, not idea → code written
• rework: cleanup commits and review round-trips after “done”
• reviewer load: how long teammates spend verifying the diff
• escape rate: problems that only show up under real traffic
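One way to put a number on “idea → safely in production” is lead time per change. A minimal sketch, assuming timestamps you’d pull from your PR and deploy tooling (the field names and dates here are illustrative, not a real API):

```python
# Hypothetical sketch: measuring "idea -> safely in production" per change.
from datetime import datetime

def lead_time_hours(opened_at: str, deployed_at: str) -> float:
    """Hours from PR opened to the change running safely in production."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    opened = datetime.strptime(opened_at, fmt)
    deployed = datetime.strptime(deployed_at, fmt)
    return (deployed - opened).total_seconds() / 3600

# The Friday PR from the intro: merged in minutes, but "done" came Monday.
print(lead_time_hours("2024-06-07T16:47:00", "2024-06-10T10:12:00"))  # ~65.4
```

Measured this way, a merge that felt instant on Friday still carries the whole weekend of “not actually shipped” in its lead time.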
One hard lesson:
AI can make one dev feel faster by making everyone else slower.
AI agents in code review: useful, but don’t give them authority
I used to treat AI review like a senior engineer.
That was my mistake.
Think of a review agent as a junior dev with:
• infinite confidence
• great pattern-matching
• occasional invented assumptions
Where review agents shine
• pointing out missing null checks / edge cases
• spotting inconsistent patterns
• suggesting tests you forgot
• surfacing obvious security foot-guns
• summarizing the diff (this alone saves time)
Where they’re dangerous
• deep domain logic (“is this billing rule correct?”)
• performance reality (N+1s, caching, query behavior)
• security boundaries (authz, tokens, tenant isolation)
• architecture (“does this belong here?”)
My rule now:
Agents don’t approve PRs. Agents do chores.
“Vibe coding” is fine. Shipping vibe code is not.
Vibe coding is great for exploration: “move fast, let the model fill gaps.”
It becomes risky when you treat “looks good” as “is good.”
Here are the guardrails that let me ship fast without shipping chaos:
1) Keep diffs painfully small
If the AI needs 800 lines to solve it, you don’t understand the problem yet.
Small diffs force clarity. Clarity prevents surprise architecture.
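You can even make the budget mechanical. A sketch of a diff-size gate, assuming `git diff --numstat` output and an `origin/main...HEAD` range; the 200-line budget is an arbitrary example, not a rule:

```python
# Sketch: fail the check when a diff blows past a line budget.
import subprocess

def diff_line_count(numstat: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.strip().splitlines():
        added, deleted, _path = line.split("\t", 2)
        # Binary files show "-" in both columns; count them as zero.
        total += int(added) if added != "-" else 0
        total += int(deleted) if deleted != "-" else 0
    return total

def diff_gate(budget: int = 200) -> bool:
    out = subprocess.run(
        ["git", "diff", "--numstat", "origin/main...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return diff_line_count(out) <= budget
```

When the gate fails, the fix is usually to split the PR, not to raise the budget.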
2) Require tests that would fail without the change
AI loves happy-path tests that only validate its own assumptions.
Minimum bar:
• at least one test that fails before the change
• at least one edge-case test
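What that minimum bar looks like in practice, on a deliberately tiny example (`retry_delay` is a hypothetical helper, not from any real codebase):

```python
# Sketch: the happy-path test AI writes for you, plus the two it doesn't.

def retry_delay(attempt: int) -> float:
    """Exponential backoff, capped so retries can't pile up forever."""
    if attempt < 1:
        raise ValueError("attempt starts at 1")
    return min(2 ** (attempt - 1), 30.0)

def test_happy_path():        # the kind of test AI volunteers
    assert retry_delay(1) == 1.0

def test_cap_edge_case():     # fails if the min(..., 30.0) cap is missing
    assert retry_delay(10) == 30.0

def test_bad_input():         # fails if the guard clause is missing
    try:
        retry_delay(0)
        assert False, "expected ValueError"
    except ValueError:
        pass
```

Delete the cap and `test_cap_edge_case` fails; delete the guard and `test_bad_input` fails. That is the “would fail without the change” property, made concrete.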
3) Force invariants into words
Not “what did you do?” — what must always remain true?
Examples:
• “authz must be checked server-side”
• “billing events must be idempotent”
• “cache keys must include tenant id”
If you can’t state invariants clearly, you’re not ready to merge.
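Once an invariant is in words, you can often make it fail loudly in code. A sketch for the cache-key example above (`make_cache_key` is hypothetical):

```python
# Sketch: "cache keys must include tenant id" as code that refuses to lie.

def make_cache_key(tenant_id: str, resource: str) -> str:
    """Invariant: every cache key includes the tenant id,
    so one tenant can never read another tenant's cached data."""
    if not tenant_id:
        raise ValueError("cache key built without tenant id")
    return f"{tenant_id}:{resource}"

print(make_cache_key("acme", "user:42"))  # acme:user:42
```

A reviewer (human or agent) can check this in seconds, because the invariant is stated in one place instead of implied across the diff.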
4) Use feature flags when uncertainty exists
Flags are honesty. They buy you learning time without burning trust.
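A minimal sketch of that honesty, with the uncertain path shipping dark by default. All names here (`batched_api_calls`, `fetch_one`, `fetch_batched`) are illustrative, not from any real codebase:

```python
# Sketch: the unproven code path ships behind a flag that defaults OFF.

FLAGS = {"batched_api_calls": False}  # in practice, loaded per environment

def flag(name: str) -> bool:
    return FLAGS.get(name, False)  # unknown flags are off by default

def fetch_one(item_id):            # stand-in for today's trusted API call
    return {"id": item_id}

def fetch_batched(item_ids):       # stand-in for the new, uncertain path
    return [{"id": i} for i in item_ids]

def fetch_page_data(item_ids):
    if flag("batched_api_calls"):
        return fetch_batched(item_ids)       # learning time, behind the flag
    return [fetch_one(i) for i in item_ids]  # the path you already trust
```

Flip the flag for one environment, watch real traffic, and only then delete the old path.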
Copy/paste prompts I actually use
Prompt: “Strict review, no approvals”
You are a strict code reviewer. Do NOT approve this PR.
Review the diff and output:
• risky assumptions the code makes
• edge cases and failure modes not covered by tests
• any invariant the change depends on
Rules:
• never approve, never say “LGTM”
• if you are unsure about something, say so explicitly instead of guessing
Prompt: “Smallest possible diff”
Make the smallest possible change to implement the requirement.
Output:
• the minimal diff only
• one sentence per changed file explaining why it changed
Constraints:
• no refactors, renames, or drive-by cleanups
• no new dependencies or new patterns
The merge checklist (my last line of defense)
Before I merge AI-assisted code, I ask:
• Can I explain the change without “reading the code aloud”?
• What invariant does this rely on?
• What happens on bad input / retries / timeouts?
• Is there a test that would fail if the change didn’t exist?
• Did we widen permissions or expose data?
• What’s the rollback story?
• Would I be happy owning this code in 6 months?
If any answer is “not sure,” it’s not ready.
The point isn’t “AI everywhere”
The point is predictable shipping.
AI is incredible at drafts, scaffolds, summaries, and catching the obvious.
But the moment you give it trust by default, you’re not moving fast.
You’re just moving uncertainty into production.
And production has a way of collecting interest.
Your turn
Steal the checklist. Run it against your next AI-assisted PR and notice which question you can’t answer. That’s where the interest is already accruing.