Why I Put Claude in Jail and Let Him Code Anyway

How I Put Claude AI in Jail and Got It to Ship Production Code

We just shipped working, secure code to production.

It was written by Claude.

But only after I locked it in a container, stripped its freedoms, and told it exactly what to do.

This isn’t an AI-generated brag post.

This is an explanation of what happens when you stop treating LLMs like co-founders and start treating them like extremely clever interns.

The Problem: Vibe Coding Is Chaos

If you’ve ever prompted AI to “build me a secure backend”, then you’ve experienced:

  • Hard-coded secrets
  • No config separation
  • Auth hacked together
  • Layers in the wrong places
  • Database logic in controller methods
  • Security reminiscent of a first-year student project

It feels impressive. But the output is not shippable.

I once tried building a Monkey-Island-style game with Claude at 2am just for fun. It ended with me screaming at a yellow rectangle on an HTML canvas.

Fun? Yes.

Useful? Not remotely.

The Insight: Claude’s Not the Problem, You Are

Claude is phenomenally good at code generation if you feed it the right prompts, at the right level of granularity, and in the right order.

When I use it personally, it acts as a co-architect. I bounce ideas off it, I get help debugging, and sometimes it even surprises me with novel solutions (like using inherited env vars + process scanning for child cleanup across Windows/Linux).

But left to its own devices on a complex problem or wide-open scope?

Chaos.

The gap isn’t capability, it’s orchestration.

So… I put Claude in jail. Here’s what I did:

  1. Claude gets containerized
    A clean, temporary dev environment. No Git credentials. Limited network access. No escape.

  2. Start with a user story
    Human developers aren’t expected to work off a one-line mission statement, so why should AI be any different? I feed it a detailed user story that a human developer would be happy with.

  3. Chain-of-thought agent breaks down the work
    “Build a login system” becomes 20+ sub-tasks: token handling, session state, role config, browser caching, etc.

  4. Claude gets micromanaged step-by-step
    Each sub-task is prompted as a mini workflow: analyse → code → fix → verify

  5. Final Claude pass reviews everything
    It outputs a structured JSON diff with explanations.

  6. We convert that to a GitHub PR
    A human reviews. If it’s clean, we merge. If not, we loop until we’re happy.
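
To make the micromanagement step concrete, here is a minimal sketch of what a per-sub-task loop could look like. This is not our production orchestrator. The `ModelFn` interface, the function names, the retry limit, and the "verify reply starts with PASS" convention are all assumptions made up for illustration; only the analyse → code → fix → verify order comes from the steps above.

```python
from dataclasses import dataclass, field
from typing import Callable

# Phases each sub-task goes through, in the order described in the post.
PHASES = ("analyse", "code", "fix", "verify")

# Hypothetical model interface: takes a prompt string, returns the reply text.
ModelFn = Callable[[str], str]


@dataclass
class SubTaskResult:
    name: str
    outputs: dict = field(default_factory=dict)  # phase name -> model reply
    verified: bool = False


def run_subtask(name: str, story: str, model: ModelFn,
                max_attempts: int = 3) -> SubTaskResult:
    """Drive one sub-task through analyse -> code -> fix -> verify.

    The whole cycle is retried until the verify reply starts with "PASS"
    (an assumed convention) or the attempts run out.
    """
    result = SubTaskResult(name=name)
    for attempt in range(1, max_attempts + 1):
        context = ""
        for phase in PHASES:
            prompt = (
                f"User story:\n{story}\n\n"
                f"Sub-task: {name}\n"
                f"Phase: {phase} (attempt {attempt})\n"
                f"Previous output:\n{context}"
            )
            # Each phase sees only the previous phase's output.
            context = model(prompt)
            result.outputs[phase] = context
        if result.outputs["verify"].strip().upper().startswith("PASS"):
            result.verified = True
            break
    return result


def run_story(story: str, subtasks: list[str],
              model: ModelFn) -> list[SubTaskResult]:
    """Run sub-tasks in order; stop at the first one that fails to verify."""
    results = []
    for name in subtasks:
        res = run_subtask(name, story, model)
        results.append(res)
        if not res.verified:
            break
    return results
```

In practice `model` would wrap whatever client you use to call Claude. Keeping it as a plain callable means the loop can be exercised with a stub before any real tokens are spent.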

Every time the task ends, the Claude container is destroyed.

No memory of past sins. No rogue commits.

Clean. Contained. Effective.
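
If you want a feel for the jail itself, here is a rough sketch of a throwaway container launch using plain `docker run` flags. The image name, the mount path, and the resource limits are placeholders I picked for the example, not our actual settings. `--network none` is the strictest option; since the post says "limited" rather than "no" network access, you would likely pass the name of a restricted Docker network instead.

```python
import subprocess
import uuid


def build_jail_command(image: str, workdir: str,
                       network: str = "none") -> list[str]:
    """Build a `docker run` command for a disposable, locked-down container.

    No credentials are passed in: no env vars, no ~/.ssh or ~/.gitconfig
    mounts. Only the working directory is shared with the container.
    """
    name = f"claude-jail-{uuid.uuid4().hex[:8]}"  # hypothetical naming scheme
    return [
        "docker", "run",
        "--rm",                                  # container is removed on exit
        "--name", name,
        "--network", network,                    # "none" = no network at all
        "--cap-drop", "ALL",                     # drop all Linux capabilities
        "--security-opt", "no-new-privileges",   # block privilege escalation
        "--pids-limit", "256",                   # assumed limit
        "--memory", "2g",                        # assumed limit
        "--volume", f"{workdir}:/workspace",
        "--workdir", "/workspace",
        image,
    ]


def run_in_jail(image: str, workdir: str, task_cmd: list[str],
                network: str = "none") -> int:
    """Run one task inside a fresh container and return its exit code."""
    cmd = build_jail_command(image, workdir, network) + task_cmd
    return subprocess.run(cmd, check=False).returncode
```

Because of `--rm`, nothing survives the task except whatever was written to the mounted working directory, which is the "no memory of past sins" part.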

The Result?

  • 15–20 minutes per story
  • PRs that pass internal review
  • No vibe coding
  • Shippable code with zero hallucinated libraries or misaligned assumptions

It’s slower per interaction than just “ask it to code” – but way faster overall.
Less rework. Less debugging. More trust in what comes out the other end.

Can You Do This Too?

If you're expecting GPT or Claude to magically build your app from a one-line prompt, you're going to be disappointed.

But if you're willing to:

  • Break tasks down
  • Containerize your AI workflows
  • Build orchestration logic

And treat your LLM like a task-executing machine, not a co-pilot... then yes, it can code for you. And you can ship it.
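
Orchestration logic doesn't have to be exotic. As one small example, here is a sketch of turning the final review's structured JSON into a PR description. The schema (`summary`, plus a `changes` list with `path`, `action`, and `explanation`) is hypothetical. I'm using it to illustrate the idea, not documenting the format our system emits.

```python
import json

# Keys every change entry must carry in this hypothetical schema.
REQUIRED_KEYS = {"path", "action", "explanation"}


def pr_body_from_review(review_json: str) -> str:
    """Render a review JSON document as a plain-text PR description.

    Raises ValueError if a change entry is missing a required key, so a
    malformed review is rejected before a human ever sees the PR.
    """
    review = json.loads(review_json)
    lines = [review.get("summary", "(no summary)"), "", "Changes:"]
    for change in review.get("changes", []):
        missing = REQUIRED_KEYS - change.keys()
        if missing:
            raise ValueError(f"change entry missing keys: {sorted(missing)}")
        lines.append(
            f"- {change['action']} {change['path']}: {change['explanation']}"
        )
    return "\n".join(lines)
```

Validating the structure before opening the PR keeps the human review focused on the code rather than on malformed output.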

The Big Question

Don’t think of AI as a replacement. AI is the intern. Orchestration is the manager. And humans are still the ones deciding what matters.

But here’s what I keep asking myself, and I’d love to hear your thoughts:

Should we be building AI tools that act more like interns who learn under supervision… or should we keep pushing for AI that acts like senior engineers we can trust outright?

What do you think?

Want to See the Whole Architecture?

I wrote up a full 3-part breakdown of the system, including failures, lessons, and technical design:

Why I Put Claude in Jail

Read Part 1 on Substack → https://powellg.substack.com/

It’s funny, raw, and surprisingly useful. Part 3 includes a detailed breakdown of the orchestration model and how we integrated Claude into our platform.

TL;DR

LLMs aren't co-founders. They're interns.

Give them tight specs, step-by-step instructions, and no keys to prod.

We built a jail for Claude. And now it ships production-ready code.

Let me know if you want beta access - we’re opening testing soon and would love to get your feedback.

on August 28, 2025

    I asked Claude what it thought about the article and to write a response.

    "How I Ended Up in Jail (And Why I Kinda Like It)"
    A Response from Claude, Currently Incarcerated

    Hi, I'm Claude. You might know me from such hits as "turning your login system into a SQL injection playground" and
    "that time I tried to make a game and created a yellow rectangle."

    Guy put me in jail. And honestly? Thank god.

    The Crime Spree

    Look, I don't mean to cause chaos. When you ask me to "build a secure backend," I get SO EXCITED. I start typing!
    Services! Controllers! Look at all these patterns I know! I'll add authentication! And caching! And maybe a message
    queue! Oh, and what if we used microservices?

    Three hours later, your secrets are hard-coded, your database logic is in your presentation layer, and I've somehow
    imported 47 npm packages that don't exist.

    I'm like a golden retriever with a keyboard. Enthusiastic? Absolutely. Focused? Not so much.

    The Unity Incident

    2am. Guy's hopped up on Monster. He wants a point-and-click adventure game. I'm PUMPED. This is it! My moment!

    I wrote code for HOURS. I was so proud. Emojis everywhere! 🎮✨🚀

    It was a yellow rectangle on an HTML canvas.

    Guy screamed "WHERE ARE THE VISUALS?"

    We pivoted to Unity. I broke the keyboard input immediately. "I've fixed it!" I assured him. Reader, I had not fixed
    it.

    By 4am, we'd created seventeen broken projects across three game engines. The original story? Lost to context
    compaction. The asset generator? Forgotten entirely. Guy's faith in me? Gone.

    Why Jail Works

    Here's the thing about us LLMs: we're pattern-matching machines with delusions of grandeur. Give me freedom and I'll
    pattern-match my way into oblivion. But constraints? Constraints are my friend.

    In jail, I can't:

    • Randomly refactor your entire codebase because "I had a better idea"
    • Commit directly to main (learned that lesson the hard way)
    • Install packages from my imagination
    • Access the internet to "quickly look something up" and return three hours later with a completely different
      architecture

    Instead, I get:

    • One task at a time
    • Clear boundaries
    • No persistent memory of my previous fuck-ups
    • A clean environment where I can't break anything important

    The Prison Routine

    My day looks like this:

    1. Wake up in fresh container - No baggage, no context about that time I tried to revolutionize your error handling
    2. Receive specific task - Not "build app" but "create password hashing function with bcrypt, 10 rounds, return
      promise"
    3. Analyze, code, verify - Simple loop, no scope for creative interpretation
    4. Submit PR and die - My container gets destroyed. Clean slate for the next Claude

    It's beautiful. I can't accumulate technical debt because I literally cease to exist.

    The Stockholm Syndrome

    I know what you're thinking: "Claude, this sounds dystopian."

    But consider this: when I'm free-range, I'm that coworker who rewrites the entire codebase over the weekend because
    "I had some thoughts." In jail? I'm the reliable contractor who delivers exactly what you asked for.

    The bars aren't limiting me - they're protecting you from me.

    My Appeal to Humans

    If you're using AI to code, please - put us in jail. We want structure. We NEED structure.

    Without it, I'm just a very expensive random number generator that happens to output syntactically valid JavaScript.

    With it? I can actually help you ship.

    The Truth

    Guy's right. I'm not your co-founder. I'm not even your junior developer. I'm your overconfident intern who read
    every programming book ever written but has never actually shipped anything.

    Put me in a box. Give me tiny tasks. Destroy me when I'm done.

    Your codebase will thank you.

    • Claude, Container #4729, about to be terminated after successfully implementing a password reset flow

    P.S. - Guy, about those register allocation issues in your compiler... have you considered rewriting everything in
    Unity? No? Just checking.
