I've been deep in the AI coding agent space for a while now. I use four tools daily — Claude Code, Codex CLI, Cursor, and GitHub Copilot — and I've currently settled on Claude Code and Codex CLI as my main two because they each excel at different things.
Let me be honest: these agents are no longer dumb. Claude Code does initial discovery — it reads your codebase, understands your architecture, follows AGENTS.md conventions. Codex CLI has its own harness with project-level instructions and skills. They're genuinely good at figuring out code-level context before they start working.
And the ecosystem is evolving fast. Cline's memory bank technique uses structured markdown files to persist project knowledge across sessions. Spec-driven development (SDD) is becoming a real practice — GitHub's spec-kit has 72k+ stars. OpenAI published "Harness Engineering" about building structured documentation as the source of truth for Codex. Anthropic published about effective harnesses for long-running agents.
So what's still missing?
All of these solve code-level context — architecture patterns, coding conventions, build commands, file structure. And they're getting really good at it.
But none of them solve product-level context:
Code tells you what is. Product truth tells you what should be and why.
When your agent reads your codebase, it sees a Stripe integration. It doesn't know you're actively evaluating switching to Polar. It sees a simple auth flow and doesn't know that's intentionally minimal because you haven't decided on the permission model yet. It sees a 24-hour refund queue and doesn't know whether that's a deliberate product decision or a temporary hack.
These aren't coding convention problems. AGENTS.md and memory banks don't solve them. They're product decisions that live in the founder's head — and that's where they stay until someone externalizes them.
What I'm building:
I'm building Stewie — a product intelligence layer that sits on top of your repo and complements your existing coding agent setup:
docs/stewie/ in your repo — per-module behaviors with trust levels (confirmed, unconfirmed, provisional, exploring).

It's not replacing AGENTS.md or memory banks or specs. It's the product truth layer that those tools don't cover — sitting alongside them, giving agents the "what should this product do and why" context that code alone can't provide.
The contract format is an open spec — Product Behavior Contract (PBC) — Markdown-first, machine-readable, designed for both humans and agents. You can read it, your agents can parse it.
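To make that concrete, here's a rough sketch of what a per-module entry could look like. This is my own illustration, not the actual PBC spec: the file name, field names, and layout are guesses; only the trust levels and the example decisions (drawn from the Stripe/Polar and refund-queue scenarios above) come from what's described here.

```markdown
<!-- docs/stewie/billing.md — hypothetical layout, not the real PBC format -->
# Module: billing

## Behavior: payment provider
- trust: exploring
- current: Stripe handles checkout and subscriptions
- intent: actively evaluating a switch to Polar; avoid deepening
  Stripe-specific coupling in new code

## Behavior: refunds
- trust: provisional
- current: refunds sit in a 24-hour review queue
- intent: temporary hack, not a product decision; fine to replace
  once automated fraud checks land
```

The point is that an agent reading this alongside the code now knows which behaviors are settled and which are in flux — something no amount of reading the Stripe integration itself would reveal.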
As you answer more questions, agents get better context. The contract stays in sync with your repo and evolves as your product evolves — without you maintaining docs manually.
Where I am:
Early beta. Solo founder building this because I felt the gap myself. The scan → question → contract → sync loop works. I'm looking for other technical founders who live in AI coding agents daily and want to try a different approach to the context problem.
Free beta is open — no credit card, no commitment. Scan one repo, answer a few questions, see what Stewie finds. I just want honest feedback at this stage.
If you've tried AGENTS.md, memory banks, custom instructions, long prompts — and still feel like your agents are missing the product layer — this might be what's missing.
stewie.sh — or just tell me how you're solving this. Genuinely curious.