I just got back from STEP 2026 in Dubai. Whilst there were some genuinely amazing businesses there, I also saw a lot of companies that won’t make it past their first year.
Most startups now splash AI onto all their marketing. AI is not your product. AI itself does not deliver business value. Unless you are a frontier lab, AI is nothing more than a tool in your stack. Nobody is out there shouting ‘MongoDB-enabled trading platform’.
Users don’t care if it’s AI. Investors don’t care if it’s AI. They care about what it does, what problem it solves and whether there’s space for it in the market.
And if you want to sell to real businesses? I've sat across the table from $5bn consultancies evaluating AI tools. They ask about your architecture, your data residency, how to deploy it on-prem and what you actually own. If the answer is 'we call the OpenAI API' – the meeting is over.
Wrappers… Everywhere
There are tens of thousands of AI startups right now whose core premise is a thin wrapper around a frontier model’s API.
This is not a business. Your users could most likely just use ChatGPT – why would they want another subscription?
It’s not defensible. There’s no IP there. There’s nothing unique. On the contrary, your whole business is at risk from changes to a model.
Remember when everyone built apps on top of Twitter and then Twitter changed its API rules overnight? That can happen to you if you’re just wrapping a model. It’s even worse here, as the frontier labs have every incentive to compete against you when you come up with a good, simple idea.
Let’s not even get into the fact that you’re exposed to a huge cost base you don’t control: you aren’t in charge of input or output tokens, and you just rack up an AI bill behind the scenes.
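To make the cost exposure concrete, here is a minimal sketch. The per-token prices and request volumes are placeholders I made up for illustration, not real vendor pricing – the point is only that the bill scales with numbers you don’t set.

```python
# Illustrative only: these per-token prices are assumptions, not real pricing.
INPUT_PRICE_PER_1K = 0.005    # $ per 1K input tokens (assumed)
OUTPUT_PRICE_PER_1K = 0.015   # $ per 1K output tokens (assumed)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single model call. You control neither number directly:
    users decide the input size, the model decides the output size."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

def monthly_bill(requests_per_day: int, avg_in: int, avg_out: int, days: int = 30) -> float:
    """Naive monthly projection at a fixed request volume."""
    return requests_per_day * days * request_cost(avg_in, avg_out)

# 10k requests/day with verbose model responses: the bill grows with tokens
# the model chooses to emit, not with anything you priced for.
print(round(monthly_bill(10_000, avg_in=1_500, avg_out=2_000), 2))
```

If users can paste arbitrarily long inputs, or a model update starts answering more verbosely, both multipliers move against you without any change on your side.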
The playbook right now seems to be: call the OpenAI API, wrap it in a UI, charge a subscription.
You’re doing market research for OpenAI – and they can execute better than you can.
Stop doing this.
Vibe Coding Is Making This Worse
My most successful pitch for Brunelly (https://go.brunelly.com/indiehackers) at STEP 2026 was ‘You know what vibe coding is, right? We’re the opposite of that. We actually create real-world, enterprise-quality software’.
That has to be the opener because vibe coding has earned such a bad reputation in the real world. Security, scalability, deployments, infrastructure management, compliance – all non-existent, and bugs everywhere.
And vibe-coded AI products take the worst of all worlds: the simplest AI wrapper around some basic CRUD operations, with no scalability.
Please stop.
There’s A Better Way To Do AI
I’ve spent the last year building Maitento – our AI-native operating system. Think of it as a cross between Unix and AWS but AI native. Models are drivers. There are different process types (Linux containers, AIs interacting with each other, apps developed in our own programming language, code generation orchestration). Every agent can connect to any OpenAPI or MCP server out there. Applications are defined declaratively. Shell. RAG. Memory system. Context management. Multi-modal. There’s a lot.
This is the iceberg we needed to create a real enterprise-ready AI-enabled application.
Why did we need it? Extensibility. Quality. Scalability. Performance. Speed of development. Duct-taping a bunch of Python scripts together didn’t cut it.
I’m not saying you need the level of orchestration that we have – but I wanted to emphasise that the moving pieces in enterprise-grade AI orchestration are far more complex than a single API call.
Do you think ChatGPT is just a wrapper around OpenAI’s own API with some system prompts? There’s file management, prompt injection detection, context analysis, memory management, rolling context windows, deployments, scalability, backend queueing, real-time streaming across millions of users, multi-modal input, distributed Python execution environments. ChatGPT itself has a ‘call the model’ step, but it’s the tiniest part of the overall infrastructure.
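To give a flavour of one item on that list, here is a minimal sketch of a rolling context window. This is not how ChatGPT actually implements it – it’s an illustrative toy, and the whitespace-based token count stands in for a real tokenizer.

```python
from dataclasses import dataclass, field

@dataclass
class RollingContext:
    """Toy rolling context window: keep the system prompt pinned and
    evict the oldest conversation turns once the token budget is exceeded."""
    system_prompt: str
    max_tokens: int
    turns: list = field(default_factory=list)  # (role, text) pairs

    def _tokens(self, text: str) -> int:
        return len(text.split())  # stand-in for a real tokenizer

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))
        budget = self.max_tokens - self._tokens(self.system_prompt)
        # Drop oldest turns first until the conversation fits the budget.
        while sum(self._tokens(t) for _, t in self.turns) > budget:
            self.turns.pop(0)

    def render(self) -> list:
        """Messages to send: pinned system prompt plus surviving turns."""
        return [("system", self.system_prompt)] + self.turns
```

Even this toy hides real decisions – what to pin, what to summarise instead of drop, how to count tokens – which is exactly the kind of infrastructure that sits below the ‘call the model’ step.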
The Uncomfortable Truth
It’s easy to call an API. It’s far harder to build real infrastructure than many founders realise.
Founders want to ship, so they rush to deliver. But that doesn’t mean you’re actually building a business – you’re building a tech demo.
A demo is not a product. It’s a controlled environment that doesn’t replicate reality.
The gap between impressive demo and production-grade product in AI is wider than in any other category of software. Because AI systems fail in ways that traditional software doesn't. They hallucinate, they lose context, they confidently produce wrong outputs.
Managing that failure mode requires infrastructure. Real infrastructure. Not a try/catch block around an API call.
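A minimal sketch of what that difference looks like in code, under assumed interfaces (`call_model`, the validator, and the retry counts are all hypothetical): the point is that outputs get validated, retries back off, and failure degrades explicitly instead of silently passing hallucinated data downstream.

```python
import json
import time

class ModelFailure(Exception):
    """Raised when no acceptable output can be produced."""

def call_model(prompt: str) -> str:
    """Placeholder for a real model call; assumed to return JSON text."""
    raise NotImplementedError

def robust_extract(prompt, validate, retries=3, fallback=None, model=call_model):
    """Beyond a bare try/except: parse, validate, retry with backoff,
    and take an explicit degraded path rather than trusting the output."""
    for attempt in range(retries):
        try:
            raw = model(prompt)
            data = json.loads(raw)      # reject non-JSON output outright
            if validate(data):          # reject well-formed nonsense too
                return data
        except (ModelFailure, json.JSONDecodeError):
            pass                        # fall through to backoff + retry
        time.sleep(2 ** attempt * 0.1)  # exponential backoff between tries
    if fallback is not None:
        return fallback                 # explicit, visible degraded result
    raise ModelFailure(f"no valid output after {retries} attempts")
```

Even this sketch is only the first layer – real systems also need logging, cost caps, and human escalation – but it shows why ‘handle the failure’ is a design problem, not an exception handler.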
Build Something That Matters
The AI gold rush is producing a lot of shovels.
Most of those shovels are made of cardboard.
The companies that will still exist in five years are the ones building real infrastructure today. Not just calling APIs. Not chaining prompts. Not wrapping someone else's intelligence in a pretty interface and calling it innovation.
Build the thing that's hard to build. That's the only strategy that works. It always has been.
If you were able to build it in a few days, so can anyone else.
If it’s difficult for you, then it’s difficult for your competitors too.
And then you may actually have a genuinely novel business.
Love the clarity here, especially when you say “Build the thing that's hard to build. That's the only strategy that works. It always has been”. What surprised you most after shipping – acquisition or activation?
The Twitter API analogy really nails it. I watched a friend build an entire business on top of Twilio's SMS API back in the day, and when pricing changed overnight his margins evaporated. Same energy here with AI wrappers.
What I find interesting though is the middle ground nobody talks about. There's a huge space between "just calling OpenAI" and "building your own OS from scratch." I've been working on dev tools for a while now, and the most defensible stuff I've seen is when teams build really opinionated workflows around a specific domain — like, the AI call is maybe 10% of the code, but the other 90% is gnarly business logic that took months of user interviews to get right.
The vibe coding point hits different too. I've reviewed PRs from vibe-coded projects and the security holes are... creative, let's say. No input validation, hardcoded secrets, SQL injection vectors everywhere. It's fine for prototyping but shipping that to production is genuinely dangerous.
Honest question though — do you think there's a timeline where the "wrapper" label stops being useful? At some point every SaaS product is a wrapper around postgres and stripe. The distinction might be less about what you're wrapping and more about how much domain knowledge is baked into the product.
The distinction I'd make: AI as the foundation vs AI as a feature. If your pitch is 'we use AI', that's the implementation. If your pitch is 'we cut your support tickets by 80% and happen to use AI', that's a business. Founders who confuse their tech stack with their value prop usually don't last. But there are real businesses being built on AI — the ones that start with a specific problem, not with the technology.
Great reality check. I've been wrestling with this exact concept.
Chatbot wrappers are actively ruining how students learn complex algorithms because they just spit out the answer. I'm working on a completely chat-less, proactive AI mentor that runs on a background event loop, triggering only on specific signals (idle time, particular compilation errors, AST-level logic detection).
The value is in the domain-specific workflow, rather than the model itself. Curious to hear your take—do you see event-driven, invisible AI as a stronger moat than the standard chat UI?
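The trigger pattern described above could be sketched roughly like this. All the rule names, event shapes, and thresholds here are hypothetical – just one way an event-driven, invisible mentor might decide when to surface a hint.

```python
from typing import Optional

# Hypothetical trigger rules: fire only on specific signals, never on
# every keystroke. Event shapes and thresholds are invented for the sketch.
TRIGGERS = [
    ("idle",
     lambda ev: ev.get("type") == "idle" and ev.get("seconds", 0) >= 120),
    ("compile_error",
     lambda ev: ev.get("type") == "compile" and "undefined" in ev.get("message", "")),
    ("infinite_loop",
     lambda ev: ev.get("type") == "ast" and ev.get("pattern") == "while_true_no_break"),
]

def mentor_step(event: dict) -> Optional[str]:
    """One tick of the background loop: return a hint topic if any rule
    matches, otherwise stay invisible by returning None."""
    for name, rule in TRIGGERS:
        if rule(event):
            return name
    return None
```

The interesting part is what’s absent: no chat surface, no reaction to ordinary events – the product only exists at the moments the rules say it should.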
Basically saying an AI side project isn’t automatically a legit business.
yeah—no one is shouting “MongoDB‑enabled trading platform” because that phrase is pure inside-game. Humans buy outcomes (“trade faster”, “don’t lose money”, “compliance-ready”), not implementation details.
Will AI one day be a thing of the past? Probably yes... As my grandma says
Agreed. AI can do lots of things, but not everything. I've also built a tool, and AI couldn't solve so many problems until a lot of human effort was put into training and testing it.
Absolutely — AI becomes truly effective only with strong human-driven data curation, training, and rigorous testing. I specialize in building and optimizing AI pipelines (ComfyUI workflows, LoRA training, and agentic automation), ensuring reliable and scalable results. If you’re open, I’d love to collaborate or help refine your tool to solve those remaining challenges.
Absolutely! We believe in human involvement, as AI is only as smart as we educate it. It misses context, and that's where the importance of humans comes in. When developing Brunelly, my thought was to write for humans first, structure for AI second. AI is the tool, but my main goal is and always will be to aid developers
I agree with this. AI is not the product, it's just an adjunct to make the product easier to use.
Exactly that! The number of companies at STEP that labelled themselves as AI was ridiculous – it's plainly false advertising. They set themselves up to disappoint potential users and clients, as they were advertising AI as their product (which is impossible to do)
Totally agree — many “AI-first” claims are just rebranded automation without real model training, evaluation, or human-in-the-loop systems behind them. That gap is exactly where strong engineering and testing make the difference. I work on building reliable AI workflows (LoRA training, ComfyUI pipelines, and agent-based systems) that actually deliver measurable outcomes. If you’re refining your tool or want to push it beyond basic automation, I’d be glad to collaborate.
This hits hard, but it’s true. I’ve built a few AI features myself, and the real work wasn’t calling the model — it was everything around it: handling bad outputs, making it reliable, and fitting it into a real workflow.
What I’ve learned is simple: users don’t pay for AI, they pay for a problem being solved reliably. The AI is just one small part.
The builders who focus on ownership, workflow, and trust — not just the wrapper — are the ones who’ll still be here in a few years.
Exactly this, and you’ve articulated the bit that usually only clicks after someone’s been burned a few times.
Calling the model is the easy, almost irrelevant part. That’s the demo. The real work starts the first time it produces garbage at 2am and you realise there’s no such thing as “just one more prompt tweak.”
We hit the same wall early on. The question stopped being “how good is the model?” and became “what happens when it’s wrong, uncertain, slow, expensive, or confidently hallucinating?” That’s where most wrappers fall apart because there’s no ownership of failure, no workflow awareness, and no way to recover without a human stepping in.
You’re dead right on the money part too. Nobody pays for “AI.” They pay for outcomes they can trust, repeatedly, inside a workflow that doesn’t fight them. Trust is earned through constraints, guardrails, explainability, and boring-but-critical infrastructure. Not clever prompts.
The uncomfortable truth is that if your product only exists because a model behaves today, you don’t really own anything. And the moment that model changes, so does your business.
The teams still standing in a few years will be the ones who treated AI like a liability to be managed, not a magic trick to be demoed.
The postmortem data backs this up completely.
I've spent the last few months going through 100+ startup failure postmortems, and one of the clearest patterns is what I call 'feature-as-product' failure — building a standalone product around something that's destined to become a default feature of a bigger platform.
The Twitter API cautionary tale you mentioned is a perfect example. The same thing has happened to plenty of other platform-dependent products.
The question that kills bad AI wrapper ideas in 30 seconds: 'Would this feature make sense in OpenAI's roadmap?'
If yes — you're building their R&D, not your company.
The founders who actually build defensible AI businesses are doing one of three things: (1) sitting on proprietary data no one else has, (2) building deep workflow integration that makes switching costs high, or (3) serving a regulated/compliance-heavy market where frontier labs won't go.
Everything else is a race to the bottom against people with more compute and better distribution.
Good post — this needs to be said more.
This is a great articulation of it. “Feature-as-product” is exactly the failure mode, and once you see it you can’t unsee it.
The OpenAI-roadmap test is brutal but fair. We run a similar thought experiment internally: if the model provider shipped this natively tomorrow, what would actually break? If the honest answer is “our landing page,” you’re in trouble.
What bites founders is that these products do show early traction. Of course they do, they remove friction for a moment. But that traction is misleading because it’s borrowed, not owned. You’re riding someone else’s capability curve, and they control the slope.
The three buckets you called out are spot on, and I’d add a nasty footnote: even proprietary data isn’t enough unless it’s structurally embedded in the workflow. A CSV in S3 isn’t a moat. A system that continuously compounds data because users rely on it day-to-day is.
The regulated angle is also under-appreciated. Frontier labs optimize for breadth and speed; they actively avoid the slow, ugly constraints of compliance, deployment models, and accountability. That’s where real businesses get built but it’s also where demos go to die.
Most people are accidentally building features because features are fun and shippable. Infrastructure, workflows, and failure handling are boring, expensive, and hard to explain on Twitter, which is precisely why they’re defensible.
Appreciate the comment. This kind of pattern-spotting is what saves founders years of building something that was always going to be absorbed.
I think a lot of founders confuse feature velocity with business defensibility.
Calling an API isn’t the hard part – building reliable systems around it is.
I work mostly with early-stage SaaS teams and the biggest gap I see isn’t model quality, it’s operational integration and clear value communication.
Curious, do you think smaller startups should focus on niche workflow depth instead of infrastructure breadth?
Absolutely! I’d go one step further: for smaller startups, infrastructure breadth is usually a trap.
Early teams don’t win by out-building the platforms. They win by out-understanding a very specific workflow and owning it end to end.
A few hard-earned observations from our side:
Niche workflow depth beats generic infrastructure every time. If you deeply understand one painful, repeatable workflow, you can build opinionated systems that feel “obvious” to users. That creates trust. Broad infrastructure without that context just becomes a thin abstraction layer.
Infrastructure should emerge from pain, not ambition. Most founders try to preemptively build “platforms.” In reality, the right infra shows up when your workflow keeps breaking in the same places. That’s when it’s worth hardening.
Operational integration is the product. Users don’t buy models or features. They buy fewer decisions, fewer handoffs, and fewer failure modes. If your product removes steps they hate or eliminates classes of mistakes, you’re already ahead of 90% of AI tools.
Clear value communication follows depth. When you’re deep in a niche, your messaging gets sharper because it’s grounded in lived problems, not abstract capability. “We handle this mess so you don’t” beats “we’re an AI-powered platform” every time.
So yes: start narrow, go deep, and earn the right to generalize later. Infrastructure breadth only makes sense once you’ve proven there’s something worth scaling. Until then, it’s just expensive confidence.
Great question, this is exactly the trade-off more founders should be wrestling with.
I agree that “AI” has become a buzzword and is often used without delivering real value.
I’m currently building a tool that connects to Google Search Console and SERP data, uses LLMs to identify ranking issues, and then automatically generates fixes, even creating GitHub PRs with the proposed changes.
So what do you think – is this AI product not a real business?
What’s your honest take on this?
Great question, and I’ll give you the honest, non-marketing answer.
What you described can be a real business, but it very easily slips into “feature-as-product” territory if you’re not careful.
Connecting to Search Console + SERP data, analysing issues, generating fixes, and opening PRs is genuinely useful. That’s not nothing. The question isn’t “is there AI involved?” – it’s where the defensibility and trust live.
A few technical litmus tests I’d apply:
Are you solving a workflow end-to-end, or just automating a clever step?
If your product owns the full loop — diagnosis → prioritisation → execution → validation → rollback — you’re building a system.
If you’re mostly “spot issue → generate patch → PR”, you’re closer to a feature that a bigger platform will absorb.
Who is accountable when the AI is wrong?
SEO changes can tank traffic just as easily as they can improve it. If the answer is “the user reviews the PR and hopes for the best”, that’s fragile.
If your system can explain why it made a change, estimate impact, detect regressions, and learn from outcomes, then you’re building trust, not just automation.
What do you own that Google / OpenAI / GitHub don’t?
If your value disappears the moment Google adds “AI suggestions” to Search Console, that’s a warning sign.
Defensibility might be proprietary data, deep workflow integration, or a feedback loop your competitors can’t copy.
The strongest products hide the AI almost entirely. Users pay for predictable improvements, not clever generation.
If your pitch is “we use LLMs to…”, you’re already on thin ice. If it’s “we reliably prevent SEO regressions and surface the highest-leverage fixes”, that’s a business.
So no, I wouldn’t automatically call what you’re building “not a real business”.
But I would say this: the difference between a business and a demo is whether you’re building guardrails, accountability, and learning loops or just output.
Most AI products fail because they stop at generation.
The ones that survive take responsibility for outcomes.
If you’re doing the latter, you’re on the right side of this argument.
Building something truly useful doesn’t seem easy. Too many products are repetitive.
This is one of the most honest and necessary takes on AI startups right now.
So many founders are just wrapping APIs, chasing trends, and calling it a product—without infrastructure, domain value, or real defensibility.
Users don’t pay for AI; they pay for solved problems.
Building something that lasts means going beyond the shiny demo and engineering real reliability, scale, and ownership.
Great read and a much-needed wake-up call.
Well said. What’s interesting is that this isn’t just an AI problem, it’s a systems thinking problem.
Many products are feature driven instead of architecture driven. And when AI fails (which it inevitably does), the surrounding system determines whether the user trusts it or abandons it.
The infrastructure layer is where the real business value seems to be forming.
Totally agree.
Feature-driven products can win demos, but only architecture-driven systems can win long-term trust, especially when AI behaves unpredictably.
The infrastructure isn’t just supporting the model—it is the product. That’s where reliability, safety, and defensibility really live.
I think this is one of the most important discussions in AI right now.
Calling an API is easy. Building something resilient, scalable and defensible is not.
But I also wonder — for early-stage founders, isn’t the wrapper phase sometimes a way to validate demand before investing in deep infrastructure?
At what point do you think a product crosses the line from “wrapper” to “real business”?
Strong take. The only “AI products” that survive are the ones where the model is a component, not the value.
We’ve found the defensibility is in infrastructure + artifacts: audit trails, reproducible runs, verification gates, safe patch/rollback, on-prem friendliness, and outputs users can hand to other humans (reports, diffs, checklists). The model is just the engine.
The wrapper era ends fast; the “reliable AI systems” era is the real opportunity.
the argument is solid but I'd push back on one implicit assumption: that the only viable path is building infrastructure from scratch.
the real line isn't "wrapper vs real business" — it's "do you own something that compounds?" that could be infrastructure, but it could also be a proprietary dataset, a trained fine-tune, a distribution moat, or domain-specific logic that took months of iteration to get right.
the deeper issue with most wrappers isn't the tech stack — it's that there's no accumulation. every user session starts from scratch, every output is ephemeral, nothing gets smarter over time. that's what makes them vulnerable to model providers shipping the same feature in a product update.
"if someone can replicate your product by spending a weekend with the same API, you don't have a business" — that's the actual test. infrastructure is one way to fail that test. not the only one.
the wrapper problem is real but i think the nuance is that some wrappers do solve real workflow problems - the key is whether you own the data layer and domain logic. building an AI video pipeline right now and the actual value isn't the model calls, it's the orchestration - batching images for character consistency, mixing still frames with selective animation, managing costs across multiple API providers. the model is maybe 5% of the codebase. if someone can replicate your product by spending a weekend with the same API, you don't have a business. if they need months of domain-specific iteration to match your output quality, you might.
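One slice of that orchestration layer – routing jobs across providers by cost and quality – could be sketched like this. The provider names, prices, and quality scores are invented; the logic is just one plausible shape for the ‘managing costs across multiple API providers’ part.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Provider:
    name: str
    cost_per_image: float   # hypothetical pricing, not real vendor rates
    quality: float          # internal score from past runs, 0..1 (assumed)

def pick_provider(providers: list, min_quality: float,
                  budget_left: float) -> Optional[Provider]:
    """Route a generation job to the cheapest provider that clears the
    quality floor and fits the remaining budget. Logic like this, not
    the model call itself, is where most of the codebase lives."""
    eligible = [p for p in providers
                if p.quality >= min_quality and p.cost_per_image <= budget_left]
    if not eligible:
        return None  # caller decides: queue the job, or relax the floor
    return min(eligible, key=lambda p: p.cost_per_image)
```

The defensibility lives in the quality scores themselves – they only exist because you’ve run thousands of jobs and measured the results, which is exactly the months of domain-specific iteration a weekend clone can’t copy.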
Hard agree on the wrapper critique but I think the answer doesn't have to be "build an OS". The middle ground is domain expertise encoded into infrastructure.
I build tools for accountants and bookkeepers. The pattern matching, data normalisation, platform integration plumbing, and crowd-sourced knowledge base are 95% of the work. There's a thin layer of ML for edge cases but the business value comes from understanding how bookkeepers actually work - not from calling an API.
The thing that makes it defensible isn't complexity for its own sake, it's accumulated domain knowledge that took months of talking to real users and processing real data to build. No wrapper can replicate that because the hard part was never the AI call - it was figuring out what to do with the output.
Quite impressive – you have highlighted what many do not care about. Thank you.