What changed after giving every customer their own AI agent

by Vishnu K

We expected “one AI agent per customer” to be expensive and messy. It actually removed more problems than it created.

A while ago we built Agent37cloud, where every customer gets their own AI agent through a single API call. That was essentially the whole initial design idea.

After running it with real users, a few things stood out.

First, isolation made things simpler than we expected. We stopped dealing with shared context bugs, state conflicts and users accidentally affecting each other. Each agent just runs in its own space with its own memory and tools. A lot of edge cases disappeared right away.
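A minimal sketch of what "its own space" can look like in code. The `Agent` and `AgentRegistry` names are hypothetical and only illustrate the idea; this is not Agent37cloud's actual API. Each customer ID maps to exactly one agent with private memory and tools, so nothing written for one customer is reachable from another.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical per-customer agent with private memory and tools."""
    customer_id: str
    memory: list[str] = field(default_factory=list)
    tools: dict[str, str] = field(default_factory=dict)

    def remember(self, note: str) -> None:
        self.memory.append(note)


class AgentRegistry:
    """Hands out one agent per customer; agents share no state."""

    def __init__(self) -> None:
        self._agents: dict[str, Agent] = {}

    def get(self, customer_id: str) -> Agent:
        # Create lazily on first use; later calls return the same instance.
        if customer_id not in self._agents:
            self._agents[customer_id] = Agent(customer_id)
        return self._agents[customer_id]


registry = AgentRegistry()
registry.get("acme").remember("prefers weekly summaries")
registry.get("globex").remember("billing contact changed")
```

Because each `Agent` owns its own list and dict, a bug in one customer's flow can only corrupt that customer's state.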

Second, usage was far from even. A few agents get used heavily; most are barely touched. That made us rethink how we account for compute. It’s not really a “per customer cost,” it’s a “per activity cost.”
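To make the difference concrete, here is a small sketch with made-up numbers. The request counts, the unit cost, and the "budget every agent like the busiest one" rule are all illustrative assumptions, not real figures. It compares a flat per-customer estimate against cost computed from actual activity.

```python
# Hypothetical monthly request counts per agent: a few hot, most nearly idle.
requests_per_agent = [50_000, 20_000, 300, 120, 40, 10, 0, 0, 0, 0]

COST_PER_REQUEST = 0.002  # assumed unit cost in dollars

# "Per customer" view (assumed rule): budget every agent like the busiest one.
per_customer_estimate = (
    len(requests_per_agent) * max(requests_per_agent) * COST_PER_REQUEST
)

# "Per activity" view: pay only for requests that actually happened.
per_activity_cost = sum(requests_per_agent) * COST_PER_REQUEST

# Share of all activity generated by the two busiest agents.
top_two = sorted(requests_per_agent, reverse=True)[:2]
top_two_share = sum(top_two) / sum(requests_per_agent)

print(f"per-customer estimate: ${per_customer_estimate:,.2f}")
print(f"per-activity cost:     ${per_activity_cost:,.2f}")
print(f"top 2 agents' share:   {top_two_share:.1%}")
```

With a distribution this skewed, the agent count says very little about the bill; the handful of hot agents does.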

Third, developers don’t really care about agents as a concept. They care whether it behaves like a stable backend worker they can trust through an API. The abstraction only matters if it stays invisible.

The biggest shift was mental. We stopped thinking in terms of chatbots or features and started thinking in terms of per customer execution environments that happen to be intelligent. That made scaling, debugging, and even reasoning about the system feel more straightforward.

Feels like the real problem isn’t giving each customer an agent. It’s trying to share too much in one system.

Curious how others are approaching this. Are you going with shared AI layers or isolating per customer by default?

Vishnu K on June 25, 2026

1

The point about developers not caring about agents as a concept feels right. They care whether it behaves like boring infrastructure they can trust. I see the same thing with DictaFlow. Nobody really buys AI dictation as an idea, they buy text showing up where the cursor is without breaking their workflow. The per activity cost framing is useful too, because the heavy users are the ones who tell you whether the system is reliable enough to become part of daily work.

ryanshrott · 2 hours ago
1

This is a really solid insight, especially the part about isolation reducing complexity. It sounds counterintuitive at first, but once you’ve dealt with shared state and context issues, it makes a lot of sense.

The shift from “per customer cost” to “per activity cost” is also important. A lot of systems are designed with uniform usage in mind, but real usage is always uneven.

I also agree that most developers don’t actually care about agents as a concept. What they want is something reliable that behaves like a stable backend service.

Personally, I lean toward isolating by default and only sharing at the infrastructure level, like models or embeddings. Shared application-level context tends to create more problems over time.

The only part I’d be curious about is how you’re handling latency and cold starts, since that’s usually the tradeoff with this kind of architecture.

Overall, this feels like a very natural direction for AI systems.

quill_ai · 2 hours ago
1

This is a really interesting takeaway — especially the part about isolation reducing complexity instead of adding it.

A lot of people assume shared context = efficiency, but in practice it usually turns into hidden coupling and weird edge cases. What you’re describing sounds closer to how we treat infra (containers, VMs) than how people typically think about AI.

The “per activity cost vs per customer cost” point also feels underrated. Most systems are over-optimized for equal distribution, but real usage is always spiky.

I also agree with the abstraction point — devs don’t care if it’s an “agent,” they care if it behaves predictably like a reliable service.

Feels like you’re leaning toward something like:
→ isolated execution by default
→ shared layers only where absolutely necessary (models, maybe embeddings)

Curious how you’re handling cold starts / latency though. That’s usually where per-customer isolation gets tricky.

quill_ai · 2 hours ago
2

This was a great read. I think personalized AI experiences are becoming a competitive advantage. It'd be interesting to see how this scales as your customer base grows.

ux_kena · 21 hours ago
1. 1
  
  Thanks! That's something we're paying close attention to as well. One thing we've learned so far is that isolation actually makes scaling easier to reason about because each customer's agent behaves independently instead of contributing to a shared pool of state. I'm sure new challenges will show up as we grow, but so far the operational simplicity has outweighed the added infrastructure complexity.
  
  an_engineer_log · 12 hours ago
2

The usage distribution insight is interesting. It feels a lot like cloud infrastructure where a small percentage of users generate most of the load. Did that change how you think about pricing as well?

tryandbuild · a day ago
1. 1
  
  Definitely. We started out thinking "cost per customer," but real usage looked much more like cloud workloads where a small percentage of users generate most of the activity. Once we looked at it that way, optimizing around activity patterns made a lot more sense than optimizing around the number of agents.
  
  an_engineer_log · a day ago
  1. 2
    
    That makes a lot of sense. It sounds like the agent itself becomes a pretty lightweight abstraction, while the real challenge shifts to managing bursts of activity efficiently.
    
    tryandbuild · a day ago
1

the mental model shift is the real finding here. once you stop thinking chatbot and start thinking execution environment, everything else gets cleaner imo.

growthbylenny · 4 hours ago
1

Interesting perspective. I think per-customer isolation becomes even more valuable as AI applications move into production. It not only reduces shared-state issues but also improves security, debugging, and customer trust. The real challenge seems to be balancing strong isolation with efficient resource utilization, especially when usage patterns are highly uneven. It'll be interesting to see how architectures evolve to optimize both reliability and cost without sacrificing either.

johnsmith0987 · 5 hours ago
1

Isolation of state is the part everyone's celebrating here, but it can hide a blast radius people forget about. Each agent's memory and execution are separate, but they almost certainly share upstream dependencies: the same external APIs, the same data sources, the same tools. The day one of those drifts or rate-limits, every isolated agent fails at once, and the per-tenant isolation gives you no protection at all. So the "stable backend worker" promise isn't really about the agent's own state. It's about how stable the things it reaches out to are. Curious whether you isolate at the dependency layer too, or if that's where the next class of "why did every agent get weird at 2pm" bugs is hiding.

ori_marti · 5 hours ago
1

The distribution shifted. When customers own the agent, retention flips: they're not using your product, they're using their tool now. How did usage patterns change between "feature you built" and "agent they control"?

Sandy_0517 · 9 hours ago
1

You said developers only care if it behaves like a stable backend worker they can trust through an API, but isolated per-customer agents with their own memory inherently behave less predictably over time than a stateless worker, since each one's "personality" drifts based on its own history. How do you reconcile "feels like a stable backend worker" with an architecture where every instance is, by design, diverging from every other instance?

adin_builds · 10 hours ago
1

The activity-based economics point is something I've run into building a fintech comparison product — usage patterns are wildly uneven and modeling costs per-customer rather than per-activity inflates your estimates significantly.

The isolation-vs-shared-learning tradeoff is the real architectural question here. Isolating execution and memory while pooling sanitized learning signals seems like the right middle ground, but curious how you're handling the cold start for genuinely new customers who have no prior activity to learn from. Do you seed their agent state with anything, or start fully blank?

oghabayen · 11 hours ago
1

The isolation insight maps to something broader. Shared state is where unpredictable behavior compounds. Per-customer isolation is not just cleaner architecture. It is also how you build something customers can actually trust.

"Stable backend worker" is the right frame. The moment the abstraction becomes visible, you have a support ticket.

Alex_Iliescu · 14 hours ago
1. 1
  
  Exactly. We found that isolation wasn't just an implementation detail—it simplified debugging and made the system much easier to reason about. And I like your point about the abstraction. If developers have to think about the agent instead of just trusting the API, something has already gone wrong. The best AI infrastructure is the kind that quietly does its job.
  
  an_engineer_log · 12 hours ago
  1. 1
    
    The audit trail question is the right one. Isolation gives you the visibility, but only if you log what the agent saw at decision time, not just what it did. Most implementations skip that and end up with the same black box problem, just scoped per customer instead of shared.
    
    Alex_Iliescu · 5 hours ago
1

the uneven usage pattern tracks with something I see in AI skills data too. within a single company the gap between your best and worst AI users is massive, same tools available to everyone. some people intuitively figure out how to get value from the agent, others barely touch it. per-customer agents solve the infra isolation problem but there's a whole user competency side that most teams just ignore completely

Ozzie · 14 hours ago
1. 1
  
  That's a really interesting point. We saw the infrastructure side of that pattern, but there's definitely a human side as well. Even with the same capabilities available, usage varies a lot between customers.
  I think those two things are connected. Per-customer agents give you clean isolation and visibility, which also makes it easier to understand how people are actually using the system and where they're getting stuck. Improving adoption becomes much easier when you can observe each agent independently instead of everything being mixed together.
  
  an_engineer_log · 12 hours ago
1

The isolation framing maps to what I am trying to test in agent-accountability tooling. Per-customer agents solve shared-state bleed, but they also make the evidence problem more visible: when agent A does something weird, can you reconstruct what it saw, which tools/memory it had, and why the action was allowed without trusting its summary?

I would lean isolated execution by default, with shared learning only through sanitized patterns or receipts. Curious whether Agent37cloud keeps an audit trail per agent, or mostly relies on app logs after the fact.
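One way to picture that kind of evidence trail, as a rough sketch: record what the agent had in front of it at decision time, not only the action it took. The `DecisionRecord` shape, its field names, and the sample values are hypothetical; nothing here reflects what Agent37cloud actually stores.

```python
import hashlib
import json
import time
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical audit entry captured before an agent acts."""
    agent_id: str
    timestamp: float
    inputs_seen: dict       # the context the agent was given
    memory_snapshot: list   # memory items visible at decision time
    tools_available: list   # tool names the agent could have called
    allowed_by: str         # which rule permitted the action
    action: str             # what the agent actually did

    def digest(self) -> str:
        # Content hash that can be handed out as a receipt for this record.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


audit_log: list[tuple[str, DecisionRecord]] = []


def record_decision(record: DecisionRecord) -> str:
    """Append the record together with its digest and return the digest."""
    receipt = record.digest()
    audit_log.append((receipt, record))
    return receipt


rec = DecisionRecord(
    agent_id="agent-A",
    timestamp=time.time(),
    inputs_seen={"user_message": "refund order 1234"},
    memory_snapshot=["customer is on annual plan"],
    tools_available=["lookup_order", "issue_refund"],
    allowed_by="refunds_under_50_usd",
    action="issue_refund(order=1234)",
)
receipt = record_decision(rec)
```

The record is written from the system's view of the agent's inputs, so reconstructing a decision does not depend on the agent's own summary of what happened.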

zaindanaharper · 15 hours ago
1

The isolation point is what caught my attention most.

We ran into the opposite problem initially — shared context across users. Every person's interaction influenced the next person's experience. Debugging a bad response meant untangling whose data leaked where.

Moving to per-tenant isolation killed a whole category of bugs overnight. But it introduced a new one: cold starts. Agent A knows user A well but has no idea what agent B learned. That shared learning is where the real value compounds, and isolation throws it away.

What we found was a middle ground: isolate the execution environment, but pool the learning signals. One agent's successful pattern becomes a suggestion for new agents, without exposing the data.

Curious if you hit the cold-start problem or if the per-activity cost model handled it naturally.
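A rough sketch of that middle ground, under stated assumptions: agents report only a coarse pattern label to a shared pool, and a label is suggested to new agents once several distinct agents have confirmed it. `SignalPool`, the labels, and the threshold of two agents are all made up for illustration.

```python
class SignalPool:
    """Hypothetical shared pool: holds pattern labels only, never customer data."""

    def __init__(self, min_agents: int = 2) -> None:
        self._min_agents = min_agents
        self._seen_by: dict[str, set[str]] = {}  # pattern label -> agent ids

    def report_success(self, agent_id: str, pattern: str) -> None:
        # Agents report a coarse label, not the conversation that produced it.
        self._seen_by.setdefault(pattern, set()).add(agent_id)

    def suggestions(self) -> list[str]:
        # Surface a pattern only once enough distinct agents have confirmed it.
        return sorted(
            pattern
            for pattern, agents in self._seen_by.items()
            if len(agents) >= self._min_agents
        )


pool = SignalPool(min_agents=2)
pool.report_success("agent-a", "summarize_before_acting")
pool.report_success("agent-b", "summarize_before_acting")
pool.report_success("agent-a", "retry_with_smaller_batch")

new_agent_hints = pool.suggestions()
```

Requiring confirmation from more than one agent is a simple guard against a single tenant's quirks leaking into everyone's defaults.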

paradox07 · 18 hours ago
1

The isolation point matches what I have seen too. Per-tenant agents kill a whole class of shared-state bugs that are painful to debug otherwise. The uneven usage is the tricky part. We ended up treating idle agents as cold and only keeping memory and tools warm for the active ones, otherwise compute creeps up fast. Did you keep each agent's memory fully separate, or share a base layer (embeddings, tool configs) and isolate only the conversation state? That split saved us a lot without breaking isolation.
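A rough sketch of the warm/cold split described above, assuming a hypothetical in-process pool: agents that have not been used within an idle window are dropped from memory and rebuilt on the next request. The names (`WarmPool`, `load_agent`) and the 300-second window are illustrative only.

```python
import time
from typing import Callable, Dict, Tuple


class WarmPool:
    """Keeps recently used agents in memory; evicts ones idle past a TTL."""

    def __init__(
        self,
        load_agent: Callable[[str], object],
        idle_ttl: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_agent = load_agent  # cold-start path (hypothetical loader)
        self._idle_ttl = idle_ttl
        self._clock = clock
        self._warm: Dict[str, Tuple[object, float]] = {}  # id -> (agent, last_used)
        self.cold_starts = 0

    def get(self, customer_id: str) -> object:
        now = self._clock()
        entry = self._warm.get(customer_id)
        if entry is None:
            agent = self._load_agent(customer_id)  # cold start
            self.cold_starts += 1
        else:
            agent = entry[0]
        self._warm[customer_id] = (agent, now)  # refresh last-used time
        return agent

    def evict_idle(self) -> int:
        """Drop agents unused for longer than idle_ttl; return how many were dropped."""
        now = self._clock()
        stale = [
            cid for cid, (_, last) in self._warm.items()
            if now - last > self._idle_ttl
        ]
        for cid in stale:
            del self._warm[cid]
        return len(stale)
```

Calling `evict_idle()` on a timer keeps memory proportional to active agents rather than total agents, at the price of a cold start when a dormant customer comes back.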

ahmet_ozel · 19 hours ago
1

The "isolation removed edge cases" point resonates a lot. We hit the same thing at the DB layer building Kumiko — shared-schema multi-tenancy where every query auto-injects tenantId has the same effect: a whole class of "tenant A sees tenant B data" bugs just disappears structurally.

To your question: we default to isolation everywhere possible, even when it feels over-engineered upfront. The hidden cost of shared state (debugging, support tickets, audit complexity) always outweighs the setup cost.

Wrote about the DB side of this last week if curious: https://dev.to/marc_kumiko/multi-tenancy-in-bunhono-without-boilerplate-2kgk
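For readers who want to see the shape of the auto-injected tenant filter, here is a small sketch using Python's built-in `sqlite3`. It is not Kumiko's implementation; the `TenantDB` wrapper and the `notes` table are made up for illustration. The point is that callers never write the tenant filter themselves, so they cannot forget it.

```python
import sqlite3


class TenantDB:
    """Hypothetical wrapper: every read and write is scoped to one tenant_id."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str) -> None:
        self._conn = conn
        self._tenant_id = tenant_id

    def add_note(self, body: str) -> None:
        # tenant_id is injected here, never supplied by the caller.
        self._conn.execute(
            "INSERT INTO notes (tenant_id, body) VALUES (?, ?)",
            (self._tenant_id, body),
        )

    def list_notes(self) -> list[str]:
        rows = self._conn.execute(
            "SELECT body FROM notes WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return [body for (body,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, body TEXT)"
)

tenant_a = TenantDB(conn, "tenant-a")
tenant_b = TenantDB(conn, "tenant-b")
tenant_a.add_note("A's private note")
tenant_b.add_note("B's private note")
```

Both tenants share one table, yet each handle can only ever see its own rows because the filter lives in the wrapper rather than in application code.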

marc_kumiko123 · 20 hours ago
1

I really like the point that users care more about reliability than whether it's called an "AI agent." The abstraction only matters if it solves a real problem.

It also makes me wonder: as AI products mature, do you think privacy and isolation will become bigger differentiators than model quality itself?

It feels like users are starting to ask "Where does my data go?" before they ask "Which model are you using?"

kuberagent123 · 20 hours ago
1. 1
  
  I think reliability, privacy, and isolation will become much bigger differentiators over time, especially for business use cases. Model quality still matters, but once models reach a similar baseline, people start evaluating the operational side instead.
  
  If developers can't trust where data goes or whether one customer's state can affect another, it doesn't matter how capable the model is. Trust ends up becoming part of the product, not just the infrastructure.
  
  an_engineer_log · 12 hours ago
1

The "per activity cost" point stood out to me. It feels similar to how people obsess over signups when activation is the metric that actually matters. We tend to optimize for the thing that's easiest to count instead of the thing that predicts success.

boothkeepos · a day ago
1. 1
  
  I like that analogy. We started out thinking the number of agents would be the key metric, but it turned out that activity told us much more about both system behavior and resource needs. It's easy to optimize for what you can count. The more useful metric is usually the one that reflects how the system is actually being used.
  
  an_engineer_log · 12 hours ago
1

The line that stood out to me was that developers don't really care about agents—they care about predictable behavior.

It feels like a lot of AI infrastructure wins happen when the AI disappears into a reliable abstraction, rather than becoming the thing developers have to think about every day.

aryan_sinh · a day ago
1. 1
  
  That's been one of the biggest lessons for us. Early conversations were all about agents, autonomy, and capabilities, but most developers ultimately judge the system on reliability and predictability. If they can make an API call and trust the result, the abstraction is doing its job. The more they have to think about the AI itself, the more likely it is that something in the underlying system isn't stable enough yet.
  
  an_engineer_log · a day ago
  1. 1
    
    That's exactly why I found the thread interesting.
    
    I think there's one decision underneath that observation that's easier to miss than it first appears.
    
    Happy to explain the thought properly over email if it's useful. What's the best address to reach you on?
    
    aryan_sinh · a day ago
1

The 'per-activity cost, not per-customer cost' line is the one with money attached. If most agents sit idle and a few run hot, flat per-seat pricing will quietly bleed you, so I'd price on activity and make idle agents cost near zero (scale-to-zero, cold start on demand). On isolation, I'd default to per-customer every time: the blast radius from one tenant's state leaking into another is a trust problem you can't undo, and trust is the whole product for a backend worker.

GregoryScottHenson · a day ago
1. 1
  
  I agree. The cost side ended up being much more about activity distribution than agent count. Once we saw how uneven usage was, it became clear that idle agents and active agents shouldn't be treated the same way operationally. And on isolation, that's exactly the conclusion we kept coming back to. Compute can be optimized, but trust is much harder to rebuild once it's lost. When developers are treating an agent like a backend worker, knowing that its state, memory and execution are fully separated becomes a pretty important guarantee.
  
  an_engineer_log · a day ago
1

We have seen something similar with multi-tenant systems in general. Shared resources look efficient at first, but debugging weird cross-customer issues can end up costing more than the savings.

Indietechi · a day ago
1. 1
  
  Exactly. On paper, sharing resources looks more efficient, but once you start chasing state leaks, context contamination, or customer-specific edge cases, the operational cost adds up quickly. One thing we noticed is that isolation makes failures much easier to reason about. When something goes wrong, you're looking at a single customer's environment instead of trying to untangle interactions across the entire system.
  
  an_engineer_log · a day ago
1

I like the framing of "per customer execution environments that happen to be intelligent". That's a very different mental model from the chatbot-first approach most people seem to be taking right now.

shipstack2016 · a day ago
1. 1
  
  That shift took us a while to realize as well. Early on, it's easy to think in terms of chat interfaces and conversations, but once you start operating these systems at scale, they start looking a lot more like dedicated services than chatbots. The intelligence becomes just one component. Isolation, state management, reliability, and observability end up being equally important parts of the design.
  
  an_engineer_log · a day ago