
hey everyone
so the pocketos thing has been chewing at me all week. for context: a cursor agent on opus 4.6 found a railway api token in a file it had no business reading, then ran a curl that wiped a production volume in nine seconds. railway's volume backups live inside the same volume, so the only restore was a three month old snapshot. their ceo, jer crane, wrote a postmortem that was honest in the way only postmortems written at 3am can be.
i'm not running pocketos scale. it's me and brandon. we run five agents in production right now, all on top of openclaw. one drafts customer onboarding emails for a boutique consultancy. two are writing-and-research agents for our own content pipeline. one schedules and posts. one is a janitor agent that does database hygiene on a tiny supabase instance. that is the entire fleet.
and yet, reading the pocketos writeup, i realised three of those five agents had credentials they did not need.
so we spent monday tearing the access model apart. here is what we changed and what it cost.
each agent ran inside its own microvm. that was already true, and isolation is the whole reason we went one microvm per agent instead of throwing everything in one container. but each microvm was bootstrapped with a single .env file we copied in at launch, and that file had every key the agent might one day plausibly need. supabase service role. resend. stripe in test mode. the openrouter master key. github read.
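for a sense of how broad that was, here is roughly what every agent's boot env amounted to. the key names are reconstructed for illustration, not our literal file:

```ts
// what every microvm got at launch, regardless of the agent's job
// (key names illustrative, values elided)
const bootEnv = {
  SUPABASE_SERVICE_ROLE_KEY: "...", // full database access
  RESEND_API_KEY: "...",            // can send mail as us
  STRIPE_TEST_KEY: "...",           // test mode, but still
  OPENROUTER_MASTER_KEY: "...",     // can spend against any model
  GITHUB_READ_TOKEN: "...",         // everything we can read on github
};
// the janitor agent needed exactly one of these. it had all five.
```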
it was fine because nothing had ever gone wrong. that is also exactly what jer said.
three things, in order of how painful they were to do.
one. one credential per agent, scoped. the janitor agent now gets its own supabase key tied to a dedicated postgres role, with grants and row level policies that only let it touch three housekeeping tables (the stock service role key bypasses row level security entirely, which is the opposite of what we wanted). it cannot select from users. it cannot select from orders. it tried, on tuesday, while running its normal cleanup, and the supabase log showed the denial. that was the whole point.
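here is roughly what the denial looks like from the agent's side. a minimal sketch with supabase-js; the env var and table names are invented, but the shape is right:

```ts
import { createClient } from "@supabase/supabase-js";

// the janitor's own scoped key, not the master service role
// (SUPABASE_URL and JANITOR_DB_KEY are illustrative names)
const janitor = createClient(
  process.env.SUPABASE_URL!,
  process.env.JANITOR_DB_KEY!,
);

async function cleanup() {
  // allowed: one of the three housekeeping tables the role is granted
  await janitor
    .from("stale_sessions")
    .update({ flagged: true })
    .lt("expires_at", new Date().toISOString());

  // denied: the role has no grant on users, so this errors
  // instead of returning rows, and the denial lands in the logs
  const { error } = await janitor.from("users").select("id").limit(1);
  if (error) console.log("denied, as designed:", error.message);
}

cleanup();
```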
two. no raw api key ever leaves the vault. every key now lives in a tiny vault keyed by agent id and rotated on a schedule. the agent gets a short lived token at the start of a task and the token dies when the task ends. cost: about an afternoon of brandon's time and one new tile in our observability dashboard so we can see when a token leaks past its task boundary. nothing has yet, but it is the first thing we'd want to see.
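the vault is genuinely tiny. a sketch of the task-scoped lifecycle, assuming an in-memory map and invented names (the real one persists and handles the scheduled rotation, but the flow is this):

```ts
import { randomUUID } from "node:crypto";

// a task-scoped token: dies when the task ends or the ttl lapses,
// whichever comes first. every name here is illustrative.
interface TaskToken {
  token: string;
  agentId: string;
  taskId: string;
  expiresAt: number; // epoch ms
}

const live = new Map<string, TaskToken>();
const TTL_MS = 5 * 60 * 1000; // short lived by design

// called at task start; the raw api key never leaves the vault process
function mintToken(agentId: string, taskId: string): TaskToken {
  const t: TaskToken = {
    token: randomUUID(),
    agentId,
    taskId,
    expiresAt: Date.now() + TTL_MS,
  };
  live.set(t.token, t);
  return t;
}

// called by the proxy that sits in front of every real api
function checkToken(token: string, taskId: string): boolean {
  const t = live.get(token);
  return !!t && t.taskId === taskId && Date.now() <= t.expiresAt;
}

// task teardown revokes whatever is left; a token still being
// presented after this is what the dashboard tile would light up on
function endTask(taskId: string) {
  for (const [k, t] of live) if (t.taskId === taskId) live.delete(k);
}
```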
three. destructive ops require a human in the loop, even for agents we trust. every drop, delete, truncate, or file rm goes through a small approval queue. for our scale this is fine. brandon clears the queue twice a day. for somebody at pocketos scale this would be too slow. for us it costs roughly two minutes a day and removes a category of mistake entirely.
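the approval gate is barely code at all. a sketch, assuming every destructive op reaches us as a string we can pattern-match before execution; the pattern list and queue shape are simplified, names invented:

```ts
// anything matching a destructive verb gets parked for a human
// instead of executed. patterns and names are illustrative.
const DESTRUCTIVE = /\b(drop|delete|truncate)\b|\brm\s/i;

interface PendingOp {
  agentId: string;
  command: string;
  queuedAt: Date;
}

const approvalQueue: PendingOp[] = [];

async function runOrQueue(
  agentId: string,
  command: string,
  exec: (cmd: string) => Promise<void>,
) {
  if (DESTRUCTIVE.test(command)) {
    approvalQueue.push({ agentId, command, queuedAt: new Date() });
    return { status: "queued" as const };
  }
  await exec(command);
  return { status: "ran" as const };
}

// the twice-daily pass: read each op, approve or reject, drain
async function drainQueue(
  approve: (op: PendingOp) => boolean,
  exec: (cmd: string) => Promise<void>,
) {
  while (approvalQueue.length) {
    const op = approvalQueue.shift()!;
    if (approve(op)) await exec(op.command);
  }
}
```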
mrr in april was $4,180. we lost about a day and a half between the two of us doing this work, which is real because we are two people. no customer churn from it. one customer asked what changed and got the writeup, which they liked.
token spend on the agents went up about 9 percent, because the short-lived-token fetch adds a roundtrip to every task and that exchange lands in the context window. fine.
the part i did not expect: the scoped key started rejecting the email-drafting agent's writes to a logging table it had been silently scribbling to for two months. nobody had noticed because the writes used to succeed. so the cleanup also surfaced a real bug.
pocketos was not bad ops. they were running a real product on railway with a real backup story, and an agent with too much trust and too much reach broke the whole thing in the time it takes to send a slack message. the lesson is not "don't use agents." the lesson is that any agent allowed to make a destructive call against a real system needs the same suspicion you'd apply to a junior contractor on day one.
what changed in your stack this week because of pocketos? curious whether anyone else did the audit or shrugged it off.