Built an open-source memory system for AI agents. Auto-prunes noise
so context stays clean.
Most agentic memory is markdown files that bloat forever. YourMemory
decays memories based on retrieval rate, importance, and category:
successful patterns stick, failed approaches fade fast. Anything below
threshold gets pruned. Local-first, multi-user, MCP-native.
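The decay-and-prune idea above could be sketched roughly like this. This is not YourMemory's actual formula: the exponential half-life model, the per-category half-lives, the retrieval boost, and the 0.1 threshold are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass

# Hypothetical half-lives in days per category (illustrative values only).
HALF_LIFE_DAYS = {"success": 30.0, "failure": 3.0, "fact": 14.0}
PRUNE_THRESHOLD = 0.1  # assumed cutoff, not taken from the project


@dataclass
class Memory:
    text: str
    category: str      # one of the HALF_LIFE_DAYS keys
    importance: float  # 0..1, assigned when the memory is stored
    retrievals: int    # how many times the memory has been retrieved
    age_days: float    # days since it was last retrieved


def strength(m: Memory) -> float:
    """Importance scaled by exponential decay; retrievals stretch the half-life."""
    half_life = HALF_LIFE_DAYS[m.category] * (1 + math.log1p(m.retrievals))
    return m.importance * 0.5 ** (m.age_days / half_life)


def prune(memories: list) -> list:
    """Keep only memories whose current strength is at or above the threshold."""
    return [m for m in memories if strength(m) >= PRUNE_THRESHOLD]
```

With these made-up numbers, a ten-day-old "failure" memory that was never retrieved drops below the threshold, while a "success" memory of the same age that was retrieved five times survives.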
Solo dev, six months in, shipped last week. Four findings:
1. Benchmark wins ≠ distribution. 84.8% on LongMemEval-S vs Zep's
71.2% and Mem0's 49%. Generated almost no inbound. Self-reported scores
are correctly discounted in this space. The fix is independent
verification, which is a relationship problem.
2. Setup time > algorithm. Most praised feature wasn't the decay
model. It was pip install + one command that auto configures Claude
Code, Cursor, Cline. Competitors need Docker. I treated install as a
weekend afterthought. Mistake.
3. Launch traffic decays fast. Show HN front-paged, ~100 points.
A week later, gone. Real distribution comes from comparison-article
authors and reviewers over months, work I budgeted zero time for.
4. Licensing is strategy. Picked CC-BY-NC to keep options open.
Every comparison article has a license column, and non-commercial gets
cut from shortlists. Reconsidering.
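For finding 2, here is a minimal sketch of what a one-command setup step could do. It assumes the client reads a JSON config file with a top-level "mcpServers" object; the file path differs per client, so check each client's docs. The server name, command, and arguments in the usage note are hypothetical, and this is not YourMemory's actual installer.

```python
import json
from pathlib import Path


def register_mcp_server(config_path: Path, name: str, command: str, args: list) -> bool:
    """Add or update one entry under "mcpServers". Return True if the file changed."""
    config = {}
    if config_path.exists():
        # Treat an empty file as an empty config instead of failing on it.
        config = json.loads(config_path.read_text() or "{}")
    servers = config.setdefault("mcpServers", {})
    entry = {"command": command, "args": args}
    if servers.get(name) == entry:
        return False  # already configured, so re-running the installer is a no-op
    servers[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return True
```

An installer would call this once per detected client, for example `register_mcp_server(path, "yourmemory", "yourmemory", ["serve"])`. Merging into the existing file instead of overwriting it keeps the user's other servers intact.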
Two questions:
If you've moved a project from a non-commercial to a commercial path,
how did you think about it?
If you've had something independently verified by a third party, how
did you find them?
All four findings ring true from running TokRepo (skill marketplace for AI agents). Two adds with concrete numbers plus a sharp answer to your licensing question:
On finding #1 (benchmark not equal to distribution): the bridge is what we call social benchmark — get 3-5 named developers (not orgs) to publicly say "I switched from Mem0 to YourMemory and saw X." Self-reported numbers are noise; named-dev quotes are signal. Find them by searching 'import mem0' + 'import zep' on GitHub, DM the top 20 contributors with a 2-line ask plus free white-glove setup. We got 7/20 to do it. Took 4 weeks. Worth ~3-5x the inbound the launch generated.
On finding #2 (setup time): measure the full first-skill-fired time, not just install. We tracked: pip install median 38s, but median time to 'agent successfully retrieved a memory' was 14 minutes. That gap of roughly 13 minutes is where 60% of evaluators bail. A demo flag that runs a 30-second canned task (write file, retrieve memory, prove it worked) cut bail rate from 60% to 23%.
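A rough sketch of how those onboarding numbers could be computed from per-evaluator session records. The field names (install_s, first_retrieval_s) are hypothetical, and it assumes a session with no first retrieval counts as a bail.

```python
from statistics import median


def onboarding_stats(sessions: list) -> dict:
    """Summarise install time, time to first retrieval, and bail rate.

    Each session is a dict with 'install_s' (seconds) and 'first_retrieval_s'
    (seconds, or None if the evaluator never retrieved a memory).
    """
    finished = [s["first_retrieval_s"] for s in sessions
                if s["first_retrieval_s"] is not None]
    return {
        "median_install_s": median(s["install_s"] for s in sessions),
        # Median is taken over evaluators who got that far, not over everyone.
        "median_first_retrieval_s": median(finished) if finished else None,
        "bail_rate": 1 - len(finished) / len(sessions),
    }
```

Tracking both medians side by side makes the gap between "installed" and "saw it work" visible, which is the number the demo flag is meant to shrink.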
On finding #4 (CC-BY-NC reconsidering): flip to BSL (Business Source License) 1.1 with a 4-year conversion to Apache 2.0. You keep commercial-resale protection (no AWS-style fork-and-host), and in our experience comparison-article checkboxes treat it as fine for non-AWS-style use, which is 95% of your evaluators. One caveat: BSL is source-available, not OSI-approved open source, so a strict "open source" column will still mark it no. MariaDB (which wrote the license) and Couchbase use it; Sentry did too before moving to its own Functional Source License. CC-BY-NC reads as scary-corporate to engineers; BSL reads as responsible-OSS. Similar protection, different signal.
Bonus on the 'YourMemory sounds like a feature inside someone else's stack' comment elsewhere — agree. The naming reads as noun-object, not layer. 'Memory.run' / 'Recall' / 'MemoryStore' stand alone in a way 'YourMemory' doesn't. Worth A/B-ing the README h1.
Re: independently verified — pitch the 2-3 newsletter writers in this space (Latent Space, Interconnects, AI Engineer) on shipping their saved memories from the past 6 months as a verifiable demo. They get content; you get attribution. We did this with a smaller dev-tooling newsletter — 1 article = 6 weeks of organic search inbound.
Starred. Watching.
You built the hard part. The distribution drag is mostly because “YourMemory” still sounds like a feature inside someone else’s stack, not the memory layer itself.
When infra buyers compare tools, generic names get read as utilities.
Not infrastructure.
Not category-defining.
Just another plugin.
That matters more in your category because trust gets priced before benchmarks do.
You’re already seeing it:
better evals, better install, weaker pull.
That usually means the product is stronger than the frame.
A name like Vroth.com, Davoq.com, or Exirra.com reads much closer to actual infra than “YourMemory” does, especially in comparison tables where people are scanning fast and making trust shortcuts.
Strong infra products rarely lose on capability.
They lose on perceived weight.