I keep running into the same problem with AI tools:
They're great at reasoning, but terrible at remembering. Important context gets lost across sessions and I keep having to re-feed it (I guess I'm not the only one).
That became painful enough that I ended up building Kumbukum — an open source memory infrastructure for teams and AI tools.
The idea is simple: make context persistent, searchable, inspectable, and editable, so assistants can pull the right information instead of starting from scratch every time. Crucially, I wanted to build something that's not just for AI tools, but for teams in general. So you get a clean UI to manage your team's collective knowledge, plus an API that any tool can integrate with. I wanted something teams can actually read, manage, edit, and self-host if they want.
Right now it supports things like:
• notes
• memories
• URLs (with whole-site indexing)
• relationships between them
• Git sync
• and I'm currently adding email too
It also includes a browser extension that can extract information from any webpage and send it to Kumbukum with one click.
I'm curious how others here are handling this.
Are you:
• just relying on chat history?
• summarizing manually between sessions?
• using RAG on top of docs?
• building your own internal memory system?
• using MCP-based setups already?
Would genuinely love to hear what's working and what still feels broken.
If useful for context:
• https://kumbukum.com
• https://github.com/kumbukum/kumbukum
The persistent context problem is actually two distinct problems. Storage is the easier half. The harder part is retrieval precision: which context is relevant now, not which context exists. RAG approaches solve the first and often fail the second. The MCP pattern is interesting here because it moves the retrieval decision to the host, not the model.
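To make that storage-vs-retrieval split concrete, here's a minimal sketch in plain Python with toy 3-dimensional vectors (the store contents, threshold, and helper names are all illustrative, not from any particular system): the store can hold everything, but a relevance cutoff decides what actually gets surfaced for the current query.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, store, k=2, min_score=0.5):
    """Return up to k items, but only those that clear a relevance bar.

    Storing context is the easy half; the min_score cutoff is the
    "which context is relevant NOW" half.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in store]
    scored.sort(reverse=True)
    return [text for score, text in scored[:k] if score >= min_score]

# Toy memory store: (text, embedding) pairs.
store = [
    ("deploy runbook", [0.9, 0.1, 0.0]),
    ("api naming conventions", [0.1, 0.9, 0.1]),
    ("old meeting notes", [0.0, 0.1, 0.9]),
]

# A query about deployment: everything "exists", but only one item
# is relevant enough to surface.
print(retrieve([0.95, 0.05, 0.0], store))  # → ['deploy runbook']
```

A naive RAG setup would return the top-k regardless of score; the threshold (or a smarter host-side decision, as in the MCP pattern) is what keeps weakly related context from flooding the prompt.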
This is a real problem. The constant context reset gets frustrating fast.
I like that you’re treating memory as something teams can actually see and manage, not just something hidden behind prompts.
But how are you thinking about relevance over time? What gets surfaced vs. ignored as things grow?
Great question. As it happens, I was working on exactly this today, and I'm about to run some benchmarks to see whether my approach holds up. Give me 30 minutes and I'll get back to you with some numbers and an explanation of how I tackled it.
Sharp problem to build around.
A lot of AI tools compete on intelligence, while users quietly suffer from continuity loss between sessions. Would try it.
Yep, totally. Most of this space is built by developers for developers. Kumbukum's approach is a bit different: users are at the center, with easy ways to get data in, edit it, and understand it, while the AI tools crunch it.
Context management is honestly the biggest bottleneck right now. I find myself constantly repeating the same 'architectural rules' to different AI agents just to keep them on track. Are you using any specific vector DBs or tools like Mem0 to handle this, or is it still a manual copy-paste game for you? I feel like we’re all just waiting for a universal 'context layer' that actually works across the stack.
ran into this when swapping providers last month - rebuilt pretty much everything except the memory layer, which survived clean. I've started thinking of it as the job contract: what the agent needs regardless of what model's underneath.
This resonates a lot. I've been building production Flutter/Firebase apps and the memory problem hits differently when you're a solo dev — every new session with an AI tool means re-explaining your entire architecture, naming conventions, project context.
I've been handling it by keeping a detailed markdown file with my stack decisions, Firestore structure, and key patterns that I paste in at the start of sessions. Works but feels like a hack.
The Git sync feature you mentioned is interesting — does it index commit messages and PR descriptions too? That would be genuinely useful for keeping AI context aligned with actual codebase changes.
Will check out Kumbukum. Self-hosting support is a big plus for anyone working with sensitive business logic.
What you’re describing is exactly where most setups start breaking — not storage, but retrieval under real usage.
In practice, a lot of systems capture context fine, but when you actually need it, the recall depends heavily on how it was structured and labeled in the first place.
I’ve seen cases where better naming + tighter semantic grouping outperforms heavier RAG layers, just because the system can “recognize” what matters faster.
Curious — are you optimizing more on the storage/retrieval side right now, or starting to think about how information gets shaped at input too?
Thank you for your comment and feedback. Your questions are spot on. I've been doing Digital Asset Management with Razuna - https://razuna.com - for over 20 years. This has taught me some things about distributed networks and teams (hopefully) :)
So with Kumbukum I took the same approach: build a nice UI (again, hopefully) and give users (not just developers) an easy way to get data in and, once it's in, to edit it.
On input, yes: with our browser extension you can add notes, URLs, and now emails (releasing in a few days). To answer your question, the input data is formatted and processed on the way in. Once it's in the database (MongoDB), we use Typesense to build an index and generate embeddings, so everything is already pre-formatted for the AI tools.
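For anyone curious what that MongoDB-plus-Typesense setup might look like, here's a hypothetical sketch of a collection schema using Typesense's built-in auto-embedding (the collection and field names are illustrative, not Kumbukum's actual schema):

```python
# Hypothetical Typesense collection schema for memory items.
# Typesense can generate embeddings server-side from named text
# fields via an `embed` config, enabling hybrid keyword + vector
# search without a separate embedding pipeline.
memories_schema = {
    "name": "memories",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "body", "type": "string"},
        {"name": "source_url", "type": "string", "optional": True},
        {
            "name": "embedding",
            "type": "float[]",
            "embed": {
                "from": ["title", "body"],  # fields to embed
                "model_config": {"model_name": "ts/all-MiniLM-L12-v2"},
            },
        },
    ],
}
# With a running server you'd register it via the Typesense client:
#   client.collections.create(memories_schema)
```

Once a schema like this exists, every document written from MongoDB gets indexed and embedded in one step, which is what makes the data "pre-formatted" for retrieval by AI tools.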
I've been coding with Codex and the Kumbukum MCP server and it flies!
Will most likely start making some videos.
Let me know if this answers your questions. Happy to discuss further.
Nitai, the "re-feeding context" loop is exactly where AI efficiency breaks down, and building Kumbukum as open-source memory infrastructure is a massive step toward fixing that "starting from scratch" problem. By prioritizing searchability and Git sync alongside an API for tool integration, you're shifting context from a temporary chat session to a persistent team asset, ensuring that collective knowledge actually compounds over time.
I’m currently running Tokyo Lore, a project that highlights high-utility, validation-focused tools like yours. Since you’re building infrastructure for persistent context and team memory, entering Kumbukum in the round could be a good way to turn your own validation journey into a case study while your odds are at their peak.
You are spot on. What is Tokyo Lore about?
Glad it resonated 🙂
Tokyo Lore is a small, focused round where we highlight:
→ early-stage tools
→ strong underlying ideas/logic
→ and builders solving real problems
It’s not a typical “launch platform” — more about getting your idea in front of thoughtful builders and seeing how it actually lands.
For something like Kumbukum, the value would be:
→ how people react to the “persistent context” idea
→ what use cases stand out
→ where it clicks vs. where it needs clarity
Tokyolore.com