Built mnemo as a solo dev: search 89,037 messages of AI coding chat history in 0.8s

I had 89,037 messages scattered across 12 AI coding tools. These weren't just chat logs - they were my decision-making journals. Every architecture choice I made at 2am, every debugging session, every trade-off I accepted and why.

The mess looked like this:

Claude Code: JSONL files in ~/.claude/projects/
OpenCode: JSON in ~/.local/share/
Cursor: SQLite in globalStorage
Gemini CLI: JSON in ~/.gemini/sessions/
Plus Crush, Amp, Codex, Kiro, Cline, Roo Code, and more

Every tool had its own format.
Want to find that auth flow discussion from last Tuesday? Impossible.

So I built mnemo

mnemo indexes everything into one local SQLite database with full-text search. No cloud, no accounts. Everything stays on your machine.

brew install Pilan-AI/tap/mnemo
mnemo index
mnemo search "authentication flow"

Technical Architecture

1. Format Adapters (12 indexers)

Built dedicated parsers for each tool's native storage:

JSONL parsers: Claude Code, Codex, Antigravity
JSON parsers: OpenCode, Gemini CLI, Amp, Kiro, Cline, Roo Code, Kilo Code
SQLite readers: Cursor, Crush

Each adapter:

Auto-detects tool paths across macOS/Linux/Windows
Handles format variations (OpenCode changed schemas twice)
Deduplicates by content hash
Normalizes timestamps to UTC
Writes each session in an atomic transaction (full session or rollback)
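
To make the deduplication and timestamp steps concrete, here is a minimal sketch of what an adapter might do after parsing. This is not mnemo's actual code: the Message type, the function names, and the choice to hash role plus content with SHA-256 are all assumptions for illustration, and it assumes the source tool writes RFC 3339 timestamps.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

// Message is a hypothetical normalized record; mnemo's real schema may differ.
type Message struct {
	Tool      string
	Role      string // "user" or "assistant"
	Content   string
	Timestamp time.Time
}

// contentHash fingerprints role + content, so the same message imported
// twice produces the same hash. The NUL byte separates the two fields.
func contentHash(m Message) string {
	sum := sha256.Sum256([]byte(m.Role + "\x00" + m.Content))
	return hex.EncodeToString(sum[:])
}

// normalizeUTC parses an RFC 3339 timestamp and converts it to UTC.
func normalizeUTC(raw string) (time.Time, error) {
	t, err := time.Parse(time.RFC3339, raw)
	if err != nil {
		return time.Time{}, fmt.Errorf("parse timestamp %q: %w", raw, err)
	}
	return t.UTC(), nil
}

// dedupe keeps the first occurrence of each content hash, preserving order.
func dedupe(msgs []Message) []Message {
	seen := make(map[string]bool, len(msgs))
	out := make([]Message, 0, len(msgs))
	for _, m := range msgs {
		h := contentHash(m)
		if seen[h] {
			continue
		}
		seen[h] = true
		out = append(out, m)
	}
	return out
}

func main() {
	ts, err := normalizeUTC("2025-01-14T02:15:00+05:30")
	if err != nil {
		panic(err)
	}
	msgs := []Message{
		{Tool: "claude-code", Role: "user", Content: "Why JWT over sessions?", Timestamp: ts},
		{Tool: "claude-code", Role: "user", Content: "Why JWT over sessions?", Timestamp: ts},
		{Tool: "cursor", Role: "assistant", Content: "JWT keeps the API stateless.", Timestamp: ts},
	}
	for _, m := range dedupe(msgs) {
		fmt.Println(m.Timestamp.Format(time.RFC3339), m.Role, m.Content)
	}
}
```

Hashing only role and content means a message re-exported by the same tool collapses to one row. Whether the tool name should also be part of the hash is a design choice this sketch leaves open.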

2. Database Design

Everything lives in ~/.mnemo/mnemo.db:

FTS5 virtual table with BM25 ranking - no external search engine
WAL mode - readers are never blocked by the writer
Read-only mode for hooks - inject operations can't be blocked by background indexing
Single-writer constraint (db.SetMaxOpenConns(1)) - prevents "database is locked" errors

The database is portable. Copy it to another machine, all sessions come with it.
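
The settings above could be wired up roughly like this with Go's standard database/sql package. Treat it as a sketch under assumptions, not mnemo's real schema: the table name messages_fts, its columns, and the helper names are hypothetical. The driver is passed in by name, so the caller has to register one first (for modernc.org/sqlite that is a blank import of the package and the driver name "sqlite").

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// schemaStatements is an illustrative schema; the real one is likely richer.
var schemaStatements = []string{
	`PRAGMA journal_mode=WAL`,
	`CREATE VIRTUAL TABLE IF NOT EXISTS messages_fts
	   USING fts5(content, session_id UNINDEXED, role UNINDEXED)`,
}

// searchSQL ranks matches with FTS5's built-in bm25().
// bm25() returns lower (more negative) values for better matches.
const searchSQL = `SELECT session_id, bm25(messages_fts) AS score
FROM messages_fts WHERE messages_fts MATCH ? ORDER BY score LIMIT ?`

// Message is a hypothetical row shape for this sketch.
type Message struct{ SessionID, Role, Content string }

// openWriter opens the database, limits the pool to a single connection
// so writes are serialized, and applies the schema.
func openWriter(driverName, dsn string) (*sql.DB, error) {
	db, err := sql.Open(driverName, dsn)
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(1)
	for _, stmt := range schemaStatements {
		if _, err := db.Exec(stmt); err != nil {
			db.Close()
			return nil, fmt.Errorf("apply schema: %w", err)
		}
	}
	return db, nil
}

// indexSession writes a whole session in one transaction: either every
// message is committed or none is.
func indexSession(ctx context.Context, db *sql.DB, msgs []Message) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback() // returns an ignorable error after a successful Commit
	for _, m := range msgs {
		_, err := tx.ExecContext(ctx,
			`INSERT INTO messages_fts (content, session_id, role) VALUES (?, ?, ?)`,
			m.Content, m.SessionID, m.Role)
		if err != nil {
			return fmt.Errorf("insert message: %w", err)
		}
	}
	return tx.Commit()
}

func main() {
	// Print the statements rather than opening a database, so the sketch
	// runs without a SQLite driver installed.
	for _, stmt := range schemaStatements {
		fmt.Println(stmt + ";")
	}
	fmt.Println(searchSQL)
}
```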

3. Search Ranking Algorithm

FinalScore = (BM25 + densityBonus + userBonus) × temporalDecay

BM25 - SQLite FTS5 built-in: term frequency + document length normalization
Temporal decay - e^(-0.03 * daysOld), so last week ≈ 81%, last month ≈ 41%, 3 months ≈ 7%
Match density - Sessions with multiple hits get boosted (capped at 20 matches)
User preference - Your prompts weighted 2x over AI responses

Results grouped by session, not scattered messages. When you search "liquid design," you get the 5 most relevant conversations - not 50 lines from the same thread.
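
Here is one way the formula could look in Go. The decay rate and the 20-match cap come from the description above; everything else is an assumption. In particular, perHitBonus is a made-up weight, the 2x preference is modeled by giving user hits one extra per-hit bonus, and BM25 is taken as a positive relevance value (FTS5's bm25() returns negative numbers, so it would be negated first).

```go
package main

import (
	"fmt"
	"math"
)

const (
	decayRate      = 0.03 // per day, from the formula above
	maxCountedHits = 20   // density cap, from the description above
	perHitBonus    = 0.1  // hypothetical weight per matching message
)

// SessionHits is a hypothetical summary of one session's matches.
type SessionHits struct {
	BM25          float64 // relevance, higher is better
	UserHits      int     // matches in the user's own prompts
	AssistantHits int     // matches in AI responses
	DaysOld       float64
}

// temporalDecay returns e^(-0.03 * daysOld).
func temporalDecay(daysOld float64) float64 {
	return math.Exp(-decayRate * daysOld)
}

// densityBonus rewards sessions with many hits, capped at maxCountedHits.
func densityBonus(totalHits int) float64 {
	if totalHits > maxCountedHits {
		totalHits = maxCountedHits
	}
	return perHitBonus * float64(totalHits)
}

// userBonus adds one extra per-hit bonus for user prompts, so a user hit
// ends up contributing twice what an assistant hit does.
func userBonus(userHits int) float64 {
	if userHits > maxCountedHits {
		userHits = maxCountedHits
	}
	return perHitBonus * float64(userHits)
}

// finalScore implements (BM25 + densityBonus + userBonus) * temporalDecay.
func finalScore(s SessionHits) float64 {
	total := s.UserHits + s.AssistantHits
	return (s.BM25 + densityBonus(total) + userBonus(s.UserHits)) * temporalDecay(s.DaysOld)
}

func main() {
	for _, days := range []float64{0, 7, 30, 90} {
		s := SessionHits{BM25: 5, UserHits: 2, AssistantHits: 3, DaysOld: days}
		fmt.Printf("%3.0f days old: decay=%.3f score=%.3f\n",
			days, temporalDecay(days), finalScore(s))
	}
}
```

Because decay multiplies the whole sum, an old session needs a much stronger text match to outrank a recent one.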

4. Zero Runtime Dependencies

Pure-Go SQLite (modernc.org/sqlite)
→ Cross-compiles everywhere, no C toolchain, no CGO

Single static binary
→ Works on first run, no config files

Local-only processing
→ Data never leaves your machine, no telemetry

5. MCP Server + Plugins

Claude Code plugin: Auto-context injection via UserPromptSubmit hook

Keywords extracted from your prompt
FTS5 search runs (~0.8s, no network)
Top 3 sessions formatted as context (~200-500 tokens)
Injected before Claude processes your request

OpenCode: Skills + hooks integration
Cursor/Claude Desktop: MCP server exposes mnemo_search, mnemo_context, mnemo_recent

Token overhead: ~0.1-0.3% per session.
The savings? Your AI remembers what you discussed last week.
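
The hook flow above (extract keywords, search, format the top sessions) could be sketched like this. None of it is mnemo's real implementation: the stop-word list, the SessionSummary type, the output format, and the function names are placeholders, and the actual search call is left out. The query string uses FTS5's quoted-term and OR syntax.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stopWords is a tiny illustrative list; a real one would be longer.
var stopWords = map[string]bool{
	"how": true, "did": true, "the": true, "and": true, "for": true,
	"that": true, "this": true, "with": true, "what": true, "why": true,
}

// extractKeywords lowercases the prompt, splits on anything that is not a
// letter or digit, and drops stop words, duplicates, and words under 3 bytes.
func extractKeywords(prompt string, max int) []string {
	words := strings.FieldsFunc(strings.ToLower(prompt), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	seen := map[string]bool{}
	var out []string
	for _, w := range words {
		if len(out) == max {
			break
		}
		if len(w) < 3 || stopWords[w] || seen[w] {
			continue
		}
		seen[w] = true
		out = append(out, w)
	}
	return out
}

// buildFTSQuery joins keywords into an FTS5 MATCH expression. Keywords only
// contain letters and digits here, so quoting them needs no escaping.
func buildFTSQuery(keywords []string) string {
	quoted := make([]string, len(keywords))
	for i, k := range keywords {
		quoted[i] = `"` + k + `"`
	}
	return strings.Join(quoted, " OR ")
}

// SessionSummary is a hypothetical search result.
type SessionSummary struct{ Tool, Title, Snippet string }

// formatContext renders at most limit sessions as a short context block.
func formatContext(sessions []SessionSummary, limit int) string {
	var b strings.Builder
	b.WriteString("Relevant past sessions:\n")
	for i, s := range sessions {
		if i == limit {
			break
		}
		fmt.Fprintf(&b, "%d. [%s] %s: %s\n", i+1, s.Tool, s.Title, s.Snippet)
	}
	return b.String()
}

func main() {
	kw := extractKeywords("How did we fix the authentication flow?", 5)
	fmt.Println(buildFTSQuery(kw))
	fmt.Print(formatContext([]SessionSummary{
		{Tool: "claude-code", Title: "Auth refactor", Snippet: "Moved token refresh into middleware."},
	}, 3))
}
```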

Why Go?

Performance: Indexing 89K messages takes ~3 seconds
Portability: Single binary, runs on M-series Macs, Intel, Linux arm64/amd64, Windows
Simplicity: No npm install, no Python venv, no Docker

Results

Now I can:

Search 89K messages in ~0.8s
Find "why I chose JWT vs sessions" instantly
Load project context into new AI sessions
See what I worked on last week with mnemo recent --days=7

My AI coding sessions are now my searchable decision journal.

What's Next

mnemo is the memory layer. Later this month, I'm launching Pilan - a native macOS app that adds knowledge graphs, pattern recognition, and session intelligence on top of mnemo.

If mnemo is the memory, Pilan is the brain.

Questions for You

Which AI coding tools do you use? If yours isn't supported yet, I'd love to add it.
Cross-tool session detection: Should mnemo detect when you discussed the same topic across different tools and link them?
Export formats: Would you want mnemo to export sessions as Markdown for documentation?
Performance: Anyone with 100K+ messages? I'd love to benchmark against your dataset.
Semantic search: Do you think I should add vector-based semantic search to mnemo?

Contribute

Thanks to everyone who checked out mnemo's repo on GitHub. I'm honored to have 12 stars already!

The repo needs help with:

Windows path detection refinements
Additional tool adapters (Aider, Continue, etc.)
Performance optimizations for 500K+ message datasets
Documentation improvements

GitHub: github.com/Pilan-AI/mnemo

எண்ணென்ப ஏனை எழுத்தென்ப இவ்விரண்டும் கண்ணென்ப வாழும் உயிர்க்கு.
"Numbers and letters - these two are the eyes of all who live."
— Thiruvalluvar, Tirukkural 392