Last week, I open-sourced a lightweight Code MCP server that uses AST (Abstract Syntax Tree) parsing to give coding agents semantic understanding of your codebase. It went viral on X with 90K+ views.
Here are the tweets that started it all:
https://x.com/RoundtableSpace/status/2031366453153157139
https://x.com/GithubProjects/status/2031233621382853030
The agent doesn't need your whole file. It needs to know what functions exist, what classes are defined, and how they relate to each other.
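To make that concrete, here is a minimal sketch of the idea using Python's standard ast module. This is an illustration only, not the project's implementation (cocoindex-code uses tree-sitter grammars across many languages), and the sample source string is invented:

```python
import ast
import textwrap

# Invented sample source; in practice this would be a real file on disk.
source = textwrap.dedent("""
    class UserStore:
        def get(self, user_id): ...
        def save(self, user): ...

    def connect(url, timeout=5): ...
""")

# Walk the syntax tree and keep only the structural facts an agent needs:
# which classes and functions exist, their signatures, and where they live.
tree = ast.parse(source)
outline = []
for node in ast.walk(tree):
    if isinstance(node, ast.ClassDef):
        outline.append(f"class {node.name} (line {node.lineno})")
    elif isinstance(node, ast.FunctionDef):
        args = ", ".join(a.arg for a in node.args.args)
        outline.append(f"def {node.name}({args}) (line {node.lineno})")

print("\n".join(outline))
```

The outline is a few short lines instead of the whole file, which is where the token savings come from.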
I built cocoindex-code, a super lightweight embedded MCP server that parses your codebase with tree-sitter and serves semantic search directly to your coding agent.
The result? 70% token savings and noticeably faster coding agent responses.
For Claude Code:
pipx install cocoindex-code
claude mcp add cocoindex-code -- cocoindex-code
For Codex:
codex mcp add cocoindex-code -- cocoindex-code
That's it. No database, no API keys, no config files. It just works.
When your agent needs to find code, it calls the search MCP tool with a natural language query and gets back exactly the relevant code chunks with file paths and line numbers.
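For illustration, here is roughly what that looks like on the wire as an MCP tools/call JSON-RPC request. The method name follows the MCP spec, but the tool name "search" and the argument shape here are assumptions, not the project's documented schema:

```python
import json

# Hypothetical sketch of the JSON-RPC message an MCP client sends to invoke
# a tool. "search" and the "query" argument are assumed names for this post,
# not pulled from the cocoindex-code source.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search",
        "arguments": {"query": "where do we validate JWT tokens?"},
    },
}
print(json.dumps(request, indent=2))
```

The server's response would carry the matching code chunks plus their file paths and line numbers, so the agent can jump straight to the relevant spot.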
I think a few things resonated with people: the zero-config setup, the token savings, and the breadth of language support.
Python, JavaScript/TypeScript, Rust, Go, Java, C/C++, C#, Ruby, Kotlin, Swift, SQL, Shell, and more. It uses tree-sitter grammars so adding new languages is straightforward.
We're actively working on:
Support for stronger embedding models (e.g. nomic-ai/CodeRankEmbed with a GPU).

The repo is at github.com/cocoindex-io/cocoindex-code - 670+ stars and growing fast.
Built with CocoIndex, our open-source Rust-based data indexing framework.
Would love to hear your experience if you try it out. Drop a comment or open an issue on GitHub!
Semantic retrieval feels like a real unlock.
The part I keep thinking about is that once the agent has enough context to act, the bottleneck stops being retrieval and becomes the execution boundary. Better search plus one visible plan before local mutation feels stronger than either alone.
Really nice job keeping the setup this short.