The problem: Startups are burning cash on the wrong AI models because proper evaluation is too complex and time-consuming.
The solution: NovaEval - an open source framework that makes enterprise-grade AI evaluation accessible to everyone.
Why this matters for indie hackers:
• Stop overspending on AI APIs (we've seen 40% cost reductions)
• Make data-driven model decisions, not gut-feeling ones
• Build evaluation into your product from day one
• No vendor lock-in - evaluate any model, anywhere
The ask: We need builders who want to shape this space. Whether you're a Python dev, an AI tinkerer, or a docs wizard - there's a place for you.
What we've built so far:
• Multi-provider support (OpenAI, Anthropic, AWS, custom)
• Production-ready deployment options
• CLI and Python API (quick sketch after this list)
• Comprehensive scoring framework
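To make the Python API point concrete, here's a minimal sketch of an evaluation run. Heads-up: the class names below (Evaluator, MMLUDataset, OpenAIModel, AccuracyScorer) are patterned on the repo's quick-start from memory and may not match the current API exactly, so treat this as a sketch and check the docs before copying it.

```python
# Sketch of a NovaEval-style evaluation run.
# ASSUMPTION: these imports and class names are illustrative and may
# differ from the current novaeval API - see the repo's quick-start.
from novaeval import Evaluator
from novaeval.datasets import MMLUDataset
from novaeval.models import OpenAIModel
from novaeval.scorers import AccuracyScorer

# Pick a benchmark slice, a candidate model, and a scorer.
dataset = MMLUDataset(subset="elementary_mathematics", num_samples=100)
model = OpenAIModel(model_name="gpt-4o-mini", temperature=0.0)
scorer = AccuracyScorer(extract_answer=True)

# Run the evaluation and inspect aggregate scores.
evaluator = Evaluator(dataset=dataset, models=[model], scorers=[scorer])
results = evaluator.run()
print(results)
```

The point of the multi-provider design is that you can swap OpenAIModel for an Anthropic or custom provider while keeping the dataset and scorer fixed, so the numbers stay comparable across models.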
Get involved:
• Star us: https://github.com/Noveum/NovaEval
• Try it: pip install novaeval
• Contribute: check out issues tagged "good first issue"
Question for the community: How are you currently evaluating AI models in your projects? What's working? What's not?
#IndieHacker #AI #OpenSource #Startup #BuildInPublic #MLOps
Yash
Would you like to become a contributor?
🔥 This is exactly what the ecosystem needs right now. Evaluation often becomes an afterthought, and teams end up paying for hype instead of results. Love how NovaEval puts power back in the hands of builders — open source, multi-provider, and cost-conscious.
I'm especially impressed by the CLI + Python API support; that's a win for developer workflows. Curious: do you plan to add benchmarks or visual dashboards soon?
Big props to the team! Just starred the repo. Will be trying it out with a few LLM workflows we’re exploring.