I built a small LLM app, then added evals and guardrails

by mindmnml

Cold Lead Decoder takes a company domain and returns a structured lead card:

- company summary
- positioning signals
- likely pain points
- a grounded cold-email opener
- follow-up angles

The first version was easy to demo.

The harder part was making it behave when inputs are messy:

- thin websites
- invalid model output
- prompt-injection attempts
- generic openers
- unreachable domains
- schema failures

So I added:

- Zod as the output contract
- repair retries when schema validation fails
- degraded states instead of raw 500s
- SSRF protection on fetched URLs
- banned-phrase checks for generic openers
- a small eval harness with fixed fixtures
- a live /eval page for operational metrics
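To make the contract, repair-retry, and degraded-state items concrete, here is a minimal sketch of how they can fit together. It is not the code from the repo. The `LeadCard` fields, `validateLeadCard`, `decodeWithRepair`, and the repair prompt wording are all hypothetical, and a hand-written check stands in for the Zod schema so the example has no dependencies. In the real app, a Zod `safeParse` call would play the validator's role.

```typescript
// Hypothetical card shape, much smaller than a real lead card.
type LeadCard = { summary: string; opener: string };

type Validation =
  | { ok: true; card: LeadCard }
  | { ok: false; error: string };

// Stand-in for a Zod safeParse call: parse JSON, then check required fields.
function validateLeadCard(raw: string): Validation {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "output is not valid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { ok: false, error: "output is not a JSON object" };
  }
  const rec = data as Record<string, unknown>;
  for (const key of ["summary", "opener"]) {
    const value = rec[key];
    if (typeof value !== "string" || value.trim() === "") {
      return { ok: false, error: `field "${key}" must be a non-empty string` };
    }
  }
  return {
    ok: true,
    card: { summary: rec.summary as string, opener: rec.opener as string },
  };
}

// The model call is injected so the loop can be tested with a fake.
type ModelCall = (prompt: string) => Promise<string>;

type DecodeResult =
  | { status: "ok"; card: LeadCard; attempts: number }
  | { status: "degraded"; reason: string; attempts: number };

// Retry with the validation error fed back to the model. If every attempt
// fails, return a degraded result instead of throwing (no raw 500).
async function decodeWithRepair(
  callModel: ModelCall,
  prompt: string,
  maxAttempts = 3,
): Promise<DecodeResult> {
  let currentPrompt = prompt;
  let lastError = "no attempts made";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callModel(currentPrompt);
    const result = validateLeadCard(raw);
    if (result.ok) {
      return { status: "ok", card: result.card, attempts: attempt };
    }
    lastError = result.error;
    currentPrompt =
      `${prompt}\n\nYour previous output was rejected: ${lastError}. ` +
      "Return only valid JSON.";
  }
  return { status: "degraded", reason: lastError, attempts: maxAttempts };
}
```

The caller can then render a partial card or a friendly error for the `degraded` status, and the attempt count is an easy number to log for an eval page.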

This is not a scaled product, and there is no users or revenue story here.

The point was to practice turning one LLM feature from “cool demo” into something more testable and inspectable.
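As a rough illustration of what "testable" can mean here, this sketch runs a banned-phrase check and a simple grounding check over fixed fixtures and reports a pass rate. Everything in it is an assumption for illustration: the phrase list, the `Fixture` shape, the `mustMention` grounding rule, and the function names are not taken from the repo. The fixtures hold recorded openers rather than calling the pipeline.

```typescript
// Hypothetical banned list; the post does not say which phrases are banned.
const BANNED_PHRASES = [
  "i hope this email finds you well",
  "i came across your company",
  "quick question",
];

// Case-insensitive match with whitespace collapsed, so odd spacing does not
// let a generic opener slip through.
function findBannedPhrases(opener: string): string[] {
  const normalized = opener.toLowerCase().replace(/\s+/g, " ");
  return BANNED_PHRASES.filter((phrase) => normalized.includes(phrase));
}

// A fixture pairs a recorded opener with one term it must mention to count
// as grounded in the company's site (a deliberately crude rule).
type Fixture = { name: string; opener: string; mustMention: string };

type EvalRow = { name: string; passed: boolean; failures: string[] };

function runEval(fixtures: Fixture[]): { rows: EvalRow[]; passRate: number } {
  const rows: EvalRow[] = fixtures.map((fixture) => {
    const failures: string[] = [];
    for (const phrase of findBannedPhrases(fixture.opener)) {
      failures.push(`banned phrase: "${phrase}"`);
    }
    const mentions = fixture.opener
      .toLowerCase()
      .includes(fixture.mustMention.toLowerCase());
    if (!mentions) {
      failures.push(`missing grounding term: "${fixture.mustMention}"`);
    }
    return { name: fixture.name, passed: failures.length === 0, failures };
  });
  const passed = rows.filter((row) => row.passed).length;
  const passRate = rows.length === 0 ? 0 : passed / rows.length;
  return { rows, passRate };
}
```

Because the fixtures are fixed, the pass rate only moves when the prompt, model, or checks change, which makes regressions easier to spot.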

Live: https://coldl.vercel.app
Eval page: https://coldl.vercel.app/eval
Code: https://github.com/nikabokuchava/cold-lead-decoder

Curious how other people are testing LLM features before they trust them in real workflows.

on June 7, 2026
