I want to start with the numbers that made me build the feature this post is about.
10 inquiries that month. 8 of them sales pitches. 2 real prospects.
I had just finished a clean ad report for a client. CVR looked great. CPA looked great. The "winning channel" was clearly winning. I almost shipped the report. Then on the way out the door I opened the inbox and started reading.
By the tenth message I realized every report I'd written for that client all year had been quietly lying. Real lead count was a quarter of what I was reporting. Channel allocation was based on garbage. The CPA I was using to advise on next month's budget was off by 4x.
That's the cost. Not "my inbox is annoying." The cost is bad data shaping real money decisions, and real prospects getting lost in the noise.
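To make the distortion concrete, here is a small arithmetic sketch for that single month. Only the 10 / 8 / 2 split comes from the story above; the spend figure and the function name are made up for illustration. For this one month the multiplier works out to 5x, in the same range as the roughly 4x I saw across the whole year.

```python
def cost_per_acquisition(spend: float, leads: int) -> float:
    """CPA = ad spend divided by the number of leads counted."""
    if leads <= 0:
        raise ValueError("no leads to divide spend by")
    return spend / leads


spend = 1000.0       # hypothetical monthly ad spend in USD (assumption)
inquiries = 10       # everything that came through the form
sales_pitches = 8    # inbound that was really someone selling to us
real_leads = inquiries - sales_pitches

reported_cpa = cost_per_acquisition(spend, inquiries)   # what the report says
true_cpa = cost_per_acquisition(spend, real_leads)      # what a lead really costs
multiplier = true_cpa / reported_cpa

print(f"reported CPA: {reported_cpa:.2f}")
print(f"true CPA:     {true_cpa:.2f}")
print(f"off by:       {multiplier:.1f}x")
```

The multiplier does not depend on the spend you plug in. It is just total inquiries divided by real leads, which is why a form that is mostly pitches quietly breaks every per-lead metric built on top of it.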
If you're an indie founder with a contact form on your site, this happens to you too. You just may not have measured it.
There is a mature SaaS category in Japan called "form sales" -- vendors with multi-million-company databases that send templated outreach to thousands of contact forms a month. Pricing is around USD 300-2,000 a month for the tools, or USD 0.05-0.70 per send for human-contractor services. Equivalent layers exist in the US (Apollo, Clay, Smartlead in adjacent niches; pure form-spam tooling is more nascent but technically solved).
For a B2B SaaS contact form, these senders typically account for the majority of inbound volume. From years of running ops on client forms, my own estimate is that advertising agencies, recruiting firms, and consulting outfits together account for around 70% of inbound pitches.
That's the supply side. The demand side -- the receiver -- has had no good answer.
These are basic, but I am genuinely surprised how often founder sites are missing two or three of them.
If you implement all five, your noise floor drops a lot. It does not go to zero. The reasons:
There is a structural ceiling on "make sending harder." The next move is a different layer.
The framing change: stop trying to block, start sorting at the inbox.
For years the answer was manual triage. Open each response, label it as prospect / sales / unclear, drop the sales rows from the report. At 30-60 seconds per inquiry, a form with 50 inbound a month is 25-50 minutes of unpaid work. Across an agency book, that adds up to hours.
The lazy alternative is to skip triage and ship the lying numbers. I've watched plenty of operators do this. They're often not aware their numbers are wrong by a multiple.
The third option, available now in a way it wasn't two years ago: ship every response through an LLM, label it legitimate / sales / suspicious, and let the operator filter.
LLM cost has dropped enough that an entire form service can absorb this on a free tier. We pay roughly USD 0.0002 per response in our setup. That's basically zero in unit economics for any plan that has a price.
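A quick back-of-the-envelope version of that claim, as a sketch. The per-response figure is the one quoted above; the monthly volumes and the function name are illustrative assumptions, not measurements.

```python
COST_PER_RESPONSE_USD = 0.0002  # figure quoted above for our setup


def monthly_llm_cost(responses_per_month: int) -> float:
    """Classification spend for one form at a given monthly volume."""
    return responses_per_month * COST_PER_RESPONSE_USD


# Hypothetical volumes: a small founder site, a busy form, a very busy form.
for volume in (50, 1_000, 10_000):
    print(f"{volume:>6} responses/month -> USD {monthly_llm_cost(volume):.4f}")
```

At 50 responses a month that is about one cent, and even at 10,000 it is about two dollars.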
The temptation is to delete sales-labeled responses. Just hide them. Less inbox noise, no manual filter step.
Don't.
Even at 99% accuracy, you misjudge one real inquiry per hundred. Reading a sales pitch costs you a minute of attention. Silently dropping a real prospect costs you a lead, a customer, a relationship. The asymmetry is brutal.
So the rule we built into our classifier prompt is "when in doubt, mark legitimate." Gray zone goes to the safe side. The classifier outputs both a label and a 0-100 score, so the operator can see uncertainty and override. Manual overrides are protected from being wiped by future re-classification.
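Here is a minimal sketch of that rule. It is not FORMLOVA's actual code. The names (`Response`, `parse_verdict`, `classify`, `override`), the prompt wording, the `min_score` cutoff of 70, and the reading of the score as "confidence in the label" are all assumptions for illustration. The model call is passed in as a plain function, so no particular LLM SDK is assumed.

```python
import json
from dataclasses import dataclass
from typing import Callable, Tuple

LABELS = {"legitimate", "sales", "suspicious"}

# Hypothetical prompt; the real wording will need tuning on your own data.
PROMPT = (
    "Classify the contact-form message as legitimate, sales, or suspicious. "
    "When in doubt, mark legitimate. "
    'Reply with JSON only: {"label": "...", "score": 0-100}.\n\nMessage:\n'
)


@dataclass
class Response:
    text: str
    label: str = "legitimate"
    score: int = 0
    manual_override: bool = False  # set once a human has decided


def parse_verdict(raw: str, min_score: int = 70) -> Tuple[str, int]:
    """Turn raw model output into (label, score), failing toward 'legitimate'."""
    try:
        data = json.loads(raw)
        label = str(data["label"])
        score = int(data["score"])
    except (ValueError, KeyError, TypeError):
        return "legitimate", 0  # unparseable output never hides a message
    if label not in LABELS or not 0 <= score <= 100:
        return "legitimate", 0
    if label != "legitimate" and score < min_score:
        return "legitimate", score  # gray zone goes to the safe side
    return label, score


def classify(resp: Response, call_model: Callable[[str], str]) -> Response:
    """(Re)classify a response unless a human has already set its label."""
    if resp.manual_override:
        return resp
    resp.label, resp.score = parse_verdict(call_model(PROMPT + resp.text))
    return resp


def override(resp: Response, label: str) -> Response:
    """Record a human decision and protect it from future re-classification."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label}")
    resp.label = label
    resp.manual_override = True
    return resp


# Usage with a stand-in model that is only 55% sure the message is sales.
def fake_model(prompt: str) -> str:
    return '{"label": "sales", "score": 55}'


resp = classify(Response("Hi, quick question about pricing"), fake_model)
# resp.label is "legitimate" here: 55 is below min_score, so doubt goes safe.

override(resp, "sales")     # the operator reads it and disagrees
classify(resp, fake_model)  # a later re-run leaves the human label alone
```

Nothing in this sketch deletes or hides a response. It only attaches a label and a score, and every failure mode (bad JSON, unknown label, low confidence) lands on legitimate.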
The model proposes. The human decides. Always.
I checked. Google Forms, Typeform, formrun, Tally, SurveyMonkey, Microsoft Forms -- all of them stop at CAPTCHA-class protection. None of them classify response content. You can wire it up yourself in Zapier with an OpenAI call, but you own the cost, the prompt tuning, the failure modes, and the manual-override UI.
We built it into FORMLOVA as a default, free across every plan including the free tier. As of writing, it's the only mainstream form product where this is shipped. Not because the technology is hard -- it isn't anymore -- but because few founders treat the form as the actual entry point of their pipeline.
I'm biased, obviously. But if you're an indie founder, you're closer to the data than anyone. You see when a real lead leaks. Spending 25 minutes a month doing manual triage, or shipping bad-data reports, are both bad uses of your time. Either build the classifier yourself (the patterns are simple), or take the shortcut.
This is one piece of a multi-platform English-language series on contact-form spam defense.
Companion piece on the founder side:
Most founders treat contact forms like inboxes, but they’re really data sources for growth decisions.
Once spam pollutes that input, it doesn’t just waste time, it distorts everything downstream.