Last week we shipped sales-email auto-classification on FORMLOVA, the chat-first form service we've been building.
Here is the part that is probably interesting to this crowd: we made it free on every plan, including the free tier. No upgrade prompt. No quota. The LLM cost sits entirely on us.
I want to walk through how we arrived at that call, because it was not a "we're generous" decision. It was a cost-math decision that flipped a common SaaS instinct.
When a form response arrives, we classify its content with an LLM into one of three labels: legitimate, sales, or suspicious. The label is shown in the dashboard and can be used to exclude sales emails from analytics, filter workflows, or auto-reply only to certain buckets. The operator can correct a label by hand, and that correction is never overwritten by future auto-runs.
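The labeling behavior described above can be sketched as a small state machine. These names are illustrative, not FORMLOVA's actual schema; the point is only the rule that a manual correction pins the label against future auto-runs.

```typescript
// Illustrative sketch of the label model (not FORMLOVA's real schema).
type Label = "legitimate" | "sales" | "suspicious";

interface ResponseRecord {
  id: string;
  label: Label | null;        // null until classified (or if the call failed)
  manuallyCorrected: boolean; // set once the operator overrides the label
}

// Apply an auto-classification result. A manual correction always wins:
// future auto-runs never overwrite it.
function applyAutoLabel(record: ResponseRecord, autoLabel: Label): ResponseRecord {
  if (record.manuallyCorrected) return record;
  return { ...record, label: autoLabel };
}

// Operator override: store the label and pin it against future auto-runs.
function applyManualLabel(record: ResponseRecord, label: Label): ResponseRecord {
  return { ...record, label, manuallyCorrected: true };
}
```

The `manuallyCorrected` flag is the whole trick: one boolean decides whether the classifier is allowed to write.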
As far as we can find, no form service does this today. CAPTCHA stops bots, but no one tries to classify the content of the human-written messages that get through.
When you add an AI feature, the default pricing instinct is to gate it behind a paid plan.
This is sensible. LLM calls cost real money. Pricing controls cost. Done.
We almost did the same. We had a plan: "AI spam classification is a Standard-and-above feature. Free users get one manual filter column." Two days of pricing sketches later we scrapped it. The math made a different decision for us.
Our classifier uses Claude Haiku 4.5 via OpenRouter. It runs asynchronously after form submission.
Per classification: roughly $0.0002, which is about 0.03 Japanese yen.
So a free-tier user hitting our monthly response cap (100 responses) costs us $0.02 per user per month in classification.
A Standard-plan user with a 1,000-response monthly cap costs us $0.20.
A Premium user with 10,000 responses costs $2.
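The arithmetic above, made explicit. The per-call figure is the rough number from this post, not a quoted price:

```typescript
// Monthly classification cost per user, assuming every response up to the
// plan's cap triggers exactly one LLM call.
const PER_CALL_USD = 0.0002; // rough Claude-Haiku-via-OpenRouter figure

function monthlyCostUSD(responseCap: number, perCallUSD: number = PER_CALL_USD): number {
  return responseCap * perCallUSD;
}
```

`monthlyCostUSD(100)` is about $0.02, `monthlyCostUSD(1000)` about $0.20, `monthlyCostUSD(10000)` about $2 -- the plan caps are what make the worst case computable at all.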
Those numbers do not bend the business. They are not even a rounding error on infrastructure spend.
Here's the hidden cost of gating a $0.02-per-user feature: plan checks in the backend, a gated state in the dashboard UI, quota tracking and its edge cases, and the support load of explaining the gate to free users.
All of this to protect $0.02 of LLM cost per user per month.
Building the gating infrastructure was going to cost us more engineering time, and cost users more confusion, than just absorbing the bill.
So we absorbed the bill.
The cost math made the decision easy. The positioning made the decision obvious.
For anyone running paid ads -- our target users -- filtering sales emails out of your inquiry pipeline isn't a premium feature. It is the baseline for calculating CVR correctly. If your form is delivering 10 responses and 8 are sales pitches, your ad CVR numbers are overstated 5x. That's not a luxury problem. That's a "did the campaign work?" problem.
If we charged for this, we'd be saying: "We'll tell you the truth about your ad performance, but only if you pay extra for the truth." That framing was indefensible. So we put it on every plan.
In other words: the feature's job is to make the rest of the product's numbers trustworthy. Gating it would undermine the trust the rest of the product is trying to build.
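Worked through with the post's numbers (10 responses, 8 sales pitches), the distortion is just the ratio of total to legitimate responses:

```typescript
// Measured CVR counts every form response as a conversion; true CVR counts
// only legitimate ones. Both share the same click denominator, so the
// inflation factor is simply total / legitimate responses.
function cvrInflation(totalResponses: number, salesResponses: number): number {
  const legitimate = totalResponses - salesResponses;
  return totalResponses / legitimate; // 10 responses, 8 spam -> 10 / 2 = 5x
}
```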
We now have a positioning line we couldn't use before: "FORMLOVA ships sales-email detection free on every plan -- the only form service that does." That is a short, specific, defensible line. It is not competing on feature count or polish. It is competing on what is included by default when you sign up.
Every "default included" is a line you can draw against competitors. Free tiers are not only acquisition tools -- they are positioning statements.
I should be honest about the shape of this. Not every AI feature can be free on every plan. Three things made this one qualify: the unit of work is bounded (one short classification per response), usage has a natural cap (plan response limits), and it runs asynchronously, so there's no latency pressure on the user.
If we were generating form copy from scratch (long prompts, long outputs, no natural cap), the free-on-every-plan approach would not survive the math.
Even with the cost math in our favor, we didn't go in without cost controls:
- `max_tokens: 256` on the output. The JSON is small; we don't let the model run long.
- `temperature: 0`. Deterministic, which keeps prompt caching efficient.
- On failure, we fall back to `null` labels, so the form submission itself never breaks.

That last one matters. And paid-event forms (Stripe Connect) skip classification entirely -- people very rarely spend money just to send you a sales pitch. That's a free efficiency gain.
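A sketch of what those controls look like in the call itself. The endpoint and OpenAI-compatible payload shape are OpenRouter's; the model slug, prompt, and function names are assumptions, not FORMLOVA's actual code.

```typescript
type Label = "legitimate" | "sales" | "suspicious";

// Parse the model's JSON output defensively: anything unexpected becomes
// null rather than an exception, so the submission path never breaks.
function parseLabel(raw: string): Label | null {
  try {
    const label = JSON.parse(raw)?.label;
    if (label === "legitimate" || label === "sales" || label === "suspicious") {
      return label;
    }
    return null;
  } catch {
    return null;
  }
}

async function classify(message: string, apiKey: string): Promise<Label | null> {
  try {
    const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "anthropic/claude-haiku-4.5", // assumed OpenRouter slug
        max_tokens: 256, // the JSON is small; don't let the model run long
        temperature: 0,  // deterministic; keeps prompt caching efficient
        messages: [
          {
            role: "user",
            content:
              `Classify this form submission as "legitimate", "sales", or ` +
              `"suspicious". Reply with JSON: {"label": "..."}.\n\n${message}`,
          },
        ],
      }),
    });
    const data = await res.json();
    return parseLabel(data?.choices?.[0]?.message?.content ?? "");
  } catch {
    return null; // classification failure never blocks the submission
  }
}
```

The outer try/catch plus the defensive parser is the "null labels" control: every failure mode degrades to "no label yet," never to a broken form.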
Worst case: a wave of spam bots starts hammering our endpoints, triggering a classification call on every submission. Costs climb, we burn a couple hundred dollars, and we add the quota we didn't build initially. Cost-bounded, reversible.
Best case: we get to say "we give you trustworthy pipeline numbers by default, on every plan," and that line does a lot of quiet positioning work.
I think best case and worst case differ by about two orders of magnitude. The expected value is strongly positive.
When you're adding an AI feature, run the math before you build the gating. Calculate cost per user per month, not cost per call. Compare it to the engineering time and user friction the gating will cost. If it's a small async feature where the unit of work is bounded, you might be looking at a feature that should be on by default.
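The rule of thumb above as a back-of-envelope check. Every input is yours to estimate (the numbers below are hypothetical); the function is just the comparison the post describes:

```typescript
// Compare total monthly LLM spend against the amortized cost of building
// and maintaining the gate. Deliberately crude: if the LLM bill is smaller
// than the gate, the gate isn't paying for itself.
function gatingIsWorthIt(opts: {
  perCallUSD: number;            // cost of one LLM call
  cappedCallsPerUser: number;    // natural monthly cap on calls per user
  users: number;                 // users on the plan you'd otherwise gate
  gatingCostUSDPerMonth: number; // eng time + support burden, amortized
}): boolean {
  const llmSpend = opts.perCallUSD * opts.cappedCallsPerUser * opts.users;
  return llmSpend > opts.gatingCostUSDPerMonth;
}
```

At this post's numbers, even 1,000 free-tier users at their cap is about $20/month of LLM spend -- far below any realistic gating cost.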
Features gated behind Pro plans are a position. Features free on every plan are also a position. Pick the one that fits what you're trying to be.
For us, AI spam classification being free on every plan says: we want your CVR numbers to be real. That is the position we wanted.
Related posts:
- after() + OpenRouter): coming soon on DEV

Free to start at formlova.com. Connect via MCP from Claude, ChatGPT, or any MCP client.
Love this level of transparency! The math ($0.0002 per call) really puts things into perspective. For AI-RPA projects like mine, managing token burn for autonomous agents is a huge concern for users. Making features like this 'default free' is a killer positioning move. Are you using any specific caching strategy to keep those Claude Haiku costs so low?
Thanks — glad the numbers resonated. AI-RPA is exactly the kind of use case where per-call cost visibility matters, because the blast radius of a silent token burn is huge once agents run unattended.
Honest answer on Haiku costs: most of the saving isn't from caching, it's from scope. We only classify one thing (is this submission sales spam or legit?), so prompts are short, outputs are a single label + score, no conversation history, no tool use, no retries. Single-shot, stateless, tiny context window — that alone gets you most of the way to $0.0002.
On top of that, we're not using Anthropic prompt caching yet -- the prompts are short enough that the break-even point isn't there. If we ever move to longer system prompts with few-shot examples, that'll change.
Curious how you're handling token budgeting on the RPA side — hard caps per agent, or more of a soft budget with alerts?