I Stopped Measuring Our Form Product by Submits. I Started Measuring by Recall.

Most form products measure themselves by submissions.

That is the demo metric.

It is on the dashboard, it is in the launch post, it is in the year-end summary, it is in the case studies. "X forms created, Y submissions collected."

For the first six months of FORMLOVA I was looking at submissions too.

It made sense. The product is a form product. Forms collect submissions. More submissions, more value, right?

I was wrong. The reason is worth writing down for anyone building in this category, because the same trap sits under most form-adjacent ideas.

The submission metric is misleading

The first reason it is misleading is that submissions are a moment.

A submission happens, gets counted, and is over.

If your product helps a team collect more submissions in a quarter, that is great, but the quarter ends. The graph resets. The metric does not compound.

The second reason it is misleading is that submissions hide what happens after.

A team that collects 200 submissions and follows up on zero is in worse shape than a team that collects 20 submissions and acts on every one. The submission count cannot tell the difference. If anything, it makes the first team look ten times better.

The third reason it is misleading is that submissions reward the wrong product behavior.

If submission count is the metric, the product team will keep optimizing the intake surface. Bigger submit button. Cleaner mobile UX. Fewer fields. More themes. More animations.

All of that is fine, but it is the easy 25% of the product.

The other 75%, which is about what happens to the responses over time, gets no metric pressure at all.

So I changed the metric.

The metric I actually wanted

What I wanted to grow was not submissions.

What I wanted to grow was the number of times someone on the team came back to a past response and got value from it.

I started calling that "recall."

A recall is when:

A teammate opens a response from three months ago to remember what the customer asked.
A salesperson looks up the prior submissions of an account before reaching out.
A product manager pulls feedback from the last 90 days to inform a roadmap decision.
An AI client searches the response history to answer a respondent's follow-up question.
A compliance officer audits the decisions made on a response months later.

Each of those is a recall.

A recall is the moment a response stops being event data and starts being memory.

A team that recalls often is a team for which the form responses became an asset.

A team that never recalls is a team for which the form responses became a graveyard.

A four-message conversation that turned out to be the product

The clearest moment I had where the metric swap clicked was watching a customer use the product in a way I had not designed for.

They opened a chat with FORMLOVA and asked:

> Show me the names from the response list.

The product replied with a list of twelve names, working from already-fetched context rather than re-querying. Then they asked:

> List the names of everyone whose number of attendees is 3.

Two names came back: 小林直樹 and 鈴木花子. Then they asked:

> Please show me everything those people submitted.

Full details came back: response IDs 39121 and 39116, submission timestamps, email addresses, phone numbers, the "Number of attendees: 3" field, and short free-text notes from each respondent.

Then the kicker:

> Set those two to the "In progress" status.

Reply:

Updating both records together.
Done. Changed 2 records, 小林 直樹 (3/26) and 鈴木 花子 (3/8), to "In progress."

Four messages. A bulk status update on a filtered subset of responses, with no dashboard ever opened. The status transition was logged in audit_logs and attached to the records as decision provenance, and the responses now had a stable status that would survive even if the original operator left the company.

This was not a single submission. It was a moment of recall, layered on top of submissions that had happened weeks earlier. The submissions had been counted long ago. The value was being generated right now, in the act of recalling and acting on them.

If I had been measuring submissions, that moment would not have moved any metric I cared about. The submissions had already happened. The recall happened because the responses still had stable identifiers, durable status, and a chat surface that could operate on them as a record set.

After enough of those moments, I stopped looking at the submission graph.

Why this changes everything

When you swap from "submit volume" to "recall rate," your product priorities shift in ways that are uncomfortable at first.

You stop investing in features that look great in a launch tweet but only matter at the moment of submit.

You start investing in:

Response search across all forms a team has ever published.
Stable respondent identity across forms.
Decision history that survives a form redesign.
A record that stays legible after the original form was retired.
Exposing the response history as an MCP surface so AI clients can recall it cheaply.

None of these features look impressive in a launch demo.

All of them are what makes the product valuable at year three.

In FORMLOVA, this shows up in small concrete decisions. Bulk status updates happen on the response record, not in a filter view, so they survive a retired form. The respondent identifier is a separate column (respondent_identifier), not derived from whatever field the form happened to ask for. The spam classification is stored as state on the response (spam_label, spam_label_source), not in a presentation filter. The audit_logs table records every L1-or-above operation with cursor-based pagination, and chat can ask for it via the get_audit_logs tool. Every one of these is the kind of feature that has zero punch on a launch tweet and high leverage at year three.
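To make "cursor-based pagination" over an audit trail concrete, here is a minimal Python sketch. The entry shape, the function names, and the integer cursor are all assumptions made for illustration; this post does not show the actual parameters of get_audit_logs.

```python
def page_audit_logs(entries, cursor=None, limit=2):
    """Return (page, next_cursor) for a list of audit entries.

    Hypothetical shape: each entry is a dict with a monotonically
    increasing integer "id". The cursor is the id of the last entry
    the caller has already seen, or None for the first page.
    """
    ordered = sorted(entries, key=lambda e: e["id"])
    remaining = [e for e in ordered if cursor is None or e["id"] > cursor]
    page = remaining[:limit]
    # Only hand back a cursor when there is something left to fetch.
    next_cursor = page[-1]["id"] if len(remaining) > limit else None
    return page, next_cursor


def iter_audit_logs(entries, limit=2):
    """Walk every page in order by following the cursor until it runs out."""
    cursor = None
    while True:
        page, cursor = page_audit_logs(entries, cursor, limit)
        yield from page
        if cursor is None:
            break
```

The point of a cursor over an offset is that new audit entries arriving between two calls do not shift the pages a client has already read.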

This is the kind of trade most form products do not want to make, because the launch demo is also the founder's mental model of the product. Recall does not photograph well. Submit count does.

I keep choosing the unphotogenic one.

A pricing question that follows from the metric

There is a downstream consequence I did not expect when I made the metric swap.

If recall is the metric, then the data needs to stay with the operator forever. You cannot hold the responses hostage to a plan upgrade and still claim you are growing recall. So FORMLOVA's pricing got reshaped around a clean line: data export is on every plan, including Free; what sits behind the paywall is the ongoing operational layer.

The plans now look like this:

Free   : forms unlimited, responses unlimited, CSV/Excel export,
         response search, status management, basic workflows,
         basic dashboard/analytics
Standard 480 yen/month : auto replies, reminders, conditional emails,
                          scheduled actions, detailed analytics, imports,
                          CRM sync, email branding, Google Sheets ongoing sync,
                          higher email caps
Premium  980 yen/month : mailing list bulk sends, drip email sequences,
                          paid event forms (Stripe Connect), team management,
                          audit logs, higher email caps

On every plan, the data stays within the operator's reach. Free plan operators still get full CSV/Excel export, full response search, full status management, full workflow basics. The Standard upgrade is the long-running sync layer (Sheets sync, auto replies, conditional emails, scheduling). The Premium upgrade is the team operations layer (bulk sends, drip sequences, audit logs, Stripe Connect for paid events).

The reason this matters for the metric swap is straightforward. If submission count were the metric, the natural pricing move would be to throttle volume at Free. Charge for more submissions. Charge for the right to keep more responses. Charge for getting your data out.

If recall is the metric, that pricing is incoherent. The product cannot grow recall while making it harder to recall. So the pricing decision had to follow the metric decision.

I think most form products have the pricing they have because they are still measuring submissions. The recall framing makes a different shape of pricing look obvious.

What it does to product strategy

The recall framing also changes how I think about competitive positioning.

Most of the form market competes on intake.

Better visual builder. Better mobile rendering. Better integrations to the most popular notification surfaces. Better AI form drafting.

That is a crowded space and the wins are linear.

The recall side of the market is barely contested.

Almost no form product treats response history as a first-class long-lived asset.

The dashboards I have seen are designed for "this week," "this month," and "export to CSV."

That is the surface you would design if you thought of responses as events.

It is not the surface you would design if you thought of responses as records.

The unbuilt product is the recall product. The one where a team can ask: "what did this customer ever say to us?" and get an honest, ordered, decision-aware answer in one place.

I do not think anyone has properly shipped that for the long tail of small and medium teams.

That is the wedge I would rather sit on than another visual builder.

What I tell other founders

If you are building anything in the data-collection category, ask which metric you would defend.

If you would defend submission volume, you are building a campaign tool. That is fine, but it is a different business with a different lifetime.

If you would defend recall rate, you are building a memory tool. That is harder to grow in month one and easier to keep in year three.

Most founders pick submit because it grows faster.

A few pick recall because it lasts longer.

The choice is a positioning decision, not a feature decision.

It is also harder to undo than most people realize. A team that has been measured on submissions for two years has trained its product instincts around the intake side. Asking that team to suddenly invest in record-layer features feels like a distraction.

The earlier you make the swap, the cheaper it is.

There is a subtler version of this trap I keep running into. "Slack notification" looks like a signal that the system is working: messages are going out, the channel is busy, the dashboard shows fired notifications. It is not the same as the team responding. If Slack is the only signal you are watching, a response that pinged Slack and never got owned is indistinguishable from one that pinged Slack and got handled.

The recall metric forces a different question: did the team come back to this response with intent, or did the response merely trigger a ping? FORMLOVA tracks that distinction by keeping the response status (new / in_progress / done / excluded) on the record itself, advancing it automatically when an operator replies via reply_to_respondent, and surfacing the gap when a notification fired but the status never moved.

A Slack notification is not ownership. An enabled auto-reply is not a delivered one. Build the metrics around the second half of each pair.
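The "notification fired but the status never moved" gap can be sketched in a few lines of Python. The record shape, the notified_at field, and the 24-hour grace window are assumptions for illustration, not FORMLOVA's actual schema or thresholds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ResponseRecord:
    # Hypothetical record shape; only the status values come from the post.
    response_id: int
    status: str                      # new / in_progress / done / excluded
    notified_at: Optional[datetime]  # when a notification fired, if ever


def pinged_but_unowned(records, now, grace=timedelta(hours=24)):
    """Return ids of responses that were notified but are still 'new'
    after the grace window (an assumed threshold)."""
    return [
        r.response_id
        for r in records
        if r.notified_at is not None
        and r.status == "new"
        and now - r.notified_at > grace
    ]
```

A report built on this list measures ownership directly, instead of inferring it from how busy the channel looks.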

What FORMLOVA does about it

In practical terms, this is what changed in the product after the metric swap.

Responses keep a stable identifier (response IDs such as 39121 and 39116 in the example above) that does not break when the form is edited or republished.

Respondent identity exists as its own concept via respondent_identifier, so the same person across multiple forms is visible as one history.
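As a rough sketch of what "one history per person" means, the Python below groups responses from different forms by respondent_identifier. Only that column name comes from the product; the dict shape, the other field names, and the function name are hypothetical.

```python
from collections import defaultdict


def history_by_respondent(responses):
    """Group responses from any number of forms into one timeline per person.

    Assumed shape: each response is a dict with "respondent_identifier",
    "form_id", "response_id", and an ISO-8601 "submitted_at" string.
    """
    history = defaultdict(list)
    # ISO-8601 timestamps in the same format sort correctly as plain strings.
    for r in sorted(responses, key=lambda r: r["submitted_at"]):
        history[r["respondent_identifier"]].append(r)
    return dict(history)
```

Because the key is its own column rather than whatever field a given form asked for, the grouping keeps working when a form is redesigned or retired.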

Decisions made on a response (excluded, tagged, owned, replied) live on the record as append-only provenance. A retired form does not erase those decisions. The audit log preserves who decided what, when.

Status (new, in_progress, done, excluded) is part of the record and transitions are audit-logged. Bulk status updates land atomically from chat, as shown in the four-message example earlier.
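Here is a minimal sketch of an atomic, audit-logged bulk status update, using Python's built-in sqlite3 as a stand-in database. The table layout, column names, and function names are assumptions for illustration; only the four status values and the audit_logs name come from the product.

```python
import sqlite3

VALID_STATUSES = {"new", "in_progress", "done", "excluded"}


def bulk_update_status(conn, response_ids, new_status, actor):
    """Move every listed response to new_status in one transaction,
    writing one audit row per transition. All or nothing."""
    if new_status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {new_status}")
    with conn:  # commits on success, rolls back if anything raises
        for rid in response_ids:
            row = conn.execute(
                "SELECT status FROM responses WHERE id = ?", (rid,)
            ).fetchone()
            if row is None:
                raise LookupError(f"no such response: {rid}")
            conn.execute(
                "UPDATE responses SET status = ? WHERE id = ?", (new_status, rid)
            )
            conn.execute(
                "INSERT INTO audit_logs (response_id, actor, old_status, new_status)"
                " VALUES (?, ?, ?, ?)",
                (rid, actor, row[0], new_status),
            )
    return len(response_ids)


def make_demo_db():
    """Build an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE responses (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE audit_logs (id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " response_id INTEGER, actor TEXT, old_status TEXT, new_status TEXT)"
    )
    conn.executemany(
        "INSERT INTO responses (id, status) VALUES (?, 'new')", [(39121,), (39116,)]
    )
    conn.commit()
    return conn
```

If any id in the batch is missing, the whole batch rolls back, so a report never sees a half-applied update or an audit row for a change that did not happen.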

Spam classification is stored as state (spam_label plus spam_label_source), not generated on the fly per query. Reports asking "exclude sales pitches" return the same answer each time.

Search across all responses, not only inside a single form, is a first-class operation via MCP tools like search_responses, list_responses_by_respondent, and list_response_decisions.

The MCP layer exposes 129 tools across 25 categories. Eleven L3 tools and publish_form are protected by HMAC-signed confirmation_token gates with a 5-minute TTL, so AI clients cannot accidentally make irreversible decisions on past records.
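For readers who have not built one, an HMAC-signed confirmation token with a 5-minute TTL can be sketched with Python's standard hmac module. The token layout, the function names, and the form_42 target id are hypothetical; the post only states that the token is HMAC-signed and expires after five minutes.

```python
import hashlib
import hmac
import time

TOKEN_TTL_SECONDS = 5 * 60  # the 5-minute TTL mentioned above


def issue_confirmation_token(secret, tool, target_id, now=None):
    """Sign "tool:target_id:issued_at" so the token is bound to one action.
    Assumes tool and target_id contain no ':' characters."""
    issued_at = int(time.time() if now is None else now)
    payload = f"{tool}:{target_id}:{issued_at}"
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}:{sig}"


def verify_confirmation_token(secret, token, tool, target_id, now=None):
    """Accept the token only if the signature matches, it names this exact
    tool and target, and it is no older than the TTL."""
    now = time.time() if now is None else now
    try:
        payload, sig = token.rsplit(":", 1)
        tok_tool, tok_target, issued_at = payload.split(":")
        issued = int(issued_at)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    if (tok_tool, tok_target) != (tool, target_id):
        return False
    return 0 <= now - issued <= TOKEN_TTL_SECONDS
```

In this sketch the server issues the token when a protected tool is first requested and runs the action only when the same token comes back in a second call, which gives a human the chance to confirm in between.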

The dashboard, when it shows numbers, shows recall-shaped numbers: how often were past responses queried, how often were old responses surfaced into a workflow, how often did the team revisit a respondent's history.

None of this prevents the product from helping with the intake side. It just stops the intake from being the only side that gets investment.

What I am still not claiming

I do not claim "recall is easy to measure." It is not. The internal definition of recall has evolved several times. The cleanest version I have today is "an operator-initiated read or write against a response that landed more than 24 hours ago, counted weekly per operator." That is rough. I keep refining it.
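That working definition is concrete enough to sketch in Python. The event shape and names below are hypothetical, and bucketing by ISO week is my assumption about what "counted weekly" means here.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta

RECALL_MIN_AGE = timedelta(hours=24)  # "landed more than 24 hours ago"


@dataclass(frozen=True)
class AccessEvent:
    # Hypothetical event shape: one read or write against one response.
    operator: str
    response_submitted_at: datetime
    accessed_at: datetime
    operator_initiated: bool = True  # False for automated jobs and syncs


def is_recall(event):
    """A recall is an operator-initiated access to a response older than 24h."""
    age = event.accessed_at - event.response_submitted_at
    return event.operator_initiated and age > RECALL_MIN_AGE


def weekly_recalls(events):
    """Count recalls per (operator, ISO year, ISO week)."""
    counts = Counter()
    for e in events:
        if is_recall(e):
            iso = e.accessed_at.isocalendar()
            counts[(e.operator, iso[0], iso[1])] += 1
    return counts
```

Same-day triage and automated syncs fall out of the count, so what remains is the "came back to it later" behavior the metric is meant to capture.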

I do not claim recall is the only metric I look at. I look at submit volume too, because operators care about it, and a product that does not help operators collect submissions reliably is not a form product. I just stopped using it as the north star.

I do not claim chat is faster for every task. Dashboards are still the right surface for visual scanning. FORMLOVA keeps a dashboard for inspection, billing, and team management.

I do not claim "zero LLM cost." The spam classifier on the response is a small server-side model (about $0.0002 per response, opt-in per form). All other LLM work, including the reasoning behind the four-message status update above, runs in the user's MCP client, not on FORMLOVA's servers.

The metric swap is not a magic trick. It is a way of pointing the product at a different center of gravity.

The honest summary

Submission count is a vanity number that resets every quarter.

Recall rate is a slow number that compounds every year.

If you are building anything where the data you collect will still be around long after the moment it arrived, you should at least know which one you are growing.

I am growing the slow one.

The form is the intake.

The record is the product.

Submit is the moment.

Recall is the asset.