1,062 search impressions. 1 click. The number that broke our weekly review.

by Chris Leo

Build week 4 of Pulseboard, the marketing analytics tool I am building for service business agencies. Quick write-up of the find that landed hardest this week.

I pulled our own Google Search Console into the engine last Sunday. Over fourteen days, our agency site (movou.com) earned 1,062 organic impressions and exactly one click. The one click was on the brand query "movou."

Broken down by query, seven of the eight priority geo keywords rank on page one or page two. Three examples:

"lithia springs seo company": position 5, twenty-one impressions, zero clicks
"marin local search engine optimization": position 18, thirty-one impressions, zero clicks
"colbert seo": position 14, fourteen impressions, zero clicks

The position numbers are fine. The titles are not earning the click.

The reason this is invisible in most agency stacks is structural. GSC's Performance report opens with clicks and impressions selected; average CTR and average position are separate toggles, so the two are rarely read side by side. GA4 does not surface CTR at all (it lives in Search Console). Unless you are a dedicated SEO running a weekly title-rewrite ritual, you are looking at position-only dashboards and celebrating page-one rankings that are quietly bleeding traffic.

When we paired position and CTR on the same chart in Pulseboard, the "ranking-but-not-clicking" pattern lit up immediately. The recommendation we ended up writing was: rewrite titles for the seven priority queries to include city plus outcome (e.g., "Lithia Springs SEO Company. Rank in the 3-Pack in 90 Days.").
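A minimal sketch of that pairing, assuming the Search Console rows are already exported per query. The `QueryRow` class, the thresholds, and the position and impression numbers on the brand row are illustrative assumptions, not Pulseboard's actual code. The three geo rows use the figures quoted above.

```python
from dataclasses import dataclass


@dataclass
class QueryRow:
    query: str
    position: float  # average position as reported by Search Console
    impressions: int
    clicks: int

    @property
    def ctr(self) -> float:
        # Guard against division by zero for queries with no impressions.
        return self.clicks / self.impressions if self.impressions else 0.0


def ranking_but_not_clicking(rows, max_position=20.0, min_impressions=10, max_ctr=0.01):
    """Return rows that rank on page one or two but earn almost no clicks.

    The thresholds are assumed defaults: position <= 20 approximates
    "page one or two", and a CTR of 1% or less counts as "not clicking".
    """
    flagged = [
        r for r in rows
        if r.position <= max_position
        and r.impressions >= min_impressions
        and r.ctr <= max_ctr
    ]
    return sorted(flagged, key=lambda r: r.position)


rows = [
    QueryRow("lithia springs seo company", 5, 21, 0),
    QueryRow("marin local search engine optimization", 18, 31, 0),
    QueryRow("colbert seo", 14, 14, 0),
    # Hypothetical position and impressions: the post only says the brand
    # query earned the single click.
    QueryRow("movou", 1, 40, 1),
]

flagged = ranking_but_not_clicking(rows)
for r in flagged:
    print(f"{r.query}: position {r.position}, CTR {r.ctr:.1%}")

# Site-wide CTR from the headline numbers: 1 click on 1,062 impressions.
site_ctr = 1 / 1062
print(f"site CTR: {site_ctr:.2%}")
```

The brand row is not flagged because its CTR clears the assumed 1% cutoff; the three geo queries are, sorted by position.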

Design question I am chewing on:

When the tool spots this pattern, should it just flag the queries and let the user rewrite, or should it auto-generate three title variants from query intent and let you pick? I am leaning auto-generate because the manual rewrite step is where most agencies stall, but I am worried about producing LLM slop that ranks worse than the original.

Anyone here built CTR-aware recommendations for SEO? How did you split the heuristic vs LLM work? Where did the auto-suggestions break down on you?

Chris Leo

Posted to Productized Services on May 19, 2026

Built something close to this for a client (n8n + Claude API + GSC). Couple of patterns that survived production:
1. Don't have the LLM generate from scratch. Have it RANK.
The slop concern is real — LLM-from-scratch titles drift toward generic ("Top SEO Services for Lithia Springs Businesses 2026") and quietly lose keyword density. The fix is a 2-stage gen pipeline:
- Stage 1: heuristic template generator. For each query, produce 4-5 candidates from variables — {city} {service} | {outcome verb} in {timeframe}, {city}: {service} for {audience}, etc. Pure string interpolation, ~50 LoC.
- Stage 2: feed the 4-5 candidates + the original title + the actual ranking SERP context (from a Firecrawl /scrape of the page-one results) to Claude with a structured prompt: "Score these 1-5 on click-likelihood given the SERP context. Return the top one with a one-sentence rationale." Now you're using the LLM for judgment, not creativity. That's where it earns its keep.
2. Hard structural constraints in the prompt itself.
Claude responds well to explicit structured-output constraints — char count caps (≤60), required tokens (city must appear verbatim, service must appear verbatim), forbidden tokens (no "best", no "top", no "premier"). Without this, you get the "Best Top Premier Expert" adjective stack that ranks worse than the original.
3. Queue, don't auto-publish.
The 5-15 min daily review IS the moat. Auto-publish breaks the second Claude hallucinates a wrong city name (it will happen, and it'll be a P0 for your agency users). An approval gate eliminates the entire "why did our title flip to something weird?" support category. I'd frame the approval queue as a Pulseboard feature, not a workaround: agencies running SEO for 30+ clients simultaneously NEED that batch-review UX more than they need full automation.
4. Track before-after delta on the same query.
Without storing the pre-change ranking + CTR baseline, you can't tell whether the new title actually worked or whether something else moved, such as a competitor's listing changing. This is the single biggest "we tried it but couldn't measure it" failure on tools like this.
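A rough sketch of the before-after comparison, assuming you store one snapshot per query for equal-length windows on either side of the title change. The `Snapshot` class, the one-position tolerance, and the "after" numbers are hypothetical; the baseline reuses the Lithia Springs figures from the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    position: float
    impressions: int
    clicks: int

    @property
    def ctr(self) -> float:
        return self.clicks / self.impressions if self.impressions else 0.0


def title_change_delta(before: Snapshot, after: Snapshot, position_tolerance: float = 1.0) -> dict:
    """Compare the same query before and after a title change.

    If position held roughly steady, a CTR change is more plausibly due to
    the title. If position moved, the CTR change is confounded by ranking.
    """
    position_delta = after.position - before.position  # negative means moved up
    return {
        "ctr_delta": after.ctr - before.ctr,
        "position_delta": position_delta,
        "position_held": abs(position_delta) <= position_tolerance,
    }


baseline = Snapshot(position=5, impressions=21, clicks=0)       # from the post
after_change = Snapshot(position=5.2, impressions=25, clicks=2)  # hypothetical

delta = title_change_delta(baseline, after_change)
print(delta)
```

With counts this small, a couple of clicks swing the CTR a lot, so the delta is a signal for review, not proof.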

Direct answer to your heuristic-vs-LLM split: heuristics generate breadth, LLM picks. That inversion changes everything about your slop risk.
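The heuristic half of that split could look roughly like this. It is a sketch, not the pipeline I shipped: the template strings beyond the two quoted above, the function names, and the example inputs are assumptions for illustration. The LLM ranking stage is left out on purpose; the shortlist this produces is what you would hand to it.

```python
FORBIDDEN = {"best", "top", "premier"}  # adjectives banned from titles
MAX_CHARS = 60                          # character cap on the title

# The first two templates come from the pattern described above;
# the last two are made-up variations.
TEMPLATES = [
    "{city} {service} | {outcome} in {timeframe}",
    "{city}: {service} for {audience}",
    "{service} in {city} | {outcome}",
    "{city} {service}. {outcome} in {timeframe}.",
]


def generate_candidates(city, service, outcome, timeframe, audience):
    """Stage 1: pure string interpolation, no LLM involved."""
    fields = dict(city=city, service=service, outcome=outcome,
                  timeframe=timeframe, audience=audience)
    return [t.format(**fields) for t in TEMPLATES]


def passes_constraints(title, city, service):
    """Length cap, city and service verbatim, no forbidden words."""
    words = {w.strip(".,:|!?").lower() for w in title.split()}
    return (
        len(title) <= MAX_CHARS
        and city in title
        and service in title
        and not (words & FORBIDDEN)
    )


def shortlist(city, service, outcome, timeframe, audience):
    """Candidates that survive the hard constraints, ready for LLM scoring."""
    candidates = generate_candidates(city, service, outcome, timeframe, audience)
    return [c for c in candidates if passes_constraints(c, city, service)]


candidates = shortlist(
    city="Lithia Springs",
    service="SEO Company",
    outcome="Rank in the 3-Pack",
    timeframe="90 Days",
    audience="Local Businesses",
)
for c in candidates:
    print(len(c), c)
```

Running the same `passes_constraints` check on whatever the LLM returns is a cheap guard against the wrong-city case before anything reaches the approval queue.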
syednoor · 2 days ago