Hey IH,
I’ve been building a small B2B data product in a niche that is much messier than I expected:
commercial real estate listings.
The workflow I kept noticing is not glamorous.
It looks like this:
open LoopNet
open Crexi
search the same market twice
copy listings into a spreadsheet
remove duplicates
compare price / cap rate / sqft
check who the broker is
save the source links
repeat later
That is the whole pain.
Not “AI.”
Not “next-gen real estate intelligence.”
Not a huge dashboard.
Just two portals, messy rows, duplicate listings, missing fields, and a spreadsheet someone does not want to rebuild again.
So I built a small Apify actor around that workflow.
The idea is simple:
take public LoopNet + Crexi results and return one cleaner market file.
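To make "one cleaner market file" concrete, here is a minimal sketch of the merge step, not the actor's actual code. It assumes each raw row is a dict with hypothetical `source`, `url`, `address`, and `transaction_type` keys, and it matches duplicates with a deliberately crude normalized-address key. Real matching would need more than this (unit numbers, geocoding, fuzzy names).

```python
import re
from collections import defaultdict


def address_key(address: str) -> str:
    """Normalize an address into a crude matching key (illustrative heuristic)."""
    key = address.lower()
    key = re.sub(r"[^a-z0-9 ]", " ", key)    # replace punctuation with spaces
    key = re.sub(r"\bstreet\b", "st", key)   # collapse two common suffix variants
    key = re.sub(r"\bavenue\b", "ave", key)
    return " ".join(key.split())             # squeeze repeated whitespace


def merge_listings(rows: list[dict]) -> list[dict]:
    """Group rows that share an address key and transaction type into one row."""
    groups = defaultdict(list)
    for row in rows:
        groups[(address_key(row["address"]), row["transaction_type"])].append(row)

    merged = []
    for group in groups.values():
        primary = dict(group[0])  # keep the first-seen row as the base record
        primary["source_links"] = [r["url"] for r in group]
        primary["also_listed_on"] = sorted(
            {r["source"] for r in group} - {primary["source"]}
        )
        merged.append(primary)
    return merged


# Placeholder rows; the URLs are not real listings.
sample = [
    {"source": "loopnet", "url": "https://example.com/a",
     "address": "100 Main Street, Dallas, TX", "transaction_type": "sale"},
    {"source": "crexi", "url": "https://example.com/b",
     "address": "100 Main St. Dallas TX", "transaction_type": "sale"},
]
for row in merge_listings(sample):
    print(row["address"], row["also_listed_on"], row["source_links"])
```

Keeping every source link on the merged row means the duplicate signal stays auditable: the user can open both portals and check the match.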
The current pricing is:
~$5 / 1,000 listings
That pricing is part of the product thesis.
I’m not trying to replace a full enterprise CRE platform. That would be dishonest and unrealistic.
The smaller wedge is:
can a broker, analyst, or data team get a useful first-pass market file cheaply, before doing deeper research?
That felt like a better MVP than building a full SaaS UI on day one.
What I underestimated
At first, I thought the hard part would be collecting listings.
It wasn’t.
The harder part was making the output believable.
In a vertical data product, a row is only useful if the user understands what it means.
For example:
Is this listing from LoopNet, Crexi, or both?
Is this a sale listing or lease listing?
Is this property duplicated across platforms?
Is the cap rate declared, implied, or estimated?
Is the broker phone/email missing because of a bug, or because the source did not expose it?
Is the listing detail enriched, search-only, or unavailable?
Can this be exported into Excel, Sheets, a CRM, or an API workflow without cleaning it again?
That is where the product started to become more than “collect data.”
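The cap-rate question is a good example of what that labeling can look like. The sketch below is an assumption about one way to do it, not the actor's implementation. It treats a cap rate as "declared" when the listing states one, and as "implied" when it can be computed from the standard relation cap rate = net operating income / price. The keys `cap_rate`, `noi`, and `asking_price` are hypothetical names. The "estimated" case would need some model and is left out.

```python
from typing import Optional


def cap_rate_context(listing: dict) -> tuple[Optional[float], str]:
    """Return (cap_rate, label), where label says where the number came from."""
    declared = listing.get("cap_rate")
    if declared is not None:
        return declared, "declared"      # stated on the listing itself

    noi = listing.get("noi")             # annual net operating income
    price = listing.get("asking_price")
    if noi is not None and price:        # skip a missing or zero price
        return round(noi / price, 4), "implied"

    return None, "unavailable"           # nothing to report, and we say so


print(cap_rate_context({"cap_rate": 0.07}))
print(cap_rate_context({"noi": 65_000, "asking_price": 1_000_000}))
print(cap_rate_context({"asking_price": 1_000_000}))
```

Returning the label next to the number means a spreadsheet user can filter out implied values without guessing which rows they are.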
The actor now returns fields like:
source links
transaction type
asset class
asking price / rent fields
cap-rate context
days-on-market context when available
broker name / company when available
duplicate signals
also_listed_on
enrichment_status
data_quality_notes
Those last two fields ended up mattering more than I expected.
If a field is missing, I want the output to explain why as much as possible.
A blank cell is scary in a data product.
It can mean:
not available
not supported
blocked
not enriched
bad parser
wrong source
Those are very different things.
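Here is a small sketch of how those reasons could travel with a row instead of hiding behind a blank cell. The six labels come from the list above. The `annotate_missing` helper, the sample field names, and the note format are hypothetical illustrations, not the actor's real schema.

```python
from enum import Enum


class MissingReason(str, Enum):
    """Reasons a cell can be blank (labels from the post, definitions left open)."""
    NOT_AVAILABLE = "not available"
    NOT_SUPPORTED = "not supported"
    BLOCKED = "blocked"
    NOT_ENRICHED = "not enriched"
    BAD_PARSER = "bad parser"
    WRONG_SOURCE = "wrong source"


def annotate_missing(row: dict, reasons: dict[str, MissingReason]) -> dict:
    """Copy the row and record a reason for each field that is actually blank."""
    out = dict(row)
    notes = []
    for field, reason in reasons.items():
        if out.get(field) in (None, ""):   # only explain fields that are empty
            notes.append(f"{field}: {reason.value}")
    out["data_quality_notes"] = "; ".join(notes)
    return out


row = {"broker_name": "Example Broker", "broker_phone": None,
       "enrichment_status": "search-only"}
annotated = annotate_missing(row, {
    "broker_phone": MissingReason.NOT_ENRICHED,
    "broker_name": MissingReason.BAD_PARSER,   # ignored: the field is populated
})
print(annotated["data_quality_notes"])
```

A fixed set of labels also makes the notes filterable: someone can drop every "blocked" row, or retry only the "not enriched" ones.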
Why I used Apify instead of building a SaaS first
I chose Apify because I wanted to test the workflow before building the whole company around it.
Apify gives me:
hosted runs
datasets
scheduling
API access
exports
billing
marketplace discovery
That lets the first version be just:
input filters -> run -> clean CSV / Excel / JSON / API
No login system. No custom billing. No dashboard. No onboarding flow that might hide whether the data itself is useful.
Just the core question:
will people pay for the clean file?
The current actor is here for context:
https://apify.com/kazkn/commercial-real-estate-brokerage-intel?fpr=8fp2od
What I’m trying to validate now
The first version is live, and the early traffic is teaching me that the positioning matters a lot.
If I say:
LoopNet + Crexi data product
devs understand it.
If I say:
clean first-pass CRE market file
brokers probably understand it better.
Same product, different mental model.
That is the part I’m still figuring out.
I think the next proof asset should probably be market-specific, not generic.
Something like:
1,000 Dallas CRE listings
source links
duplicate signals
cap-rate context
broker/company fields
exportable CSV/API
total cost: around $5
That feels more concrete than another feature list.
My current lesson
For small B2B data products, the MVP might not be a dashboard.
It might be a better spreadsheet.
If someone is already rebuilding the same file every week, the product can start as:
cleaner rows
better provenance
clearer missing-data notes
cheaper access
faster export
Then the market can tell you whether it wants an API, a SaaS UI, a done-for-you report, or something else.
That is what I’m trying to learn here.
Question for IH:
If you were evaluating a niche B2B data product like this, what would make you trust it fastest?
a public sample dataset by market
a side-by-side “manual workflow vs clean file” demo
a transparent output schema
a short video showing input -> run -> export
a case study like “1,000 listings for ~$5”
Also curious:
Have you ever turned a repeated spreadsheet workflow into a paid product?
I’m intentionally avoiding the “full CoStar competitor” positioning.
That is not credible for a small actor, and it is not the actual wedge.
The narrower promise is:
public LoopNet + Crexi listings -> one cleaner first-pass market file, around $5 / 1,000 listings.
That feels more honest, and probably easier for users to trust.