I built a $5/1k-listing data product from two messy real estate portals

Hey IH,

I’ve been building a small B2B data product in a niche that is much messier than I expected:

commercial real estate listings.

The workflow I kept noticing is not glamorous.

It looks like this:

open LoopNet
open Crexi
search the same market twice
copy listings into a spreadsheet
remove duplicates
compare price / cap rate / sqft
check who the broker is
save the source links
repeat later

That is the whole pain.

Not “AI.”
Not “next-gen real estate intelligence.”
Not a huge dashboard.

Just two portals, messy rows, duplicate listings, missing fields, and a spreadsheet someone does not want to rebuild again.

So I built a small Apify actor around that workflow.

The idea is simple:

take public LoopNet + Crexi results and return one cleaner market file.

The current pricing is roughly:

~$5 / 1,000 listings

That pricing is part of the product thesis.
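To make that number concrete, here is a back-of-the-envelope sketch. It assumes a flat rate of about $5 per 1,000 listings with no minimums or extra platform fees, which is a simplification; the function name and default are illustrative only.

```python
def estimated_cost_usd(listings: int, price_per_1k_usd: float = 5.0) -> float:
    """Approximate run cost at a flat per-1,000-listings rate (assumed)."""
    return round(listings / 1000 * price_per_1k_usd, 2)


# A few market sizes an analyst might pull.
for n in (250, 1_000, 20_000):
    print(f"{n} listings -> about ${estimated_cost_usd(n):.2f}")
```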

I’m not trying to replace a full enterprise CRE platform. That would be dishonest and unrealistic.

The smaller wedge is:

can a broker, analyst, or data team get a useful first-pass market file cheaply, before doing deeper research?

That felt like a better MVP than building a full SaaS UI on day one.

What I underestimated

At first, I thought the hard part would be collecting listings.

It wasn’t.

The harder part was making the output believable.

Because in a vertical data product, a row is only useful if the user understands what it means.

For example:

Is this listing from LoopNet, Crexi, or both?
Is this a sale listing or lease listing?
Is this property duplicated across platforms?
Is the cap rate declared, implied, or estimated?
Is the broker phone/email missing because of a bug, or because the source did not expose it?
Is the listing detail enriched, search-only, or unavailable?
Can this be exported into Excel, Sheets, a CRM, or an API workflow without cleaning it again?

That is where the product started to become more than “collect data.”
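The cap-rate question is a good example of why a label matters. A minimal sketch, not the actor's actual code: it uses the standard definition (cap rate = net operating income / price) and tags each value with where it came from. The function name, argument names, and label strings are assumptions, and the "estimated" case is left out because it would need a model of its own.

```python
from typing import Optional, Tuple


def cap_rate_context(
    declared: Optional[float],
    noi: Optional[float],
    price: Optional[float],
) -> Tuple[Optional[float], str]:
    """Return (cap_rate, label) so the row says where the number came from."""
    if declared is not None:
        # The portal published a cap rate: pass it through untouched.
        return declared, "declared"
    if noi is not None and price:
        # No published value, but NOI and price allow the standard formula.
        return round(noi / price, 4), "implied"
    # Not enough information: say so instead of leaving a bare blank.
    return None, "unavailable"


print(cap_rate_context(None, 120_000, 2_000_000))
```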

The actor now returns fields like:

source links
transaction type
asset class
asking price / rent fields
cap-rate context
days-on-market context when available
broker name / company when available
duplicate signals
also_listed_on
enrichment_status
data_quality_notes

Those last two fields ended up mattering more than I expected.
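For a sense of what a duplicate signal can look like, here is a deliberately naive sketch that groups rows by a normalized address and fills in also_listed_on. It is not how the actor does it: the input keys (`source`, `url`, `address`) are hypothetical, and real matching would also have to handle suite numbers, abbreviations like "St" vs "Street", and listings with no address at all.

```python
import re
from collections import defaultdict


def address_key(address: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", address.lower())
    return " ".join(cleaned.split())


def merge_listings(rows):
    """Group rows that share an address key and record the other sources."""
    groups = defaultdict(list)
    for row in rows:
        groups[address_key(row["address"])].append(row)

    merged = []
    for group in groups.values():
        primary = dict(group[0])  # keep the first row seen as the base record
        primary["source_links"] = [r["url"] for r in group]
        primary["also_listed_on"] = sorted(
            {r["source"] for r in group} - {primary["source"]}
        )
        merged.append(primary)
    return merged


rows = [
    {"source": "loopnet", "url": "https://example.com/a", "address": "100 Main St., Dallas, TX"},
    {"source": "crexi", "url": "https://example.com/b", "address": "100 main st dallas tx"},
    {"source": "crexi", "url": "https://example.com/c", "address": "5 Elm Ave, Dallas, TX"},
]
merged = merge_listings(rows)
print(len(merged), merged[0]["also_listed_on"])
```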

If a field is missing, I want the output to explain why as much as possible.

Because a blank cell is scary in a data product.

It can mean:

not available
not supported
blocked
not enriched
bad parser
wrong source

Those are very different things.
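One way to keep those cases apart is to give each one an explicit code and attach it to the row. This is only a sketch under assumptions: the enum values mirror the list above, but the string spellings, the helper name, and the note format are made up for illustration and are not the actor's real data_quality_notes format.

```python
from enum import Enum


class MissingReason(str, Enum):
    NOT_AVAILABLE = "not_available"  # the source does not expose the field
    NOT_SUPPORTED = "not_supported"  # the tool does not collect it
    BLOCKED = "blocked"              # the request was refused
    NOT_ENRICHED = "not_enriched"    # the detail page was never fetched
    PARSER_ERROR = "parser_error"    # the field exists but parsing failed
    WRONG_SOURCE = "wrong_source"    # the field belongs to the other portal


def annotate_missing(row: dict, reasons: dict) -> dict:
    """Add a note for every field that is empty and has a known reason."""
    notes = [
        f"{field}: {reason.value}"
        for field, reason in reasons.items()
        if row.get(field) in (None, "")
    ]
    return {**row, "data_quality_notes": notes}


row = {"broker_phone": None, "asking_price": 1_500_000}
reasons = {
    "broker_phone": MissingReason.NOT_AVAILABLE,
    "asking_price": MissingReason.PARSER_ERROR,  # ignored: the field is present
}
annotated = annotate_missing(row, reasons)
print(annotated["data_quality_notes"])
```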

Why I used Apify instead of building a SaaS first

I chose Apify because I wanted to test the workflow before building the whole company around it.

Apify gives me:

hosted runs
datasets
scheduling
API access
exports
billing
marketplace discovery

That lets the first version be just:

input filters -> run -> clean CSV / Excel / JSON / API

No login system. No custom billing. No dashboard. No onboarding flow that might hide whether the data itself is useful.
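From a developer's side, that flow could look roughly like the sketch below, using the apify-client Python package. The actor ID is inferred from the actor's public URL, and the input fields (`market`, `maxListings`) are hypothetical placeholders, so check the actor's real input schema before relying on them. It also assumes an `APIFY_TOKEN` environment variable.

```python
import os

# Inferred from the actor's public URL; treat as an assumption.
ACTOR_ID = "kazkn/commercial-real-estate-brokerage-intel"


def build_run_input(market: str, max_listings: int = 1000) -> dict:
    """Hypothetical input filters; the real schema may use other names."""
    return {"market": market, "maxListings": max_listings}


def fetch_market_file(market: str) -> list:
    """Run the actor once and return its dataset items as a list of dicts."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(ACTOR_ID).call(run_input=build_run_input(market))
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__" and os.environ.get("APIFY_TOKEN"):
    items = fetch_market_file("Dallas, TX")
    print(f"fetched {len(items)} listings")
```

The same dataset can also be downloaded as CSV, Excel, or JSON from the Apify console, so no code is required for the spreadsheet use case.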

Just the core question:

will people pay for the clean file?

The current actor is here for context:

https://apify.com/kazkn/commercial-real-estate-brokerage-intel?fpr=8fp2od

What I’m trying to validate now

The first version is live, and the early traffic is teaching me that the positioning matters a lot.

If I say:

LoopNet + Crexi data product

devs understand it.

If I say:

clean first-pass CRE market file

brokers probably understand it better.

Same product, different mental model.

That is the part I’m still figuring out.

I think the next proof asset should probably be market-specific, not generic.

Something like:

1,000 Dallas CRE listings
source links
duplicate signals
cap-rate context
broker/company fields
exportable CSV/API
total cost: around $5

That feels more concrete than another feature list.

My current lesson

For small B2B data products, the MVP might not be a dashboard.

It might be a better spreadsheet.

If someone is already rebuilding the same file every week, the product can start as:

cleaner rows
better provenance
clearer missing-data notes
cheaper access
faster export

Then the market can tell you whether it wants an API, a SaaS UI, a done-for-you report, or something else.

That is what I’m trying to learn here.

Question for IH:

If you were evaluating a niche B2B data product like this, what would make you trust it fastest?

a public sample dataset by market
a side-by-side “manual workflow vs clean file” demo
a transparent output schema
a short video showing input -> run -> export
a case study like “1,000 listings for ~$5”

Also curious:

Have you ever turned a repeated spreadsheet workflow into a paid product?


I’m intentionally avoiding the “full CoStar competitor” positioning.

That is not credible for a small actor, and it is not the actual wedge.

The narrower promise is:

public LoopNet + Crexi listings -> one cleaner first-pass market file, around $5 / 1,000 listings.

That feels more honest, and probably easier for users to trust.

KazKN · 2 months ago