I built a $5/1k-listing data product from two messy real estate portals

Hey IH,

I’ve been building a small B2B data product in a niche that is much messier than I expected:

commercial real estate listings.

The workflow I kept noticing is not glamorous.

It looks like this:

open LoopNet
open Crexi
search the same market twice
copy listings into a spreadsheet
remove duplicates
compare price / cap rate / sqft
check who the broker is
save the source links
repeat later

That is the whole pain.

Not “AI.”
Not “next-gen real estate intelligence.”
Not a huge dashboard.

Just two portals, messy rows, duplicate listings, missing fields, and a spreadsheet someone does not want to rebuild again.

So I built a small Apify actor around that workflow.

The idea is simple:

take public LoopNet + Crexi results and return one cleaner market file.
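
To make that concrete, here is a rough Python sketch of the merge step. The field names (address, url, source, also_listed_on) and the address-only matching rule are assumptions for illustration, not the actor's real logic, which would likely need fuzzier matching.

```python
import re
from typing import Dict, List


def normalize_address(address: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", address.lower())
    return " ".join(cleaned.split())


def merge_listings(loopnet: List[dict], crexi: List[dict]) -> List[dict]:
    """Merge rows from both portals into one list, keyed by normalized address.

    Assumes every row has an "address" and a "url" key (hypothetical schema).
    """
    merged: Dict[str, dict] = {}
    for source, rows in (("loopnet", loopnet), ("crexi", crexi)):
        for row in rows:
            key = normalize_address(row["address"])
            if key not in merged:
                # First time this property is seen: keep the row and record provenance.
                merged[key] = {
                    **row,
                    "source": source,
                    "also_listed_on": [],
                    "source_links": [row["url"]],
                }
                continue
            existing = merged[key]
            if source != existing["source"] and source not in existing["also_listed_on"]:
                existing["also_listed_on"].append(source)
            existing["source_links"].append(row["url"])
            # Fill fields the first source left empty, without overwriting anything.
            for field, value in row.items():
                if existing.get(field) is None:
                    existing[field] = value
    return list(merged.values())
```

Keeping every source link on the merged row means a user can still click through to both portals and check the match themselves.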

The current pricing is roughly:

~$5 / 1,000 listings

That pricing is part of the product thesis.
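
As back-of-envelope arithmetic, assuming the price scales linearly with listing count (the rate is approximate, and the function name is just for illustration):

```python
PRICE_PER_1000_USD = 5.0  # approximate rate quoted in the post


def estimated_cost_usd(listing_count: int) -> float:
    """Rough cost estimate, assuming simple linear pricing per listing."""
    if listing_count < 0:
        raise ValueError("listing_count must be non-negative")
    return round(listing_count / 1000 * PRICE_PER_1000_USD, 2)
```

So a 250-listing pull would land around $1.25 under that assumption.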

I’m not trying to replace a full enterprise CRE platform. That would be dishonest and unrealistic.

The smaller wedge is:

can a broker, analyst, or data team get a useful first-pass market file cheaply, before doing deeper research?

That felt like a better MVP than building a full SaaS UI on day one.

What I underestimated

At first, I thought the hard part would be collecting listings.

It wasn’t.

The harder part was making the output believable.

Because in a vertical data product, a row is only useful if the user understands what it means.

For example:

Is this listing from LoopNet, Crexi, or both?
Is this a sale listing or lease listing?
Is this property duplicated across platforms?
Is the cap rate declared, implied, or estimated?
Is the broker phone/email missing because of a bug, or because the source did not expose it?
Is the listing detail enriched, search-only, or unavailable?
Can this be exported into Excel, Sheets, a CRM, or an API workflow without cleaning it again?

That is where the product started to become more than “collect data.”
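
The cap-rate question is a good example of why context matters. A minimal sketch, assuming hypothetical inputs for a declared cap rate, net operating income (NOI), and asking price: an implied cap rate is just NOI divided by price, and the row should say which kind it is carrying. The labels here are illustrative, and the "estimated" case is left out because it depends on a model this sketch does not have.

```python
from typing import Optional, Tuple


def cap_rate_context(
    declared_cap_rate: Optional[float],
    noi: Optional[float],
    asking_price: Optional[float],
) -> Tuple[Optional[float], str]:
    """Return (cap rate in percent, label saying where the number came from)."""
    if declared_cap_rate is not None:
        # The listing states a cap rate directly.
        return declared_cap_rate, "declared"
    if noi is not None and asking_price:
        # Implied cap rate = NOI / asking price, expressed as a percentage.
        return round(noi / asking_price * 100, 2), "implied"
    # Not enough data to say anything; do not guess.
    return None, "unavailable"
```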

The actor now returns fields like:

source links
transaction type
asset class
asking price / rent fields
cap-rate context
days-on-market context when available
broker name / company when available
duplicate signals
also_listed_on
enrichment_status
data_quality_notes

Those last two fields ended up mattering more than I expected.

If a field is missing, I want the output to explain why, as far as it can.

Because a blank cell is scary in a data product.

It can mean:

not available
not supported
blocked
not enriched
bad parser
wrong source

Those are very different things.
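
One way to keep those cases apart is to never write a blank without a reason next to it. A minimal sketch, assuming data_quality_notes is a list of short strings (the real field's format may differ) and using a hypothetical set_field helper:

```python
from enum import Enum
from typing import Any, Optional


class MissingReason(Enum):
    """The different things a blank cell can mean."""
    NOT_AVAILABLE = "not available"
    NOT_SUPPORTED = "not supported"
    BLOCKED = "blocked"
    NOT_ENRICHED = "not enriched"
    BAD_PARSER = "bad parser"
    WRONG_SOURCE = "wrong source"


def set_field(row: dict, field: str, value: Any,
              reason: Optional[MissingReason] = None) -> dict:
    """Set a field; if the value is missing, require and record a reason."""
    row[field] = value
    if value is None:
        if reason is None:
            raise ValueError(f"{field} is blank but no reason was given")
        row.setdefault("data_quality_notes", []).append(f"{field}: {reason.value}")
    return row
```

For example, `set_field(row, "broker_phone", None, MissingReason.NOT_AVAILABLE)` leaves the cell empty but adds "broker_phone: not available" to the notes, so the reader can tell a source gap from a bug.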

Why I used Apify instead of building a SaaS first

I chose Apify because I wanted to test the workflow before building the whole company around it.

Apify gives me:

hosted runs
datasets
scheduling
API access
exports
billing
marketplace discovery

That lets the first version be just:

input filters -> run -> clean CSV / Excel / JSON / API

No login system. No custom billing. No dashboard. No onboarding flow that might hide whether the data itself is useful.

Just the core question:

will people pay for the clean file?

The current actor is here for context:

https://apify.com/kazkn/commercial-real-estate-brokerage-intel?fpr=8fp2od
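
For the API route, the "input filters -> run -> rows" flow might look roughly like this from Python. This is a sketch, not the actor's documented usage: the actor ID is read off the URL above, the client is expected to be an ApifyClient from the apify-client package, and the run_input keys in the note below are made up, so check the actor's input schema for the real ones.

```python
from typing import Any, Dict, List

# Assumed from the actor URL in the post ("username/actor-name" form).
ACTOR_ID = "kazkn/commercial-real-estate-brokerage-intel"


def fetch_market_file(client: Any, run_input: Dict[str, Any]) -> List[dict]:
    """Run the actor and return its dataset items as a list of rows.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    # Start the actor and wait for the run to finish.
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return a result")
    # Each run writes its rows to a default dataset; read them all back.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Usage would be something like `fetch_market_file(ApifyClient("<APIFY_TOKEN>"), {"market": "Dallas"})`, where the "market" key is a placeholder, not a confirmed input field.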

What I’m trying to validate now

The first version is live, and the early traffic is teaching me that the positioning matters a lot.

If I say “LoopNet + Crexi data product,” devs understand it.

If I say “clean first-pass CRE market file,” brokers probably understand it better.

Same product, different mental model.

That is the part I’m still figuring out.

I think the next proof asset should probably be market-specific, not generic.

Something like:

1,000 Dallas CRE listings
source links
duplicate signals
cap-rate context
broker/company fields
exportable CSV/API
total cost: around $5

That feels more concrete than another feature list.

My current lesson

For small B2B data products, the MVP might not be a dashboard.

It might be a better spreadsheet.

If someone is already rebuilding the same file every week, the product can start as:

cleaner rows
better provenance
clearer missing-data notes
cheaper access
faster export

Then the market can tell you whether it wants an API, a SaaS UI, a done-for-you report, or something else.

That is what I’m trying to learn here.

Question for IH:

If you were evaluating a niche B2B data product like this, what would make you trust it fastest?

a public sample dataset by market
a side-by-side “manual workflow vs clean file” demo
a transparent output schema
a short video showing input -> run -> export
a case study like “1,000 listings for ~$5”

Also curious:

Have you ever turned a repeated spreadsheet workflow into a paid product?

on June 12, 2026

    I’m intentionally avoiding the “full CoStar competitor” positioning.

    That is not credible for a small actor, and it is not the actual wedge.

    The narrower promise is:

    public LoopNet + Crexi listings -> one cleaner first-pass market file, around $5 / 1,000 listings.

    That feels more honest, and probably easier for users to trust.
