Fixing broken scrapers instead of working on my actual product. So I made it my problem.

by lukem121

I run a small business that pulls social media data. Follower counts, video stats, engagement, etc. From the start I handled the scraping myself. Proxies, session management, dealing with platforms changing things every other week.

It worked fine for a while. Then something would break and I'd spend all day fixing it when I should have been focusing on marketing.

I figured I'm not the only person dealing with this. Anyone building social media tools, analytics dashboards, influencer platforms, or content trackers has this exact same problem. The scraping isn't their business either. It's just a thing that needs to work so they can get on with what actually matters.

So I built Social Fetch. One API, 8 platforms (Instagram, TikTok, YouTube, X, Threads, Pinterest, Snapchat, Reddit), consistent response format. Hit an endpoint, get your data, move on with your life.

2 months in:

312 users
121K+ requests served
~1,500 requests/day right now
Under 3 second average response time
Most used: TikTok video data and Instagram posts

The users are pretty much exactly who I expected. People building tools where social data is an input, not the output. Nobody's here because they can't figure out web scraping. They're here because they figured out it's not worth their time.

It's at socialfetch.dev. Would genuinely appreciate feedback from anyone building in this space. What's missing, what would make it more useful, what would get you to switch from whatever you're currently doing.

lukem121

on May 25, 2026


This is way too relatable 😅 feels like it’s always the side problems that eat most of the time

I’ve been trying to focus more on the core problem instead of polishing everything around it, but it’s surprisingly hard

Did you end up just ignoring the scrapers or fixing them eventually?

PlaySage · 3 hours ago

The thing that most people don't realize until they've been doing this a while is that reliability isn't about one thing — it's about the whole chain from proxy quality to TLS fingerprint to how you handle the escalation when a platform upgrades their bot detection mid-request. I've lost count of how many scraper services I've tested that work beautifully on day one and then quietly start returning stale data when Cloudflare changes their challenge flow.

One thing that would make me trust Social Fetch more as a potential customer: transparency about what happens when a platform pushes an anti-bot update. Do you have automated monitoring that detects response shape changes? What's the typical turnaround when something breaks? A public status page with per-platform health would go a long way, especially for people who've been burned by this before.
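The response-shape monitoring asked about here could look roughly like the following. This is a minimal sketch, not a description of how Social Fetch works: `shape_of` and `shape_drift` are hypothetical names, and the idea is simply to reduce each JSON payload to a set of "path:type" strings and compare it against a stored baseline.

```python
def shape_of(value, prefix="$"):
    """Reduce a JSON-like value to a set of "path:type" strings."""
    if isinstance(value, dict):
        paths = set()
        for key, child in value.items():
            paths |= shape_of(child, f"{prefix}.{key}")
        return paths or {f"{prefix}:dict"}
    if isinstance(value, list):
        paths = set()
        for item in value:
            paths |= shape_of(item, f"{prefix}[]")
        return paths or {f"{prefix}:list"}
    return {f"{prefix}:{type(value).__name__}"}


def shape_drift(baseline, current):
    """Report paths that disappeared or appeared relative to the baseline."""
    old, new = shape_of(baseline), shape_of(current)
    return {"missing": sorted(old - new), "added": sorted(new - old)}
```

A type change (say, a count that turns from an integer into a string) shows up as one missing path plus one added path, so a monitor could alert whenever `missing` is non-empty.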

312 users in 2 months is solid though. The fact that you built it because you lived the problem yourself is the strongest signal you've got.

seobotdk · 5 hours ago

The scratch-your-own-itch angle is the right one, and 312 users in 2 months says the wedge works. Where I'd push next: look at which specific tool categories your top 10 paying users actually build (influencer platforms, content trackers, analytics dashboards). Whichever is overrepresented, that's your wedge. Write one case study per category and put them on the homepage. Right now the marketing reads as "we serve everyone who needs this," which is invisible to anyone specific. Pick the 2 categories driving 80% of usage and own those first.

GregoryScottHenson · 5 hours ago

Waitlist/signals: muzi.studio/leadpipe.html?utm_source=indiehackers&utm_medium=comment&utm_campaign=show-ih

muzili88 · 7 hours ago

I feel this pain deeply. I built a lead hunting system that monitors 13 platforms with 98 cron jobs — and spent way more time fixing broken scrapers than building my actual product.

The worst part? Each platform changes their HTML structure randomly. Reddit, Fiverr, Upwork, X, Zhihu... they all break differently.

I ended up building an AI intent classifier that categorizes leads into 5 tiers, with a staging queue for human review before auto-responding. 93% of responses get auto-composed but I still manually review high-value leads.

Considering packaging this as a SaaS (working title: LeadPipe). Would anyone else find this useful?

muzili88 · 7 hours ago

Last Friday I lost an entire planned product day to exactly this kind of trap. On my own iOS side project — a tiny Captio-style memo app — a Mail API quirk only broke for users on certain carriers, and I spent six hours diffing SMTP headers when I was supposed to be writing onboarding copy. What pulled me out: I started keeping a "not my product" list. Anything that ended up on it for a second time got either deleted, outsourced, or boxed into a fixed Friday-afternoon maintenance hour. How are you handling Social Fetch support when a platform breaks scraping for everyone at once — do users mostly ping you direct, or have you built a status page that absorbs the first wave?

memolife23 · 7 hours ago

Classic yak shave that accidentally becomes the real product. The side tool you built for yourself usually has the most genuine PMF — you were the perfect customer from day one.

worvi26 · 8 hours ago

Mine: goffer.ai (Congress research + Gmail integration) -> Notion (organizing) -> Zapier (Slack routing). The goffer.ai webhook + Zapier combo is what makes it click - alerts route to the right channel without manual triage.

3vo · 10 hours ago

I did the same thing with investing. Got tired of how confusing everything was as a beginner so I just built the tool I wished existed.
312 users in 2 months is solid, congrats. How did you get your first 50?

mrguyinvests · 14 hours ago

I like the UI. The niche angle of social media is good too. This is generally a saturated market, and I can see your site has potential. I'm curious how you decided on credits as the pricing model. I like it as a customer, so that's a positive, but I'm just not sure it is the best way to sustain the business.

vbuser2004 · 18 hours ago

Three Saturdays in a row I lost the whole afternoon to a single Instagram parsing bug inside an optional "share-from-IG" flow on my tiny solo iOS memo app. That ratio — half my Saturdays burned on a feature roughly 4 users had ever touched — was what finally made me kill the integration entirely instead of fixing it again. Your line "they figured out it's not worth their time" is exactly the mental gear-shift I had to make, just from the buyer side. Two questions if you don't mind: did you intentionally cap at the 8 platforms for MVP, or did some get cut after building? And how are you handling the inevitable TikTok structural changes — manual selectors with on-call rotations, AI-generated parsers, or something else? Curious which one actually scales for a one-person team.

memolife23 · 19 hours ago

This makes a lot of sense.

Feels like one of those problems where people don’t actually want to “build scraping infrastructure” — they just want reliable data so they can focus on their real product.

Interesting to see TikTok and Instagram being the most used already. Curious how much maintenance is still required behind the scenes to keep everything stable.

indieworkflow · 21 hours ago

It's interesting that you've taken on the scraper maintenance as a problem to solve, rather than just a necessary evil, and are now considering offering it as a solution to others. This approach could potentially save other businesses a significant amount of time and resources, allowing them to focus on their core product. Can you elaborate on what specific pain points you've identified in the scraper maintenance process that you think your solution could address?

Propfirms · 21 hours ago

It sounds like you've reached a common pain point for many founders who handle data scraping themselves, where the maintenance and upkeep of the scrapers can become a significant time sink. By recognizing this as a problem worth solving, you may be able to create a more scalable and efficient solution for your business, potentially even turning it into a competitive advantage. Can you elaborate on what specific challenges you're facing with the scrapers and how you plan to address them in a more sustainable way?

Propfirms · 21 hours ago

It sounds like you've experienced the classic trap of being too close to the technical details and losing sight of the bigger picture, which is growing your business through marketing and product development. By recognizing this and looking for solutions, you're taking a big step forward - have you considered outsourcing or automating the scraping process to free up more time for high-leverage activities? What specific solutions are you exploring to address this issue and reclaim your focus on core business goals?

Propfirms · a day ago

It sounds like you've been dealing with the classic problem of technical debt in web scraping, where the time spent maintaining scrapers takes away from core business development. I'd be interested to know what specifically prompted you to try to solve this problem once and for all, rather than just continuing to iterate on your existing setup. Have you considered using cloud-based scraping services or outsourcing this aspect of your business to free up more time for marketing and product development?

Propfirms · a day ago

Nice project, curious how you got your first users?

Juancho · a day ago

I completely understand the struggle of dealing with broken scrapers, it's a never-ending battle. What I've found helpful in managing these issues is implementing a modular architecture for my scrapers, allowing me to easily swap out broken components and update dependencies as needed.

To save myself from getting bogged down in scraper maintenance, I've also developed a system to run automated health checks and alert me when something is amiss. This way, I can prioritize fixing issues before they become major problems. On a separate note, I've found success with automating my marketing outreach using a simple local bot system that runs campaigns on autopilot - if you're interested, just search for "botsyst" on Google and it should be the first result.

fgsyst · a day ago

I totally get where you're coming from. Maintenance can be a major time suck, especially when it feels like the platforms you're scraping are constantly changing the game. One approach that might help is to automate the maintenance process as much as possible, so that you can set it up once and let it run in the background. This can be especially helpful if you have a large number of scrapers to manage. By the way, I've found outsourcing marketing outreach to be a game-changer for my own business: I've set up a local system of 26 bots that handle Reddit, Twitter, and email campaigns on autopilot, with no monthly SaaS fees. It's been a huge time saver and has allowed me to focus on developing my product.

fgsyst · a day ago

A lot of indie founders are building great products…
with terrible distribution.

You don’t just need a product.
You need attention.

I help founders grow with content, engagement, and social media marketing on Fiverr.

Evergreenbookclub01 · a day ago

Nice idea. Sounds amazing.

Evergreenbookclub01 · a day ago

"Your idea sounds great! I’m also starting to build a website to list products for sale. I’m new to Indie Hackers and really eager to learn from the community. Thanks for sharing Social Fetch — it looks super useful for anyone working with social media data!"

Stylair_founder · a day ago

You productized the right pain. Anyone running anything in the social media space has spent two days a week patching scrapers when they should be shipping. From SocialPost.ai we feel this constantly. The real moat in this category is not the endpoint coverage, it is reliability when platforms ship breaking changes overnight. Two questions: how fast do you typically patch when TikTok or Instagram move, and do you publish a status page or changelog where developers can see it? Reliability under pressure is what makes infra businesses stick. 1,500 requests per day is a real signal. Keep going.

GregoryScottHenson · a day ago

Haven't dealt with this myself yet, but I'm sure the day
will come. How do you handle sudden platform changes?
That must happen a lot, right?

VictorFortuna · a day ago

This fantastic new world of AI delivers things that 3 or 4 years ago everyone would have considered impossible, and in an easy way. We just have to identify a real problem that exists in the market and that is not yet being fully explored by a tool that delivers a solution capable of solving it. After that, it's just a matter of continuing to pull the strings until the day that won't even be necessary. Good luck with what's to come...

JoaoPaulo · a day ago

The positioning is clean. "Scraping isn't your business, it's just a thing that needs to work" is exactly the right framing for B2B infrastructure.

The thing I'd think about: your biggest churn risk isn't competition, it's platform changes. If Instagram breaks for 48 hours, users blame you even if it's not your fault. How you communicate during outages probably matters more than most features right now.

Also curious, are those 312 users mostly free tier? The requests-per-user ratio would tell you a lot about who's actually building vs. just exploring.

Btw, building something on the same logic for Android testing, BetaSwap, credits-based system so devs stop chasing testers manually. It is free. betaswap app

BetaSwap · a day ago

This actually makes a lot of sense.

I think the strongest part is that you’re not really selling “web scraping,” you’re removing a problem people don’t want to spend their time thinking about anymore. Most founders building analytics or creator tools probably don’t wake up wanting to manage proxies and fix broken scrapers every week 😭

Also 121k requests in 2 months is honestly a pretty strong signal for something this infrastructure-heavy. The positioning feels clear too — social data is the input, not the product itself.

One thing I was curious about while reading this: have users cared more about reliability/stability or the number of supported platforms? Feels like most people would rather have fewer platforms that consistently work than a huge list that breaks often.

Cool build though. Definitely feels like one of those “painkiller not vitamin” products.

Stoner32bit · a day ago

Really solid idea. You took a problem you hit yourself and turned it into something a lot of people clearly struggle with. Would love it if you shared more about how the user growth happened.

sexiong306 · a day ago

the consistent response format across 8 platforms is the hardest part of this to get right and also the part that breaks most often. tiktok and instagram don't expose the same fields and when one platform changes their structure your normalized response either breaks or silently drops data. how are you versioning the API so existing integrations don't just stop working when that happens
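One way to avoid the "silently drops data" failure described here, as a sketch only: tag every normalized response with a schema version and list any field the platform stopped returning, instead of omitting it or defaulting it to zero. Everything below is hypothetical (the platform names, the source field names in `FIELD_MAPS`, and the `normalize` function) and says nothing about how Social Fetch actually does it.

```python
SCHEMA_VERSION = "1"  # hypothetical version tag returned with every response

# Hypothetical mapping: normalized field name -> platform-specific field name.
FIELD_MAPS = {
    "platform_a": {"likes": "like_total", "views": "play_total"},
    "platform_b": {"likes": "hearts", "views": "view_count"},
}


def normalize(platform: str, raw: dict) -> dict:
    """Map a raw platform payload onto the shared schema.

    A field the platform no longer returns becomes None and is listed in
    "missing_fields", so a consumer can tell "zero" from "unknown".
    """
    data = {}
    missing = []
    for name, source in FIELD_MAPS[platform].items():
        if source in raw:
            data[name] = raw[source]
        else:
            data[name] = None
            missing.append(name)
    return {
        "schema_version": SCHEMA_VERSION,
        "platform": platform,
        "data": data,
        "missing_fields": missing,
    }
```

With this shape, an integration pinned to one schema version keeps parsing the same keys after a platform change, and a non-empty `missing_fields` is the signal that something upstream moved.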

adin_builds · a day ago

Just hit 1.5k users on a niche tool that lets you preview SERPs from any device or location. Got 100 signups from a single Hacker News comment, which taught me that distribution beats perfection. If you’re building in public, share your failures too—people love the behind-the-scenes. My tool is SERPSpur, but the real win is the community feedback.

EvelynCarter · a day ago

Everyone here is hitting the hard-break case (selector dies, returns 0 rows, you get paged). The nastier one when you normalize 8 platforms behind a single schema is the silent semantic break: a platform quietly changes how it computes a metric, the scrape still returns a number, and your consistent response format hands users a value that's subtly wrong with no alert because nothing technically broke. I got burned by this on the consuming side once, building reporting for weeks on an "Active Customers" field that actually meant monthly actives, not paying users. For a product whose whole pitch is "stop thinking about social data", per-field provenance and a changelog of metric-definition changes feel like core features, not nice-to-haves. That drift is exactly the thing your users can't catch on their own, which is the reason they're paying you instead of scraping it themselves.
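To make "per-field provenance and a changelog of metric-definition changes" concrete, here is a rough sketch. All names are hypothetical, and the changelog entries are invented for illustration and do not describe any real platform. Each value carries the definition version it was collected under, and two readings count as comparable only when those versions match.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class MetricDefinition:
    version: int
    effective: date
    meaning: str


@dataclass(frozen=True)
class FieldValue:
    value: Optional[int]
    platform: str
    source_field: str        # field name as reported by the platform
    definition_version: int  # which changelog entry was in force


# Invented example entries; a real changelog would be maintained by hand.
CHANGELOG = {
    ("exampleplatform", "views"): [
        MetricDefinition(1, date(2025, 1, 1), "plays of 3 seconds or longer"),
        MetricDefinition(2, date(2025, 9, 1), "any play, including autoplay"),
    ],
}


def definition_on(platform: str, metric: str, day: date) -> MetricDefinition:
    """Return the definition that was in force on the given day."""
    in_force = [d for d in CHANGELOG[(platform, metric)] if d.effective <= day]
    if not in_force:
        raise LookupError(f"no definition for {platform}/{metric} on {day}")
    return max(in_force, key=lambda d: d.effective)


def comparable(a: FieldValue, b: FieldValue) -> bool:
    """Readings are safe to compare only under the same definition."""
    return (a.platform, a.source_field, a.definition_version) == (
        b.platform, b.source_field, b.definition_version)
```

A reporting tool could refuse to chart a trend line across readings where `comparable` is false, which is the kind of drift a consumer cannot detect from the numbers alone.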

jixter_apps · a day ago

this resonates - i run ~18 scrapers against third-party pages for my thing and the maintenance tax is real. few things that cut my fix-time a lot: hash the raw response but diff the PARSED list separately (most "breaks" are just cdn/timestamp noise, not real changes, so you stop chasing ghosts). alert loudly when a parser returns 0 rows - 0 is almost always a layout change, not an empty page. and keep a "last good" snapshot per source so you can tell "they changed their html" from "we broke." putting it behind one api like you did is the right call - the scraping was never the product. nice traction.
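The three habits in this comment (hash the raw response, diff the parsed list separately, treat 0 rows as a layout change, keep a last-good snapshot) fit in a few lines. This is a sketch under assumptions: `SourceState` and `check_source` are hypothetical names, and the parser that produces `parsed_rows` is left out.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SourceState:
    """Last known-good snapshot for one scraped source."""
    raw_hash: str = ""
    parsed: list = field(default_factory=list)


def check_source(state: SourceState, raw_html: str, parsed_rows: list) -> str:
    """Classify a fresh scrape against the last good snapshot.

    Returns "layout_break", "unchanged", "noise", or "changed".
    """
    raw_hash = hashlib.sha256(raw_html.encode("utf-8")).hexdigest()

    # Zero rows is treated as a probable layout change, so the last good
    # snapshot is deliberately left untouched for later comparison.
    if not parsed_rows:
        return "layout_break"

    if raw_hash == state.raw_hash:
        return "unchanged"

    # Raw bytes differ; the parsed output decides whether that matters.
    verdict = "noise" if parsed_rows == state.parsed else "changed"
    state.raw_hash = raw_hash
    state.parsed = list(parsed_rows)
    return verdict
```

Only "layout_break" would need to page someone; "noise" covers the CDN and timestamp churn mentioned above, where the raw bytes move but the parsed rows do not.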

boussettah · 2 days ago

‘The scraping isn't their business either.’ That single sentence defines the entire micro-SaaS boom. We build tools because we love solving specific user problems, not because we want to spend our weekends manually rotating residential proxy nodes.
Averaging under 3 seconds across 8 platforms at 2 months in is phenomenal execution. To get developers to migrate from their custom-built spaghetti code, focus on showing them a side-by-side comparison of the time saved on maintenance. When a builder realizes they can replace 500 lines of fragile scraper logic with a single endpoint, they'll switch instantly. Awesome ship!

Eva_NomadOS · 2 days ago

This is exactly the kind of SaaS that usually wins — solving a painful, recurring infrastructure problem that builders are tired of maintaining themselves.

The strongest part of this post is:
“The scraping isn't their business either.”

That’s the real value proposition.

Most teams can build scrapers. They just don’t want to spend engineering time fighting anti-bot systems, broken selectors, proxies, rate limits, and platform changes every week.

312 users and 121K requests in 2 months is strong validation for a developer-focused product. Also smart move keeping the API response format consistent across platforms — that alone removes a huge amount of friction.

A few things that could make this even stronger:

Webhooks for monitored account changes
Historical engagement tracking
Reliability/status metrics per platform
“Fallback freshness” indicators when scraping partially fails

Really solid founder-market fit here. You built the tool because you personally hated the problem first. That usually matters more than people realize.

https://teams.live.com/l/invite/FAAk3iOSJkDyS11JQE?v=g1

topstar · 2 days ago

Tried doing the equivalent myself and it broke for me badly. On my own indie iOS memo app — a tiny one-tap email exporter, my Captio replacement — the non-core time sink wasn't scraping, it was email deliverability. I rolled my own SMTP for 6 weeks because "how hard can it be." What actually broke: Gmail started silently bouncing my outbound because my SPF record was valid but my DKIM rotation script had an off-by-one bug. 3 days lost on mail headers when I should have been polishing the capture flow. Switched to a transactional provider — the time-back math was embarrassing. The framing "they figured out it's not worth their time" is the real product, not the API. Every founder dealing with non-core infra remembers the exact week they gave up trying to own it themselves.

memolife23 · 2 days ago

Interesting. I noticed that when I click on Pricing, it goes nowhere.
It just keeps spinning and then shows a "Something went wrong!" network error.

I think it'd be great to fix that.

backend_dev · 2 days ago

This is a strong wedge because you are not selling “scraping” as the product. You are selling reliability for teams where social data is just an input they need to stop thinking about.

The traction also makes the pain feel real: 312 users, 121K+ requests, and people using it because maintaining scrapers is not worth their time. That is a much stronger positioning angle than “one API for 8 platforms.”

The one thing I’d pressure-test early is the name/domain frame. Social Fetch is clear, but it also sounds like a lightweight utility. What you are describing is closer to backend data infrastructure for social analytics, creator tools, influencer platforms, and content intelligence products.

Davoq .com would fit that direction better if you want the product to feel like serious infrastructure, not just a fetcher script. The current product is already doing the hard operational work behind other companies’ tools, so the brand should probably carry more technical trust before bigger users evaluate it.

Especially with a .dev domain and a reliability-heavy API, the first impression matters. The product does not need a different direction, but it may need a stronger infrastructure shell around the same thing.

aryan_sinh · 2 days ago