
How Would You Move Forward?

tldr: Validated idea early but having issues scraping airlines

For the last few months I have been hacking away on PointsParty.io. It's a simple idea: scrape airlines for award flight data and alert users to great deals.

After building scrapers for 4 airlines, I put together a prototype in spreadsheets and launched on Reddit to validate the idea, and boom! Huge response. It was the 2nd most upvoted post ever in a 500,000-user subreddit, with 2,000+ comments.

Since then, I have had nothing but hardship trying to scrape all other airlines. Everything I try gets blocked.

  • Reproducing the specific API requests
  • Using ISP proxies
  • Using Puppeteer
  • Deploying Puppeteer on Lambda
  • etc.

I know it is possible to scrape these airlines because there are other tools that do so.

What would you do if you were in my shoes?

Would you give up after a few weeks without progress, and with it the glimmer of PMF I already saw?

Are there books on reverse engineering I should read?

Anything helps! Thanks!

on February 25, 2024
  1. 1
    • I'm pretty sure there is some official system that all flights are written into and that major systems pay to access (IDK if it includes rewards).
    • There might be a 3rd-party paid API, either official or from someone who has already done the scraping for you; there are API marketplace sites for that sort of thing.
    • There are services and tools that help build scrapers. Some specifically provide unblocking via many proxies, and their support would probably help you figure issues out.
    • Generally it's about looking like a real user. The first and biggest issue for scrapers is request speed: if you think it's rate limiting, a good safe starting point might be 1 request per minute, or even 20 requests an hour. Second would be the signatures that a full browser doing real browsing would put on a request; user agent is a very common one. Other things to check: IP blocks on ranges (some sites block most server farms), weird pagination patterns that could give you away, and getting served a captcha and then skipping or ignoring it.
    • You could try to scrape an aggregator instead of the airlines directly, if it looks easier and the data is available.
    • You can pay someone else to build that part.
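
To make the request-speed point concrete, here is a minimal sketch of a throttle that keeps a fixed gap between requests. Everything in it is an assumption for illustration: the `Throttle` class, its parameter names, and the 60-second default (taken from the "1 request per minute" starting point above) are hypothetical, and it only handles pacing, not any of the other checks mentioned.

```python
import random
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum, slightly randomized gap between consecutive requests."""

    def __init__(
        self,
        min_interval_s: float = 60.0,  # assumption: roughly 1 request per minute
        jitter_s: float = 15.0,        # assumption: up to 15 s of extra random delay
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval_s = min_interval_s
        self.jitter_s = jitter_s
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous request, if any

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        slept = 0.0
        if self._last is not None:
            # Earliest allowed time: previous request + fixed gap + random jitter.
            target = self._last + self.min_interval_s + random.uniform(0.0, self.jitter_s)
            slept = max(0.0, target - now)
            if slept > 0:
                self._sleep(slept)
        # Record when this request is released so the next call measures from here.
        self._last = self._clock()
        return slept
```

You would call `throttle.wait()` right before each request. The clock and sleep functions are injectable only so the pacing logic can be checked without actually waiting; in normal use the defaults are fine.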