
Why we're launching our own Web Scraper API

When I started working on Reviewshake back in 2017, I had no idea of the data challenges I'd be facing. It turns out that in order to run a review management platform, you need online review data.

The next realization I had at the time was that most review sites don't provide an API for the reviews on their website, meaning that I'd have to resort to web scraping. I launched with support for a handful of review sites, which quickly grew past 50, and the difficulties in maintaining them were very real. Around the same time, it occurred to me that others might also want to use this technology, so I launched the Review Scraper API on Product Hunt in 2018.

Since then, I've provided online review data to thousands of customers, from companies including Samsung, Deloitte and PwC through to academics at Harvard, MIT and Princeton. I grew the company to a small team, and we've launched more APIs (the Local NAP API and the Review Index API) and rebranded all of them under the Datashake brand.

Throughout this growth, we've relied on a variety of data partners to help extract our data, which worked great in the early stages. Last year we started to outgrow those solutions in terms of quality, cost and support, and that is what led us to today's Web Scraper API release.

We simply needed the ability to scrape data with higher quality, faster speeds and better support than what we could find out there. We set about owning the next part of our supply chain, and ventured out beyond just online review data. In typical fashion, we built the technology we needed internally, and are now making this available to others.

If there is interest, I'll be sharing more details about how we built this Web Scraper API in future posts. In the meantime, we have a couple of offers for you to give us a try:

🚀 For users new to web scraping: 30,000 free credits to get you started (just respond to this thread with the first part of the email you signed up with)

🔥 For those who use our competitors: 50% off your latest invoice for the same package (limited time offer, just send the invoice to our support team)

Posted to Product Launch on March 23, 2021
  1. 2

    Hey, I checked your service with my SaaS: https://www.poirot.app and it works perfectly. Thanks for letting us test it with 30,000 requests. My email starts with: mladen

  2. 2

    I was checking out a few web scraping APIs these last few days for my website. Yours seems to be working like a charm! If I could give you my opinion on your pricing plans: I went with browserless.io because they have a usage-based plan where they charge a set number of credits for each request depending on the configuration (JS rendering costs more, for example). In your case, if I wanted to scrape with JS rendering I would be obliged to subscribe to the $249/month plan even though I might need only 30k calls. You could structure it as a pre-pay plan where the user tops up their account. Just a suggestion!

  3. 2

    This is awesome dude. My project ReviewBolt.com (ironically very similar name) uses a ton of web scraping APIs to create reviews of websites.

    Do you guys share your MRR? Looks like a great company. And how often do you rotate IPs?

    1. 1

      Super cool site, lots of useful information! Love that you have a public roadmap. Thanks for the support, it means a lot.

      I'm afraid we don't share our MRR. The IP rotation is very specific to each site, so it's hard to give you a concrete answer :)

  4. 2

    The Web Scraper API is an awesome tool to do crawling projects at scale. I have worked on a social media crawler in the past and I would have loved to have a web crawler available as an API to help get past some of the biggest hurdles - proxy servers, captchas and others. All of these aspects have gotten more and more complex in recent years, so I'm glad that the Web Scraper API is on top of them.

    1. 1

      So glad to hear you like it, it's been a joy working with you on this :)

  5. 2

    This looks great, I'll definitely check it out. Just wondering, do you have any tips or best practices on how to do web scraping at your level of volume?

    1. 2

      For sure, too many to share in a simple comment - I'll try to share more in an extensive blog post soon. These are some factors that come to mind:

      1. Metrics: you need to monitor everything, because this is the only way you'll know what's going on and where the bottlenecks are. Scraping at scale is all about finding efficiencies, otherwise your costs will soar.
      2. Hedge your bets: In the last few years we've seen players coming and going, or providers that simply go offline without notice. We're here to stay and support is a top priority.
      3. Build or buy: this is a classic in tech - is your scraping operation a core competency, or is it something that you should outsource?
  6. 1

    I really want to give it a try. My email starts with: akshaycj999

  7. 1

    Would love to try it out. My email starts with: donkeythepooh

  8. 1

    I've always been interested in web scraping. But tell me, what are the legalities of this?

  9. 1

    Hi! I'm new and would love to try this out. The first part of my email is anand and the domain starts with L. Thanks.

  10. 1

    I was looking for a scraping service that is easy to use and yours checks that box!

    It would be great if you could offer me the 30,000 free credits. Here's the first part of the email I've signed up with: tonixx

    Thanks a lot!!!

    1. 1

      So glad to hear it, thanks for checking us out! The credits are added ;)

  11. 1

    It doesn’t support crawling/link following capabilities? Do you have to know every url you want to scrape?

    1. 1

      Yes you'd need to know every URL, and if you find a new URL you can just submit it to the API to follow it. You can check out our API docs for more info!
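
For anyone who wants link following on top of a one-URL-per-request API, the loop described above (fetch a page, find new URLs, submit them) can be done on the client side. This is only a minimal sketch under assumptions: `fetch_html` is a hypothetical callable that takes a URL and returns the page HTML, standing in for whatever scraping API call you use; it is not taken from the Web Scraper API docs. The crawl stays on the starting host and stops after `max_pages` pages.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch_html, max_pages=100):
    """Breadth-first link following, one URL per fetch.

    fetch_html is a hypothetical callable (url -> HTML string) that would
    wrap your scraping API request. It is an assumption for illustration,
    not part of any real SDK.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch_html(url)
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute, _fragment = urldefrag(urljoin(url, href))
            # Only follow unseen links on the starting host.
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

Passing the fetch function in as an argument keeps the crawl logic independent of any particular provider, so it can be tried against a plain dictionary of fake pages before spending real credits.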
