
I just launched a powerful and fun-to-use Web Scraping API! 🚀🌔

Hi Indie Hackers,

I’ve been working on page2api.com for the last 2 months, and today I’m making the first public announcement - It’s LIVE!

But wait, did I just build Yet-Another-Web-Scraping-API that nobody asked for?
That’s a really good question, my fellow Indie Hacker!

Well, the idea of building an API for scraping the web wasn’t new, even ten years ago.
Today, it’s a crowded, competitive market with plenty of APIs offering a lot of features.

So, how is page2api.com different?

Before designing this API, I scraped the web from time to time, and 10 times out of 10 I built my own scrapers to get the data I needed. Even knowing there were plenty of APIs available, I couldn’t see myself using any of them.

Why?
Because it was much easier to write my own scrapers from scratch every time than to use something ready-made.

The reason: the way those APIs are built.

What is their main problem?
They are not intuitive or easy to use.

So, I’ve made the following commitment:
I must build a Web Scraping API that satisfies the following requirements:

  • Intuitive and powerful API (easy and fun to use)
  • Asynchronous scraping (long-running scraping sessions)
  • Scheduled scraping (automatic scraping on schedule)
  • Javascript rendering (interact with js-rich pages)
  • Custom browser scenarios (execute custom JS, handle pagination, and more)
  • Fast and reliable proxies (never get blocked)
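To make “intuitive and powerful” concrete, here is a rough sketch of what a call to such an API could look like. The endpoint, parameter names, and selector syntax below are illustrative assumptions on my part, not the actual Page2API schema:

```python
# Hypothetical client sketch; the payload fields are assumptions,
# not the real Page2API request format.
import json

def build_scrape_payload(url, selectors, render_js=True):
    """Assemble a request body for a hypothetical /v1/scrape endpoint."""
    return {
        "url": url,                 # page to scrape
        "parse": selectors,         # name -> selector mapping
        "real_browser": render_js,  # render JavaScript-heavy pages
    }

payload = build_scrape_payload(
    "https://news.ycombinator.com",
    {"titles": "a.titlelink >> text"},
)
print(json.dumps(payload, indent=2))
```

The point is not these exact fields but the shape: one declarative payload instead of a hand-rolled scraper per site.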

Two months later, I’m very excited to announce that it’s LIVE!

Enjoy!

Posted to Product Launch on September 1, 2021
  1. 2

    Congratulations on the launch! Great landing page btw - clean and elegant.

  2. 2

    Looks really cool! I have a product where I use a lot of scraping. Going to try to find some time to try out your API. Will reach out if I have any questions.

    1. 1

      Thank you!
      I can't wait to see you using this API!

  3. 2

    Congrats! Looks very promising!

  4. 2

    How did you get the proxies? Did you buy them from a third party? Is this API good for scraping prices? Can it handle "load more button" pagination? Thanks for the answers.

    1. 2

      Same question about the proxies: what are you using?

      1. 1

        Hi @umen242!

        At the moment I'm using the proxies that smartproxy.com provides.

        One subscription for datacenter proxies, and one for residential (premium) proxies.

        1. 1

          Thank you very much for answering.
          Question: did you select this provider after doing market research?
          There are so many proxy providers... how do you choose the right one?

          1. 1
            1. https://proxyway.com - the most detailed reviews; they also have a YouTube channel.

            2. I know a person who has tried every known and unknown proxy provider that ever existed; he said the absolute champion is, of course, BrightData (formerly Luminati), but I can't afford them at the moment.

            3. I went with smartproxy.com because they are the most affordable among the best-known providers.

    2. 1

      Hi @Glitchero!

      Thank you for your interest in Page2Api!

      1. Proxies

      I got the proxies from a third-party provider.
      One subscription for datacenter proxies and one for Residential (Premium) proxies.
      Fun fact: the proxies account for half of all the expenses of keeping the project LIVE.

      2. Scraping prices

      It can easily scrape anything on the page: prices, links, attributes, text, raw HTML.
      More than that, you can set a scheduled scrape and receive the data automatically via a webhook (callback URL) directly in your application.

      3. Load more buttons (pagination)

      It can handle pagination and fetch all the data at once.

      It can fetch a specific number of pages or fetch all pages (max 20 at the moment) until a stop condition that you set is satisfied.

      You can try it yourself in the Live Demo by pasting the following payload:
      scrape 3 pages of hackernews posts
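      For illustration, a pagination payload of the kind described above might be shaped roughly like this (the field names are my own invention, not the real Page2API schema; check the docs or the Live Demo for the actual format):

      ```python
      # Hypothetical pagination payload; all field names are assumptions.
      payload = {
          "url": "https://news.ycombinator.com",
          "parse": {"titles": "a.titlelink >> text"},
          "pagination": {
              "next_page_selector": "a.morelink",  # the "More" link at the bottom
              "max_pages": 3,                      # hard limit (the API caps at 20)
          },
      }
      ```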

    1. 1

      Thank you!
      Please let me know if you plan to use the API and have any questions

  5. 2

    I played with your API in the live demo, and it is really delightful. Without even reading the docs, I was able to do some scraping for fun on this very page. :)

  6. 1

    @nyku my challenges with scraping in the past have been breakages when the underlying site changes its structure or HTML.

    Is there any kind of automated testing or alerting available when this happens?

    1. 1

      According to the Page2API docs, you can use the wait_for parameter, which waits until the needed element appears on the page.

      While the element is present, your requests will pass. Once it stops appearing on the page for any reason, the request will be considered failed: the success field will be set to false, and the error field will contain a message like 'Element 'a#article' was not found on the page.'

      Once you get such a response - you know that the page is broken or the structure has changed.
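      Based on the success/error fields described above, alerting on a broken page could look roughly like this (the response shape follows the fields named in this thread; treat it as a sketch, not the exact API schema):

      ```python
      # Minimal sketch of detecting structure changes: a failed scrape
      # carries success=False plus an error message you can alert on.
      def check_scrape_result(response: dict) -> None:
          """Raise when a scrape failed, e.g. after a site redesign."""
          if not response.get("success", False):
              # e.g. "Element 'a#article' was not found on the page."
              raise RuntimeError(f"Scrape failed: {response.get('error')}")

      # A healthy response passes silently:
      check_scrape_result({"success": True, "result": {"titles": ["..."]}})

      # A broken page structure surfaces as an exception:
      try:
          check_scrape_result({
              "success": False,
              "error": "Element 'a#article' was not found on the page.",
          })
      except RuntimeError as exc:
          print(exc)  # wire this into your alerting instead
      ```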

      If you are looking for a scraping API for your needs and Page2API looks like something you are interested in - feel free to contact me at any time.

  7. 1

    I can see that the landing page can easily convince me, as a developer, to give it a go the next time I want to build a one-time scraper. But how would you convince me or a product manager to rely on your service in a business-critical app?
    Thinking about SLA, customer service solving problems 24/7, etc.
    The bottom-up approach could work for growth (dev use it at home, try to convince the boss next time a scraper is needed), but the devs would need more help.
    Maybe it's out of scope for you now, I just assumed that you would want to make a lot of money out of this, and one "easy" way is to go upmarket. :)

    1. 2

      Good point!

      However, right now - I'm trying to follow YC’s Essential Startup Advice

      TLDR: build an API that will make 10-100 developers fall in love with it, then scale it and make it reliable, then start selling it to small companies, then big ones, and so on.

      It's a very long trip and I have a lot to learn in the process.

      1. 2

        Thanks for the resource!

  8. 1

    Congratulations! I'll try it out and share some feedback.

    1. 1

      Thank you!

      I hope you will find it useful.
      Feel free to contact me if you need any help!

  9. 1

    Exactly what we are looking for. Can you integrate it with captcha solving services?

    1. 1

      Yes, it may require some time to integrate, but technically it's possible!

  10. 1

    I've also looked into making something like this and your solution looks great. The main problem I have with web scraping for data is that it requires you to know the structure of the HTML/CSS and when that changes you have to update the scraper. What is your solution to this problem? Keep up the great work and the landing page looks sweet.

    1. 1

      Thank you!

      Oh, the 'changing HTML structure' issue is the classic problem in web scraping.
      It sounds like a challenging and interesting problem to solve, but in most cases it occurs rarely enough to live with. My only recommendation is to use XPath selectors whenever there is a chance that the classes/ids or other attributes used for styling may change.
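      As a sketch of that advice, compare a selector tied to a styling class with one anchored on stable structure. The HTML snippet is made up for illustration, and this uses the stdlib ElementTree's limited XPath support:

      ```python
      # Prefer selectors anchored on stable structure (ids, element
      # semantics) over auto-generated styling classes that may change.
      import xml.etree.ElementTree as ET

      html = """
      <html><body>
        <div id="product">
          <span class="css-x9a21">19.99</span>
        </div>
      </body></html>
      """

      root = ET.fromstring(html)

      # Fragile: ties the scraper to an auto-generated styling class.
      fragile = root.findall(".//span[@class='css-x9a21']")

      # Sturdier: anchor on the stable id and the element position.
      sturdy = root.findall(".//div[@id='product']/span")

      # Both find the price today, but only the second selector
      # survives a class-name change after a redesign.
      print(fragile[0].text, sturdy[0].text)
      ```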

  11. 1

    I tried for several months to grow a very similar product; I even had a Chrome extension for non-programmers.

    Good luck. It's a heavily saturated market; I quit after about 9 months.

    1. 1

      Hi Tony!

      I'm sorry you've experienced that.
      Last year, I also failed my first startup and had to deal with half a year of burnout.

      Indeed, it's a heavily saturated market.
      But for me, this means only one thing: opportunity.
      I'm very passionate about the problem I'm solving and about the customers that have this problem.

      Wish you good luck with your next project!
