
I just launched a powerful and fun-to-use Web Scraping API! 🚀🌔

Hi Indie Hackers,

I’ve been working on page2api.com for the last 2 months, and today I’m making the first public announcement - It’s LIVE!

But wait, did I just build Yet-Another-Web-Scraping-API that nobody asked for?
That’s a really good question, my fellow Indie Hacker!

Well, the idea of building an API for scraping the web wasn’t new, even ten years ago.
Today, it’s a crowded, competitive market with plenty of APIs offering a lot of features.

So, how is page2api.com different?

Before designing this API, I scraped the web from time to time, and 10 times out of 10 I built my own scrapers to get the data I needed. Even knowing there were plenty of APIs available, I couldn’t see myself using any of them.

Why?
Because it was much easier to write my own scrapers from scratch every time than to use something ready-made.

The reason: the way those APIs are built.

What is their main problem?
They are not intuitive or easy to use.

So, I’ve made the following commitment:
I must build a Web Scraping API that satisfies the following requirements:

  • Intuitive and powerful API (easy and fun to use)
  • Asynchronous scraping (long-running scraping sessions)
  • Scheduled scraping (automatic scraping on schedule)
  • Javascript rendering (interact with js-rich pages)
  • Custom browser scenarios (execute custom JS, handle pagination, and more)
  • Fast and reliable proxies (never get blocked)
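To make “intuitive and powerful” concrete, here is a rough sketch of what a call to such an API could look like. The endpoint, parameter names, and selector syntax below are illustrative assumptions on my part, not the actual Page2API schema:

```python
# Hypothetical client sketch; the payload fields are assumptions,
# not the real Page2API request format.
import json

def build_scrape_payload(url, selectors, render_js=True):
    """Assemble a request body for a hypothetical /v1/scrape endpoint."""
    return {
        "url": url,                 # page to scrape
        "parse": selectors,         # name -> selector mapping
        "real_browser": render_js,  # render JavaScript-heavy pages
    }

payload = build_scrape_payload(
    "https://news.ycombinator.com",
    {"titles": "a.titlelink >> text"},
)
print(json.dumps(payload, indent=2))
```

The point is not these exact fields but the shape: one declarative payload instead of a hand-rolled scraper per site.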

Two months later, I’m very excited to announce that it’s LIVE!

Enjoy!

Posted to Product Launch on September 1, 2021
  1. 2

    Congratulations on the launch! Great landing page btw - clean and elegant.

  2. 2

    Looks really cool! I have a product where I use a lot of scraping. Going to try to find some time to try out your API. Will reach out if I have any questions.

    1. 1

      Thank you!
      I can't wait to see you using this API!

  3. 2

    Congrats! Looks very promising!

  4. 2

    How did you get the proxies? Did you buy them from a third party? Is this API good for scraping prices? Can it handle "load more button" pagination? Thanks for the answers.

    1. 2

      Same question about the proxies: what are you using?

      1. 1

        Hi @umen242!

        At the moment I'm using the proxies that smartproxy.com provides.

        One subscription for datacenter proxies, and one for residential (premium) proxies.

        1. 1

          Thank you very much for answering.
          Question: did you select this provider after doing market research?
          There are so many proxy providers... how do you choose the right one?

          1. 1
            1. https://proxyway.com - the most detailed reviews; they also have a YouTube channel.

            2. I know a person who has tried every known and unknown proxy provider that ever existed; he said the absolute champion is, of course, BrightData (formerly Luminati), but I can't afford them at the moment.

            3. I went with smartproxy.com because they are the most affordable among the best-known providers.

    2. 1

      Hi @Glitchero!

      Thank you for your interest in Page2Api!

      1. Proxies

      I got the proxies from a third-party provider.
      One subscription for datacenter proxies and one for Residential (Premium) proxies.
      Fun fact: the proxies account for half of all the expenses of keeping the project LIVE.

      2. Scraping prices

      It can easily scrape anything on the page: prices, links, attributes, text, raw HTML.
      More than that, you can set a scheduled scrape and receive the data automatically via a webhook (callback URL) directly in your application.

      3. Load more buttons (pagination)

      It can handle pagination and fetch all the data at once.

      It can fetch a specific number of pages or fetch all pages (max 20 at the moment) until a stop condition that you set is satisfied.

      You can try it yourself in the Live Demo by pasting the following payload:
      scrape 3 pages of hackernews posts
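      For illustration, a pagination payload of the kind described above might be shaped roughly like this (the field names are my own invention, not the real Page2API schema; check the docs or the Live Demo for the actual format):

      ```python
      # Hypothetical pagination payload; all field names are assumptions.
      payload = {
          "url": "https://news.ycombinator.com",
          "parse": {"titles": "a.titlelink >> text"},
          "pagination": {
              "next_page_selector": "a.morelink",  # the "More" link at the bottom
              "max_pages": 3,                      # hard limit (the API caps at 20)
          },
      }
      ```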

    1. 1

      Thank you!
      Please let me know if you plan to use the API and have any questions

  5. 2

    I played with your API in the live demo, and it is really delightful. Without even reading the docs, I was able to do some scraping for fun on this very page. :)

  6. 1

    @nyku my challenges with scraping in the past have been breakages when the underlying site changes its structure or HTML.

    Is there any kind of automated testing or alerting available when this happens?

    1. 1

      According to the Page2API docs, you can use the wait_for parameter, which waits until the needed element appears on the page.

      While the element is present, your requests will pass. Once it stops appearing on the page for any reason, the request will be considered failed: the success field will be set to false, and the error field will contain a message like 'Element 'a#article' was not found on the page.'

      Once you get such a response - you know that the page is broken or the structure has changed.
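      Based on the success/error fields described above, alerting on a broken page could look roughly like this (the response shape follows the fields named in this thread; treat it as a sketch, not the exact API schema):

      ```python
      # Minimal sketch of detecting structure changes: a failed scrape
      # carries success=False plus an error message you can alert on.
      def check_scrape_result(response: dict) -> None:
          """Raise when a scrape failed, e.g. after a site redesign."""
          if not response.get("success", False):
              # e.g. "Element 'a#article' was not found on the page."
              raise RuntimeError(f"Scrape failed: {response.get('error')}")

      # A healthy response passes silently:
      check_scrape_result({"success": True, "result": {"titles": ["..."]}})

      # A broken page structure surfaces as an exception:
      try:
          check_scrape_result({
              "success": False,
              "error": "Element 'a#article' was not found on the page.",
          })
      except RuntimeError as exc:
          print(exc)  # wire this into your alerting instead
      ```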

      If you are looking for a scraping API for your needs and Page2API looks like something you are interested in - feel free to contact me at any time.

  7. 1

    I can see that the landing page can easily convince me, as a developer, to give it a go the next time I want to build a one-time scraper. But how would you convince me or a product manager to rely on your service in a business-critical app?
    Thinking about SLA, customer service solving problems 24/7, etc.
    The bottom-up approach could work for growth (dev use it at home, try to convince the boss next time a scraper is needed), but the devs would need more help.
    Maybe it's out of scope for you now, I just assumed that you would want to make a lot of money out of this, and one "easy" way is to go upmarket. :)

    1. 2

      Good point!

      However, right now - I'm trying to follow YC’s Essential Startup Advice

      TLDR: build an API that will make 10-100 developers fall in love with it, then scale it and make it reliable, then start selling it to small companies, then big ones, and so on.

      It's a very long trip and I have a lot to learn in the process.

      1. 2

        Thanks for the resource!

  8. 1

    Congratulations! I'll try it out and share some feedback.

    1. 1

      Thank you!

      I hope you will find it useful.
      Feel free to contact me if you need any help!

  9. 1

    Exactly what we are looking for. Can you integrate it with captcha solving services?

    1. 1

      Yes, it may require some time to integrate, but technically it's possible!

  10. 1

    I've also looked into making something like this and your solution looks great. The main problem I have with web scraping for data is that it requires you to know the structure of the HTML/CSS and when that changes you have to update the scraper. What is your solution to this problem? Keep up the great work and the landing page looks sweet.

    1. 1

      Thank you!

      Oh, the 'changing HTML structure' issue is the classic problem in web scraping.
      It sounds like a challenging and interesting problem to solve, but in most cases it occurs rarely enough to live with. My only recommendation is to use XPath selectors whenever there is a chance that the classes/ids or other attributes used for styling may change.
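      As a sketch of that advice, compare a selector tied to a styling class with one anchored on stable structure. The HTML snippet is made up for illustration, and this uses the stdlib ElementTree's limited XPath support:

      ```python
      # Prefer selectors anchored on stable structure (ids, element
      # semantics) over auto-generated styling classes that may change.
      import xml.etree.ElementTree as ET

      html = """
      <html><body>
        <div id="product">
          <span class="css-x9a21">19.99</span>
        </div>
      </body></html>
      """

      root = ET.fromstring(html)

      # Fragile: ties the scraper to an auto-generated styling class.
      fragile = root.findall(".//span[@class='css-x9a21']")

      # Sturdier: anchor on the stable id and the element position.
      sturdy = root.findall(".//div[@id='product']/span")

      # Both find the price today, but only the second selector
      # survives a class-name change after a redesign.
      print(fragile[0].text, sturdy[0].text)
      ```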

  11. 1

    I tried for several months to grow a very similar product; I even had a Chrome extension for non-programmers.

    Good luck. It's a heavily saturated market; I quit after about 9 months.

    1. 1

      Hi Tony!

      I'm sorry you've experienced that.
      Last year, I also failed my first startup and had to deal with half a year of burnout.

      Indeed, it's a heavily saturated market.
      But for me, this means only one thing: opportunity.
      I'm very passionate about the problem I'm solving and about the customers that have this problem.

      Wish you good luck with your next project!
