# ASK IH: Legality of Amazon scraping.

I know that there are a lot of services that:

Provide proxies for scraping
Track product ratings, prices, and reviews
Most of them are paid services.

It is completely unclear to me:
What is the legality of scraping publicly available data from sites like Amazon and selling it as a service?
Could these companies get in trouble for doing it?

Dmitry Dryomov

on February 14, 2019

IANAL but I have been involved in a suit in this area.

One distinction to make: there's legality in terms of written law, and then there's contract law, meaning whether you're bound by a site's T&Cs and User Agreement.

It makes a big (legal) difference whether your scraper requires credentials to access any of the scraped content. (Having credentials can legally bind you to the T&Cs and UA.) If you're scraping public-facing stuff, then you're on much firmer ground.

It also depends on what you do with the scraped data. Obviously, if you republish it directly, there are copyright issues. If you're performing (for example) semantic analysis on scraped text corpora, or crunching public-facing data and then providing your own analysis (i.e., not directly republishing), then I think you're probably on pretty solid ground.

IMHO it's a giant gray area though. As other users here said, Google scrapes everyone. So do thousands of SEO and website analysis tools (my current project included).

Generally speaking, if you want to be on the lighter side of gray -- you should do your best to obey robots.txt files.
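For anyone wondering what "obeying robots.txt" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example-bot` user agent, and the `example.com` URLs are all made up for illustration; a real crawler would load the site's actual file (for example with `set_url()` and `read()`) instead of an inlined string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(SAMPLE_ROBOTS_TXT)

# Hypothetical user agent and URLs, used only to exercise the rules above.
allowed = is_allowed(parser, "example-bot", "https://example.com/products/123")
blocked = is_allowed(parser, "example-bot", "https://example.com/private/report")
```

The idea is to call something like `is_allowed` before every request and skip the URL when it returns `False`. This only covers the robots.txt part; it says nothing about the T&C or copyright questions above.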

In the case of Amazon -- IMHO the thing you need to worry about isn't so much "getting in trouble" as it is Amazon's army of lawyers just burying you in claims. Whether or not you're legally in the right may not really matter. They might be able to make you stop just by being big and mean. Of course, clever header spoofing, user agent switching and IP rotation might keep that day from ever coming ;)

Good luck. Your project sounds cool.

Bangkokian · 7 years ago

It can be a thin line.

What if you were to get employees to just check Amazon's website all day long?

What if you were to 'emulate' that behavior?

All Google does is scrape other people's websites.

Good read on the subject: https://resources.distilnetworks.com/all-blog-posts/is-web-scraping-illegal-depends-on-what-the-meaning-of-the-word-is-is

In short, if it's not 'paid content', it's fairly hard to make a case against you unless your scraping practices cause a financial loss, a disruption of service, etc.

indiehacker433 · 7 years ago
  
  Super helpful article!
  
  dremovd · 7 years ago

I presume these companies say that they just provide the stack for this kind of scraping, and declare that it's the user's responsibility to comply with the terms of the website (here, Amazon).

Did you have a look at their terms of use?

Frenchcooc · 7 years ago
  
  Of course, their TOS tries to make it the client's responsibility. However, Amazon would probably go after the service provider anyway.
  
  Let's take datahawk.com specifically. They are tracking Amazon products day by day. It's probably the worst thing you can do :)
  
  dremovd · 7 years ago