Report
Restaurant Review Prediction Project Using Machine Learning
In this project we will work with a real data set provided by Yelp
The restaurant industry is tougher than ever, with online reviews appearing from the day a restaurant opens. Still, as a food lover, you and a friend decide to enter the industry and open your own restaurant. Since a restaurant's success is highly correlated with its reputation, you want to make sure it earns the best reviews on the most consulted restaurant rating and review site: Yelp.
While you know your food will be delicious, you believe other factors also influence a Yelp rating and will ultimately determine the success of your business. With a dataset of different restaurant characteristics and their Yelp star ratings, you decide to use a Multiple Linear Regression model to investigate which factors most affect a restaurant's rating and to predict the number of Yelp stars for your own restaurant.
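As a rough sketch of what such a model looks like, multiple linear regression predicts the star rating as a weighted sum of the restaurant's characteristics. The symbols below are placeholders chosen for illustration, not column names from the dataset:

```latex
\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n
```

Here \hat{y} is the predicted number of Yelp stars, each x_i is one restaurant characteristic, b_0 is the intercept, and each coefficient b_i estimates how much the predicted rating changes when x_i increases by one unit while the other characteristics stay fixed.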
This is a large dataset (approximately 200,000 entries), so the challenge lets you simulate a real project environment. We have provided six files, which are listed below with a brief description:
yelp_business.json: establishment data related to the location and attributes of all the companies in the dataset.
yelp_review.json: review metadata per company.
yelp_user.json: user profile metadata per company.
yelp_checkin.json: online check-in metadata per company.
yelp_tip.json: tips metadata per company.
yelp_photo.json: photo metadata per company.
Note: as you can see, the data is in .json format rather than .csv. Don't worry: importing and working with it is much the same, and we'll show you how. Let's get started!
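To make the note above concrete, here is a minimal sketch of reading a .json file using only the Python standard library. It assumes the file stores one JSON object per line, which this report does not state, so check your copy of the files first. The helper name load_json_lines, the sample file, and the fields business_id and stars are made up for illustration.

```python
import json
import os
import tempfile


def load_json_lines(path):
    """Read a file holding one JSON object per line into a list of dicts."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


# Tiny made-up stand-in for a file such as yelp_business.json
# (assumption: the real files may use different field names).
sample_rows = [
    {"business_id": "b1", "stars": 4.0},
    {"business_id": "b2", "stars": 3.5},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample_business.json")
    with open(sample_path, "w", encoding="utf-8") as handle:
        for row in sample_rows:
            handle.write(json.dumps(row) + "\n")
    businesses = load_json_lines(sample_path)

print(len(businesses), businesses[0]["stars"])
```

If pandas is installed, pandas.read_json(path, lines=True) reads the same one-object-per-line layout straight into a DataFrame, much as pandas.read_csv does for a .csv file.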