ANALYSIS AND RESULTS

Initial Data Preparation

For this project, we used data from the Database and Information Systems Laboratory of the University of Illinois, which is openly available at http://times.cs.uiuc.edu/~wang296/Data/. The data was downloaded in JSON format and converted to comma-separated values (CSV) format. It consists of online reviews and ratings extracted from the TripAdvisor website from 2009 to 2018 for 15 hotels.

Before importing the data into R, we created two new columns, Year and Month, that capture the recency of each review (2018 being the most recent). In Excel, as shown below, we filtered the data to keep only reviews from 2017 and 2018, so that the analysis is based on recent data only; this reduced the number of rows from 42,431 to 14,716. We then imported the data into R with the aim of preparing a clean dataset for further analysis.
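The preparation steps described above (JSON to CSV, adding Year and Month, keeping only 2017 and 2018) were carried out in Excel before the import into R. As a rough illustration only, the same steps could be sketched in Python as follows. The field names (`Hotel`, `Date`, `Rating`), the date format, and the sample records are assumptions made for this sketch; they are not taken from the dataset.

```python
import csv
import io
import json
from datetime import datetime

# Assumed field name and date format; the actual schema may differ.
DATE_FIELD = "Date"
DATE_FORMAT = "%B %d, %Y"  # e.g. "March 5, 2018"


def add_year_month(reviews):
    """Return copies of the review records with Year and Month columns added."""
    enriched = []
    for review in reviews:
        parsed = datetime.strptime(review[DATE_FIELD], DATE_FORMAT)
        enriched.append({**review, "Year": parsed.year, "Month": parsed.month})
    return enriched


def keep_recent(reviews, years=(2017, 2018)):
    """Keep only the reviews whose Year is one of the given years."""
    return [r for r in reviews if r["Year"] in years]


def to_csv(reviews):
    """Serialize records to CSV text, using the first record's keys as header."""
    if not reviews:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(reviews[0].keys()))
    writer.writeheader()
    writer.writerows(reviews)
    return buffer.getvalue()


# Made-up sample input, for illustration only (not real review data).
sample_json = """
[
  {"Hotel": "A", "Date": "March 5, 2018", "Rating": 4},
  {"Hotel": "A", "Date": "July 19, 2012", "Rating": 2},
  {"Hotel": "B", "Date": "November 30, 2017", "Rating": 5}
]
"""

reviews = json.loads(sample_json)
recent = keep_recent(add_year_month(reviews))
print(to_csv(recent))
```

Deriving Year and Month before filtering keeps the filter a simple membership check on one column, which mirrors the column-then-filter order used in the spreadsheet.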