I am trying to extract all the reviews for a product from Amazon, along with the user name, star rating, date and location, color and specifications, review topic, the review text itself, and helpfulness votes. I am using the data scraping activity and getting all the data by selecting “extract correlated data”. Amazon shows a certain number of reviews for a product, but my tool collects fewer than that. For example, for this product:
Amazon says that there should be 17878 reviews, but my automation only collects 4518. I have also set the upper bound for data scraping to 1 million.
The second problem is that towards the end of the run the data is not scraped properly from the website: the first three columns and the last column are missing.
I have attached my resulting Excel file and code.
What my code does:
Opens an Excel file that contains the ASIN (product ID) and the link for each product
Goes to the review page for each product
Extracts the review data
Writes the data to the Excel file, creating one sheet per product
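For anyone who wants to reproduce the same flow outside UiPath, the steps above can be sketched in Python roughly as follows. This is only an illustration under assumptions: `review_page_url`, `collect_reviews` and the `scrape` callback are hypothetical names that are not part of the attached workflow, the URL pattern is an assumption to verify against the site, and the actual extraction is left to whatever function is passed in.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Review = Dict[str, str]


def review_page_url(asin: str) -> str:
    # Assumed URL pattern for a product's review listing; check it against the site.
    return f"https://www.amazon.com/product-reviews/{asin}"


def collect_reviews(
    products: Iterable[Tuple[str, str]],
    scrape: Callable[[str], List[Review]],
) -> Dict[str, List[Review]]:
    """Return one list of reviews per product, keyed by ASIN (one 'sheet' each)."""
    sheets: Dict[str, List[Review]] = {}
    for asin, _product_link in products:
        # scrape() is a placeholder for the real extraction step.
        sheets[asin] = scrape(review_page_url(asin))
    return sheets
```

Keeping the result keyed by ASIN mirrors the one-sheet-per-product layout, so the number of rows per product can be compared directly with the count shown on the site.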
When I tried it out, it got all the reviews. I think you are confusing the number of ratings with the number of reviews.
In this case, 1838 is the number of reviews, and the bot was able to get all of them.
About the missing data, I think it is getting all the reviews from the United States, but not the ones from other countries.
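To narrow down the missing data, one option is to list which rows come back with blank fields and see whether they cluster at the end of the sheet. A minimal Python sketch, assuming the scraped sheet has already been loaded as a list of dictionaries; the column names are made up to match the fields mentioned in the question and are not taken from the attached workbook, and `incomplete_rows` is a hypothetical helper:

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical column names, one per field listed in the question.
COLUMNS = [
    "user_name",
    "star_rating",
    "date_and_location",
    "color_and_specs",
    "review_topic",
    "review_text",
    "helpful_votes",
]


def incomplete_rows(
    rows: List[Dict[str, Optional[str]]],
) -> List[Tuple[int, List[str]]]:
    """Return (row index, missing column names) for every row with blank fields."""
    report = []
    for index, row in enumerate(rows):
        # A field counts as missing if it is absent, None, or only whitespace.
        missing = [c for c in COLUMNS if not (row.get(c) or "").strip()]
        if missing:
            report.append((index, missing))
    return report
```

If the flagged rows all lack the same columns, that points to a different page layout for those reviews rather than a random scraping failure.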
Thank you for your reply, but I ran the automation again and I am still not getting all the results, especially for the 17894 reviews for product B07TWFVDWT. I am looking at the reviews only.