Scraping activity not extracting all the data

Hello,

I am trying to extract all the reviews for a product from Amazon, along with the user name, star rating, date and location, color and specifications, review topic, the review text and helpfulness votes. I am using the Data Scraping activity and getting all the fields by selecting “extract correlated data”. Amazon reports a certain number of reviews, but my automation collects fewer than that. For example, for this product:

Amazon says that there should be 17878 reviews, but my automation only collects 4518. I have also set the upper bound for data scraping to 1 million.

The second problem is that towards the end of the run the data is not scraped properly: the first three columns and the last column are missing.
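
To pin down where the rows start to go wrong, a small check like the one below could help. It is only a sketch: the column names are assumptions based on the fields listed above (not the real headers in the attached workbook), and the rows are plain dictionaries rather than anything read from Excel.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Assumed column names (hypothetical; adjust to the real sheet headers).
COLUMNS = [
    "user_name", "star_rating", "date_location",
    "color_specs", "topic", "review", "helpful_votes",
]

Row = Dict[str, Optional[str]]


def missing_columns(row: Row) -> List[str]:
    """Return the names of columns that are empty or absent in one row."""
    return [c for c in COLUMNS if not (row.get(c) or "").strip()]


def first_incomplete_row(rows: Sequence[Row]) -> Optional[Tuple[int, List[str]]]:
    """Return (index, missing column names) for the first incomplete row."""
    for i, row in enumerate(rows):
        missing = missing_columns(row)
        if missing:
            return i, missing
    return None  # every row has all columns filled
```

Knowing the index of the first incomplete row makes it easier to open the matching review page and see whether its layout differs from the earlier pages.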

I have attached my resulting Excel file and code.

What my code does:

  1. Opens an Excel file which has the ASIN (product ID) and the link to each product
  2. Goes to the review page for each product
  3. Extracts the review data
  4. Writes to the Excel file, creating a sheet for each product
  5. Repeats the entire process for the next product
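
For readers who do not open the attached workflow, the steps above can be sketched roughly as follows. This is not the actual Main.xaml logic, only an outline in Python: `scrape_reviews` is a hypothetical stand-in for the Data Scraping activity, and a dictionary keyed by ASIN stands in for the per-product sheets.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Review = Dict[str, str]


def extract_all(
    products: Iterable[Tuple[str, str]],
    scrape_reviews: Callable[[str], List[Review]],
) -> Dict[str, List[Review]]:
    """Build one 'sheet' (list of review rows) per product, keyed by ASIN."""
    sheets: Dict[str, List[Review]] = {}
    for asin, link in products:        # step 1: rows from the input file
        rows = scrape_reviews(link)    # steps 2-3: open the review page, extract
        sheets[asin] = rows            # step 4: one sheet per product
    return sheets                      # step 5: the loop repeats per product


def compare_counts(
    sheets: Dict[str, List[Review]],
    expected: Dict[str, int],
) -> Dict[str, Tuple[int, Optional[int]]]:
    """Pair the collected row count with the count shown on the site."""
    return {asin: (len(rows), expected.get(asin)) for asin, rows in sheets.items()}
```

Comparing collected and expected counts per product, as `compare_counts` does, shows at a glance which products come up short.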

Thank you

Main.xaml (14.1 KB)

Review Extraction.xlsx (2.5 MB)


Hello @SSSS_MMMM ,

When I tried it out, it was getting all the reviews. I think you are confusing the number of ratings with the number of reviews.


In this case, 1838 is the number of reviews and the bot was able to get all of them.

About the missing data, I think it is getting all the reviews from the United States, but not those from other countries.


While indicating the fields in Data Scraping, try to include reviews from other countries as well and see how it goes.

Hope this helps.
Thanks!
Athira

Hello,

Thank you for your reply, but I ran the automation again and I am still not getting all the results, especially for the 17894 reviews for product B07TWFVDWT. I am looking at the reviews only.


Can you please share the resulting data file you are getting for all the products? Do you know why this is happening on my system?

Also, can you please elaborate on what you mean by "try to include other countries" in data scraping?

Thank you

Hello,

Here is the last review the bot captured:


After this, when you click next page, no reviews are present. This has something to do with the Amazon page itself.

And that is why the bot has captured reviews only up to that point.
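
The stopping behaviour can be illustrated with a minimal sketch. It assumes a hypothetical `fetch_page` function that returns the reviews found on a given page number; it is not how the Data Scraping activity is implemented internally.

```python
from typing import Callable, List, Tuple


def scrape_until_empty(
    fetch_page: Callable[[int], List[str]],
    max_pages: int = 1_000_000,
) -> Tuple[List[str], int]:
    """Collect reviews page by page; stop at the first page with no reviews.

    Returns the collected reviews and the number of the last non-empty page.
    """
    collected: List[str] = []
    last_page = 0
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:          # the site served an empty page: nothing more to read
            break
        collected.extend(batch)
        last_page = page
    return collected, last_page
```

A large upper bound only caps the loop. Once a page comes back empty, collection ends regardless of how many reviews the product header advertises.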
This is the link to that last review. You can try clicking next page manually:

Thanks!
Athira