Extracting data from 2 different pages for

Hello,

I need to extract the votes and the date each question was asked from Amazon, for all the questions listed here: Amazon.com: Customer Questions & Answers
The data columns I want are: **question, votes, date the question was asked**. Please note that this data spans multiple pages.

The date and the votes for each question are on 2 different web pages. The votes and the question can be extracted as is, but we need to click the question link to get the date.
How should I do this? I tried using web scraping and filling each row of the Excel file first with the question and then with the votes, but the scraper is giving issues and moving down to the next question is hard. If I use data scraping, my table only stores the votes and the question, not the date. If I later go and get the date, the rows are mismatched. Help will be greatly appreciated, thanks!
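
One way to avoid mismatched rows is to fetch the date inside the same loop that walks the scraped question rows, so each record is completed before moving to the next one. Below is a minimal sketch of that idea in Python rather than UiPath activities. The function `merge_question_rows`, the `fetch_date` callable, the `link` field, and the sample values are all hypothetical stand-ins, not anything taken from Amazon or from the workflow in this thread.

```python
from typing import Callable, Dict, List, Optional


def merge_question_rows(
    list_rows: List[Dict[str, str]],
    fetch_date: Callable[[str], Optional[str]],
) -> List[Dict[str, Optional[str]]]:
    """Build one complete record per question, in the original order.

    list_rows: rows scraped from the listing page (question, votes, link).
    fetch_date: hypothetical callable that opens the question link and
                returns the date text, or raises if it cannot be read.
    """
    merged = []
    for row in list_rows:
        try:
            date = fetch_date(row["link"])
        except Exception:
            # Keep the row even when the date is unavailable, so later
            # rows never shift up and become mismatched.
            date = None
        merged.append(
            {"question": row["question"], "votes": row["votes"], "date": date}
        )
    return merged


# Usage with made-up data; a real fetch_date would visit the detail page.
sample_rows = [
    {"question": "Q1", "votes": "5", "link": "/q/1"},
    {"question": "Q2", "votes": "0", "link": "/q/2"},
]
fake_dates = {"/q/1": "2020-01-01"}  # "/q/2" is missing on purpose
result = merge_question_rows(sample_rows, lambda link: fake_dates[link])
for record in result:
    print(record)
```

Because the date is attached to the row it belongs to at the moment it is read, a failed lookup leaves an empty date instead of shifting every later row.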

Hey @SSSS_MMMM

Did you try Data Scraping?

Thanks
#nK

@Nithinkrishna Yes

So @SSSS_MMMM, what challenge are you facing when you use the Data Scraping technique?

As far as I can see, you tried screen scraping, which is different.

Thanks
#nK

@Nithinkrishna Overall, what I want is: question, question date, question votes, all answers to the question, answer helpfulness votes, answer date, answer username.

The issue is: when a question has no answers, the automation just freezes. It was stuck on the same question for over 1 hour because that question had no answer.

Here is what my automation does:

  1. It opens an Excel file with the product link and ASIN (product ID).
  2. From the ASIN, it goes to the main page for that product, where it can find all the questions and answers.
  3. It extracts all the question links, votes, and questions.
  4. It goes to each question link and uses data scraping to get the answers, answer date, answer username, and answer helpfulness.
  5. It uses screen scraping to get the question date.
  6. It adds a column with the question, votes, and question date to the above data (multiple answers per question).
  7. It sets all the data tables to null.
  8. It repeats the process all over again.

Here is my code:
Main.xaml (29.7 KB)

Here is the input excel file:
Ques Ans With Votes Final 24.xlsx (8.5 KB)
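
The freeze on unanswered questions suggests the answer-scraping step keeps waiting for content that never appears. A common way around this is to treat "no answers" as a normal case: check the scraped answer list first and still write one row for the question. The Python sketch below illustrates only that row-building logic under stated assumptions. The function `flatten_question` and the field names are hypothetical and are not taken from Main.xaml.

```python
from typing import Dict, List, Optional

# Hypothetical answer columns, mirroring the fields listed in the post.
ANSWER_FIELDS = ("answer", "answer_votes", "answer_date", "answer_user")


def flatten_question(
    question: Dict[str, Optional[str]],
    answers: List[Dict[str, Optional[str]]],
) -> List[Dict[str, Optional[str]]]:
    """Return one output row per answer, repeating the question fields.

    A question with no answers still yields exactly one row, with the
    answer columns left empty, instead of blocking the run.
    """
    base = {
        "question": question.get("question"),
        "votes": question.get("votes"),
        "question_date": question.get("question_date"),
    }
    if not answers:
        # No answers scraped: emit a single row with empty answer columns.
        return [{**base, **{field: None for field in ANSWER_FIELDS}}]
    return [
        {**base, **{field: ans.get(field) for field in ANSWER_FIELDS}}
        for ans in answers
    ]


# Usage with made-up data.
q = {"question": "Q1", "votes": "3", "question_date": "2020-01-01"}
rows_without_answers = flatten_question(q, [])
rows_with_answers = flatten_question(
    q,
    [
        {"answer": "A1", "answer_votes": "2", "answer_date": "2020-01-02",
         "answer_user": "user_a"},
        {"answer": "A2", "answer_votes": "0", "answer_date": "2020-01-05",
         "answer_user": "user_b"},
    ],
)
print(len(rows_without_answers), len(rows_with_answers))
```

In a workflow, the equivalent would be to check whether any answer element is present, with a short timeout, before running the answer scrape, and to fall through to the empty-row branch when it is not.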

Hey @SSSS_MMMM

You are trying to do from here,

Thanks
#nK

@Nithinkrishna Yes