Hi,
Can someone suggest a tutorial on scraping data from multiple web pages?
My case is a search that returns a list of many pages. The data to be scraped is inside the pages in that list, not on the results page itself. So you have to open each page in the list and grab the data that makes up one row, then open the second result and scrape the second row, and so on.
So far I have only found tutorials that explain how to scrape a single page that already contains the data to be scraped (like Amazon).
But how do you do it when the results page does not have the data we are interested in and we have to go to each page in the list to get it?
Are you looking to have the data scraped from different pages in a single DataTable?
When you use the Extract Table Data activity or the wizard to extract structured data, the output is always a DataTable. To consolidate your scraped data into a single table, keep a second DataTable variable and merge each new extraction into it with the Merge Data Table activity.
E.g.
Assign dt_ConsolidatedResults = Nothing
For Each page
    Search and extract data.
    If dt_ConsolidatedResults Is Nothing
        Then:
            dt_ConsolidatedResults = ExtractDataTable.Copy
        Else:
            Merge Data Table with Source as ExtractDataTable and Destination as dt_ConsolidatedResults
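
If it helps to see the same loop outside of a workflow, here is a rough Python sketch of the pattern. It is only an illustration, not UiPath code: `consolidate_results`, `fake_extract`, and `FAKE_PAGES` are made-up names, and the "pages" are an in-memory dictionary standing in for whatever you would really open and scrape.

```python
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, str]  # one scraped record, e.g. {"name": ..., "price": ...}


def consolidate_results(
    detail_urls: Iterable[str],
    extract_rows: Callable[[str], List[Row]],
) -> List[Row]:
    """Visit each detail page and collect its rows into one table."""
    consolidated: Optional[List[Row]] = None  # plays the role of dt_ConsolidatedResults
    for url in detail_urls:
        rows = extract_rows(url)  # "Search and extract data" for this page
        if consolidated is None:
            # First page: start the table from a copy (like ExtractDataTable.Copy)
            consolidated = [dict(r) for r in rows]
        else:
            # Later pages: append the new rows (like Merge Data Table)
            consolidated.extend(dict(r) for r in rows)
    return consolidated if consolidated is not None else []


# Hypothetical stand-in for real detail pages: URL -> rows found on that page.
FAKE_PAGES: Dict[str, List[Row]] = {
    "/item/1": [{"name": "Widget", "price": "9.99"}],
    "/item/2": [{"name": "Gadget", "price": "19.50"}],
}


def fake_extract(url: str) -> List[Row]:
    """Pretend to open a detail page and scrape its single row."""
    return FAKE_PAGES[url]


table = consolidate_results(["/item/1", "/item/2"], fake_extract)
print(table)
```

In a real run, the list of URLs would come from scraping the search results page first, and `fake_extract` would be replaced by whatever opens each result and reads the fields you need.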