I’m trying to use data scraping on multiple pages that are NOT connected by a “next page” button. To access these pages, I have to click through a series of links containing their names, so I’ve automated that process separately using a for-each loop. Within each iteration of the loop, I then run a data scraping to acquire what I need and append it to a CSV.
The problem is, when I try to run this, I can successfully scrape the data from the first page, but then it fails to scrape the data from the second. The issue seems to be the data scraping itself, since I’m able to get to the append part of the sequence, where it tells me I’m appending null.
I’ve already confirmed that reusing the same XML for the ExtractMetaData works. I’ve tried data scraping individually using the same XML for both pages, and it works fine. The problem arises when I try to do this in a for-loop. Why does the second data scraping fail?
When you were scraping the data, did you notice a popup asking whether the data spans multiple pages? If so, click Yes and then click on the page number. Also, have you set the maximum number of results to 0?
Unfortunately, I cannot use the next page button to do this, because the issue is not necessarily that the data spans multiple pages, but that I have to go through a series of mouse clicks to reach each table. For context, to get the data, I have to access links that look like this:
And then data scrape the table that shows up in each link:
As you can see, while there are multiple pages of data, they are not connected by a next button. So the only way for me to access each table is to use a for-loop.
Is the application successfully navigating to the other pages and just not scraping the data?
You may need to double-check the selectors on the scrape activity to ensure there’s nothing hard-coded referring to the first page. If there is, you may need to replace it with wildcards.
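To illustrate the wildcard suggestion: UiPath selectors accept `*` (zero or more characters) and `?` (exactly one character) inside attribute values. The sketch below is not UiPath code; it uses Python's `fnmatch` only as an analogy for how a hard-coded value matches one page while a wildcarded one matches all of them. The window titles are hypothetical, since the real ones are not shown in this thread.

```python
from fnmatch import fnmatchcase

# Hypothetical window titles; the real ones are not shown in this thread.
titles = ["Report - Alpha", "Report - Beta"]

hard_coded = "Report - Alpha"  # value captured while recording on the first page
wildcarded = "Report - *"      # '*' stands in for the part that changes per page


def matches(pattern: str, title: str) -> bool:
    """Case-sensitive wildcard match, used here as an analogy for a selector attribute."""
    return fnmatchcase(title, pattern)


hard_hits = [t for t in titles if matches(hard_coded, t)]
wild_hits = [t for t in titles if matches(wildcarded, t)]

print(hard_hits)  # only the first page matches the hard-coded value
print(wild_hits)  # both pages match the wildcarded value
```

If the scrape works on the first page and returns nothing on the second, a value like `hard_coded` somewhere in the selector chain (including a container such as Attach Browser) is one possible cause worth ruling out.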
Also, are you able to use “Get Attribute” on the links in the “Name” column (first image) to get the URLs for each individual page in the second image?
UiPath is successfully navigating to each of the pages. I’ve tested this: if I leave out the data scraping activity, it navigates smoothly through every link.
I’ve made sure the selectors are as dynamic as possible, although I can’t seem to change the selector itself - it seems to be the exact same dynamic selector as the one I have for the attach browser above it, which works for the multiple browser windows. I can only edit the ExtractMetaData XML.
I think I can get the URLs for each “Name” (I’m not sure, because this is not a regular “website” but a JavaScript platform), but the problem is that the number of links sometimes changes, so I need a system that reliably accesses each link without hard-coding how many links there are or what they are. I asked elsewhere to resolve this issue, and the only option I could find was using data scraping to create a variable-size datatable and then iterating through that. That’s where the for-loop comes into play.
Here is the selector I use outside of the for-loop, in an individual sequence of its own, to extract the data from the second table. It works for said second table.
Just for the heck of it, would you mind sharing a screenshot in UI Explorer to see if there’s potentially a different combo of fields we can use to identify the table?
I’ve tried clearing the data table at the end of each loop, but that did not work. I don’t know if I need to use a new variable each time for the datatable, or if using the same variable in the loop works fine.
Is it possible for you to check how many rows of data it is writing for the first and second pages combined? Also, are you trying to write the data into the Excel sheet on every iteration? And are you setting the datatable variable to Nothing before reading the second page?
I don’t think it is necessarily failing to append the data to the datatable variable. The sequence is able to get to a point where it tries to append the variable to a CSV, where it breaks and says I’m appending null. The problem seems to be that the data scraping is not collecting any data.
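One way to narrow this down is to check the scrape result before the append step, so that an empty result is reported for the specific page instead of breaking the sequence. The following is a minimal Python sketch of that idea, not UiPath code: `scrape` is a hypothetical stand-in for the data scraping activity, and the page names and rows are made up for illustration.

```python
import csv
import os
import tempfile


def append_rows(csv_path, rows):
    """Append rows to csv_path and return how many rows were written."""
    if not rows:  # covers both None and an empty result
        return 0
    with open(csv_path, "a", newline="") as f:
        csv.writer(f).writerows(rows)
    return len(rows)


def scrape(page):
    """Hypothetical stand-in for the scrape step; the second page returns nothing."""
    fake_results = {"page1": [["a", "1"], ["b", "2"]], "page2": None}
    return fake_results[page]


path = os.path.join(tempfile.mkdtemp(), "out.csv")
written = {}
for page in ["page1", "page2"]:
    written[page] = append_rows(path, scrape(page))
    if written[page] == 0:
        # Report which page came back empty instead of failing on the append.
        print(f"{page}: scrape returned nothing, append skipped")
```

In a workflow, the equivalent would presumably be a condition on the datatable variable before the append activity, with a log message naming the current link.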
Here is the XML for the data scraping in the for-loop. I actually copied and pasted this exact XML into the standalone data scraping used to extract data from the second table, and it worked just fine.
Hi Jeffrey,
I was facing a similar issue, and I was able to solve it by making the selector dynamic on the “Attach Browser” activity. The issue seems to be not with the selector for the data scraping activity but with the browser attachment inside the For Each Row loop.
The way I found to use Data Scraping in a For Each loop without carrying over the data from the first page is to use the Clear Data Table activity after each iteration, applied to the output data table from Data Scraping. That way the data is deleted after each iteration and can be overwritten on the next one.
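The clear-after-each-iteration idea can be sketched roughly as follows. This is a Python analogy rather than UiPath code: `scrape` is a hypothetical stand-in for the scraping activity with made-up data, and `table.clear()` plays the role of the Clear Data Table activity.

```python
def scrape(page):
    """Hypothetical stand-in for the scrape step."""
    fake_results = {"page1": [["a"], ["b"]], "page2": [["c"]]}
    return fake_results[page]


table = []     # one table variable reused across iterations
per_page = {}  # what would be appended to the CSV for each page

for page in ["page1", "page2"]:
    table.extend(scrape(page))
    per_page[page] = list(table)  # snapshot taken at the "append to CSV" step
    table.clear()                 # analogous to Clear Data Table

print(per_page)
```

With the clear step in place, the same variable can be reused and each iteration only sees its own rows; without it, the second snapshot would also contain the rows from the first page.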