Properly defining NextLinkSelector to scrape data from multiple pages

JFK · May 15, 2018, 3:49pm

Hi!

I’m trying to scrape data from multiple pages, but UiPath seems to stay on the first page of my results.
The number of results varies depending on the period the bot is searching, so I have to find a dynamic way to grab all of them…

I already searched with UI Explorer and in the HTML code, without any result.
It seems it cannot find the selector to move to the next page, and I’m stuck because I don’t understand where I’m making a mistake.

Has anyone experienced this already?

Here is the site I’m trying to scrape data from:
Curia

For the selector that navigates through the pages, I tried several options but never got any result. I think the most accurate one is this:

<html title='CURIA - List of results' />
<webctrl parentid='mainForm:*' parentname='mainForm' idx='1113' />

Thanks in advance!

Rammohan91 · May 15, 2018, 5:14pm

Hey @JFK,

I was able to scrape the data from both pages of the link you provided.

I used the element below to move to the next page. Do you mean that the robot doesn’t click on this element when you run it?

Am I missing something here?

Thanks,
Rammohan B.

JFK · May 15, 2018, 5:41pm

Hi @Rammohan91,

Yes, indeed. Even when I write the selector in the NextLinkSelector property, the bot doesn’t go any further and only takes the data from the first page. I tried both the usual UI selector generated by the wizard and a selector I wrote myself.

Normally the selector I wrote should refer to the same arrow as yours, so I really don’t understand what the problem is. I even checked the maximum number of results to extract and set it to 0 to make sure it would take everything.

Did you use my selector? I suppose I’m missing something there…

Could you share your workflow with me?

Thanks!

Rammohan91 · May 15, 2018, 5:50pm

No. I just followed the data scraping wizard.

Here is my workflow.

Curia_Test.xaml (8.6 KB)

JFK · May 15, 2018, 6:18pm

@Rammohan91 I got it!

Apparently it works properly in Chrome; the issue probably comes from Internet Explorer, which I had to use by default…

Thanks a lot! I’ll see if there is a workaround; otherwise I’ll switch to Chrome for all my bots.

JFK · May 16, 2018, 12:28pm

It seems I spoke too soon…

Apparently the selector for the data scraping changes depending on the search performed.

For the link I provided, the arrow to the next page can be defined like this:
<webctrl class='btn_pagination' parentid='mainForm:j_id269' />

The problem is that if the search criteria change, the parentid changes as well… I then tried this selector with a wildcard:
<webctrl class='btn_pagination' parentid='mainForm:j_id*' />

But then the bot cannot tell the navigation arrows apart, since they all match…

Has anyone run into something similar?

Thanks !

I found a solution. Apparently writing on the forum helps you think outside the box.

I’m answering my own question in case it helps someone else.

It’s important to rely exclusively on UI Explorer when determining a selector. I thought it was not possible to use the HTML title and that a selector was limited to the class, id or tag, but other properties work as well!

However, the syntax is not exactly the same as in the HTML, so it’s important to go through UI Explorer. If the ID in the selector varies, there are other attributes that should be unique, such as the name or the title displayed on screen, which can also be found in the HTML code.

It’s then possible to isolate the element based on those attributes, looking them up in UI Explorer to make sure they are written correctly.
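
To illustrate why the wildcard selector was ambiguous and why adding a stable attribute fixes it, here is a rough Python sketch. It is only a toy model of attribute matching with `*` wildcards, not how UiPath resolves selectors internally. The three elements and their `title` values ("Previous page", "Next page", "Last page") are invented for the example; the real values have to be read in UI Explorer.

```python
from fnmatch import fnmatchcase

# Hypothetical pagination arrows; every attribute value here is an assumption.
ELEMENTS = [
    {"class": "btn_pagination", "parentid": "mainForm:j_id269", "title": "Previous page"},
    {"class": "btn_pagination", "parentid": "mainForm:j_id269", "title": "Next page"},
    {"class": "btn_pagination", "parentid": "mainForm:j_id269", "title": "Last page"},
]


def matches(selector, element):
    """True if every selector attribute matches the element ('*' is a wildcard)."""
    return all(
        fnmatchcase(element.get(attr, ""), pattern)
        for attr, pattern in selector.items()
    )


def find(selector, elements):
    """Return all elements that satisfy the selector."""
    return [e for e in elements if matches(selector, e)]


# Wildcard on parentid only: survives a changing id, but matches every arrow.
ambiguous = {"class": "btn_pagination", "parentid": "mainForm:j_id*"}

# Same wildcard plus a stable attribute: exactly one arrow is left.
specific = {**ambiguous, "title": "Next page"}

print(len(find(ambiguous, ELEMENTS)))  # number of arrows matched by the wildcard alone
print(find(specific, ELEMENTS))        # the single arrow identified by its title
```

The point of the sketch is the combination: the wildcard absorbs the part of the id that changes with the search, while the extra attribute singles out one arrow.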
