Issues with NextLinkSelector for page 2 and onwards

Hi!

I’m trying to scrape search results from the advanced version of Google Patents (patents.google.com/advanced).

I’ve set up a solution that opens a browser, types in a search term and extracts the results into a data table. I used the data scraping wizard.

As I want more than the results on the first page, I’ve tried to get the wizard to recognize the “next” link at the bottom of the page. In this case I used the “>”-symbol, circled here:

image

When I marked this link in the wizard, the selector for the “NextLinkSelector” ended up as:
<webctrl id='icon' idx='2' tag='IRON-ICON' />

This only works once: the solution goes from page 1 to page 2, but it does not proceed to page 3 and instead times out.

So I noticed that after the first results page, the structure of the page changes. It now has controls for going back to the previous page, and this link is very similar to the next-links:

image

However, when I open the selector for the next link and choose “highlight”, it correctly identifies the icon for the next page.

Also, as I understand it, even if there are multiple elements in the page matching the selector, UiPath should still click the first one.

Still, I went down the path of trying to build a more specific selector. I tried both of these:

<webctrl tag='PAPER-ICON-BUTTON' parentid='link' class='style-scope search-paging x-scope paper-icon-button-0' icon='chevron-right' />

<webctrl parentid='icon' tag='path' d='M10 6L8.59 7.41 13.17 12l-4.58 4.59L10 18l6-6z' />

One of them targets the icon button by its class, and the other targets the SVG path inside it using the “d” attribute, which contains the path data determining the direction of the arrow. I still get the same result: the data scraping won’t proceed to page 3, but when I open the selector and use the highlight function, it seems to correctly find the element (i.e. the “>”-symbol).

So why won’t it click the link?

I tried unchecking the “simulated click” box to use the hardware driver to click, but when I get to page 2 the mouse pointer won’t even move, so it seems like UiPath can’t find the element, even though it manages to highlight it when editing the selector.
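
A looser variant of the first selector might also be worth a try. The sketch below assumes that the numeric suffix in `paper-icon-button-0` is generated and could differ between result pages, so the class is matched with a `*` wildcard instead of the full string. That assumption has not been checked against the page, the `icon` attribute is simply carried over from the selector above, and the comment line is an annotation only, not part of the selector.

```xml
<!-- Sketch only: same attributes as the first selector, with the class loosened by a wildcard (assumed to be the volatile part). -->
<webctrl tag='PAPER-ICON-BUTTON' parentid='link' class='style-scope search-paging*' icon='chevron-right' />
```

If this behaves the same as the stricter selectors, the class value is probably not what stops the scraping on page 2.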

Has anyone got any idea why this is? I was considering that it might be because the content is served via JavaScript, but so is the first page of results, and it manages to find the link there.

Or is it a case of the selectors still not being unique?

Any help is very much appreciated! Thank you.

nextlink.xaml (8.7 KB)


It worked fine for me (using Chrome)…

Metadata:

<extract>
	<row exact='1'>
		<webctrl tag='search-result-item' class='style-scope search-results'/>
		<webctrl tag='article' class='result style-scope search-result-item' idx='1'/>
	</row>
	<column exact='1' name='Column1' attr='text'>
		<webctrl tag='search-result-item' class='style-scope search-results'/>
		<webctrl tag='article' class='result style-scope search-result-item' idx='1'/>
		<webctrl tag='state-modifier' class='result-title style-scope search-result-item' idx='1'/>
		<webctrl tag='a' class='style-scope state-modifier' idx='1'/>
		<webctrl tag='h3' class='style-scope search-result-item' idx='1'/>
		<webctrl tag='raw-html' class='style-scope search-result-item' idx='1'/>
		<webctrl tag='span' class='style-scope raw-html' idx='1'/>
	</column>
	<column exact='1' name='Column2' attr='text'>
		<webctrl tag='search-result-item' class='style-scope search-results'/>
		<webctrl tag='article' class='result style-scope search-result-item' idx='1'/>
		<webctrl tag='div' class='abstract layout horizontal start style-scope search-result-item' idx='1'/>
		<webctrl tag='div' class='flex style-scope search-result-item' idx='1'/>
		<webctrl tag='div' class='layout horizontal start style-scope search-result-item' idx='1'/>
		<webctrl tag='div' class='flex style-scope search-result-item' idx='1'/>
		<webctrl tag='raw-html' class='style-scope search-result-item' idx='1'/>
		<webctrl tag='span' class='style-scope raw-html' idx='1'/>
	</column>
</extract>

Next link: "<webctrl id='icon' idx='2' tag='IRON-ICON' />"

Thank you for replying bcorrea!

It’s very strange that it works for you and not for me. Would it be possible for you to upload your working solution?

Could it be something with the Studio version? I’m running 2019.4.5 beta 3, academic alliance edition.

I tried using the selector that worked for you, but I had the same problem. It works one time, but once I was on page 2 it would not proceed to page 3.

Funny thing, I made a new solution that opens page 2 and then runs a “click mouse” event with the selector:
<webctrl id='icon' idx='2' tag='IRON-ICON' />

And it works. I’ve attached this project in this post.
How is it possible for the selector to work in a mouse click activity, but not in the data scraping activity? It’s the same page and the same selector…

NextLinkClick.xaml (5.6 KB)

My working solution is above. I just used a data scrape activity after doing a search for “internet”, and it worked OK… I went until page 13 and then stopped it manually…