Data Scraping on Web never ends

abyvarghes · January 1, 2019, 5:05pm

Hi All,

I have been trying to fetch some data from a web page (here).
It works fine for the first 24 records, but if I specify “Next”, it goes through all the pages and never ends. It does not stop after fetching, e.g., 100 records. A similar topic (here) is also on the forum, but I couldn’t fix my issue with that solution. Any help would be highly appreciated.
TestWebScrape.xaml (14.0 KB)

KarthikByggari · January 1, 2019, 5:25pm

Please change the default settings in the Options panel of the Data Scraping activity.

https://files.readme.io/f7ff964-image_174.png

Please change the value to 0 as suggested in the screenshot.

Regards,
Karthik

abyvarghes · January 1, 2019, 5:39pm

Hi Karthik,

I had tried that as well, but no luck. I believe it’s all about the selectors. It would be great if you could take a look at my NextLinkSelector.

Thanks,
Aby

KarthikByggari · January 1, 2019, 5:43pm

Okay. Let me look into your workflow, and I will update you on this.

abyvarghes · January 6, 2019, 9:16pm

Hi Karthik,

Just wondering whether you have had any time to look into this issue?

Thanks,
Aby

loginerror · January 7, 2019, 9:34am

Hi @abyvarghes

You have to tune your Data Scraping activity, as it doesn’t scrape all items. I let it run for 1000 records and then closed IE (which results in the scraping finishing and saving to file).

It only scraped 24 records out of the 1000 it saw, which is way below the cap of 100, and that is the reason it keeps running.
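To illustrate why a low match rate keeps the run going, here is a minimal Python sketch. It is not UiPath's actual implementation: `scrape`, `pages`, `matches` and `max_results` are hypothetical names, and it assumes the loop stops only when the cap is reached or there is no next page left.

```python
def scrape(pages, matches, max_results=100):
    """Collect matching items page by page until the cap is hit or pages run out.

    Hypothetical model of a paginated scrape, not UiPath's real logic.
    Returns the collected items and the number of pages visited.
    """
    collected = []
    pages_visited = 0
    for page in pages:  # moving to the next page stands in for clicking "Next"
        pages_visited += 1
        collected.extend(item for item in page if matches(item))
        if len(collected) >= max_results:
            return collected[:max_results], pages_visited
    return collected, pages_visited


# 50 made-up pages with 24 items each; an item is (page_number, position).
pages = [[(p, i) for i in range(24)] for p in range(50)]

# A matcher that accepts every item reaches the cap of 100 quickly.
good, visited_good = scrape(pages, lambda item: True)

# A too-narrow matcher (assumed here to match only the first page's items)
# never reaches the cap, so every page gets visited.
narrow, visited_narrow = scrape(pages, lambda item: item[0] == 0)
```

With the narrow matcher the cap is never reached, so the loop only ends when the pages run out. On a site with thousands of pages, that looks like a run that never ends.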

abyvarghes · January 8, 2019, 5:19am

Hi @loginerror,

Thanks for the response.

This is what I have tried after your reply:

- Recreated the scraping part
- Put a Delay activity on the web page load and inside Data Scraping to make sure the web page loads completely

Still getting the same issue.

I also sent the same workflow to a friend who works with UiPath. He ran it on his system (licensed edition) and it worked perfectly.

So could it be something to do with the Community Edition?

Thanks,
Aby

loginerror · January 8, 2019, 8:53am

First of all, there is no functional difference between Community Edition and Enterprise Edition, so that is strange.

However, it did not initially work as it should for me either.

What I did to fix it:
I removed the first few lines from the ExtractMetadata XML, see below:

Old:

New:

This seems to have fixed the issue. You do still get some duplicates in the results, however (for different colors of the models).

Could you try this attached project and see if it works for you?
ScrapingWbModified.zip (1.4 MB)

abyvarghes · January 8, 2019, 9:58am

@loginerror

Thanks a lot. That fixed the problem.

As I don’t have much knowledge of XML, may I ask what those 2 lines we removed from each block are? What do they do to the output / execution?

And can you suggest any learning documentation for XML, if that would help?

Thanks,
Aby

loginerror · January 8, 2019, 10:07am

The knowledge surrounding the XML code within the tool is a bit anecdotal and comes from “forum experience” and playing around with the tool. It definitely requires some more input, and I believe our documentation team is aware of it and will document it at some point.

Basically, the XML in that field is a literal “path” to your element on the page. If it happens to be too specific, it will only find the values that match the path 100%. In this case, it was only catching a few records out of the thousands it was exposed to.
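As a rough analogy only (this uses Python's standard `xml.etree.ElementTree` on a made-up page, not the ExtractMetadata format itself), the sketch below shows how an overly specific path matches fewer elements than a relaxed one. The markup, class names and model names are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up page fragment; the structure and class names are assumptions.
PAGE = """
<ul>
  <li class="product featured"><span class="name">Model A</span></li>
  <li class="product"><span class="name">Model B</span></li>
  <li class="product"><span class="name">Model C</span></li>
</ul>
""".strip()

root = ET.fromstring(PAGE)

# Too specific: demands the exact class string of the one row it was built from.
strict = root.findall("./li[@class='product featured']/span")

# Relaxed: drops the class constraint on <li>, so every row qualifies.
relaxed = root.findall("./li/span")

strict_names = [element.text for element in strict]
relaxed_names = [element.text for element in relaxed]

print(strict_names)
print(relaxed_names)
```

The strict path finds only the first row, while the relaxed path finds all three.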

I started removing 1 line at a time and rerunning the project to see if it worked. By removing 2 lines I must have removed enough of the “too specific” path to allow it to catch all the needed values.
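That trial-and-error approach can be sketched in Python as follows. This is only a toy model under stated assumptions: rows are plain dictionaries, a "line" of the path is an `(attribute, value)` constraint, and `count_matches` and `relax` are hypothetical helpers, not anything the tool provides.

```python
def count_matches(rows, constraints):
    """Count rows that satisfy every (attribute, value) constraint."""
    return sum(
        all(row.get(key) == value for key, value in constraints)
        for row in rows
    )


def relax(rows, constraints, expected):
    """Drop leading constraints one at a time until enough rows match."""
    current = list(constraints)
    while current and count_matches(rows, current) < expected:
        current.pop(0)  # remove the first (most specific) line and retry
    return current


# Made-up rows standing in for elements seen on the page.
rows = [
    {"tag": "li", "class": "product", "idx": "1", "parent": "grid-1"},
    {"tag": "li", "class": "product", "idx": "2", "parent": "grid-1"},
    {"tag": "li", "class": "product", "idx": "3", "parent": "grid-2"},
    {"tag": "li", "class": "product", "idx": "4", "parent": "grid-2"},
]

# The original path pins down one exact row: too specific.
original = [("idx", "1"), ("parent", "grid-1"), ("tag", "li"), ("class", "product")]

relaxed = relax(rows, original, expected=len(rows))
removed = len(original) - len(relaxed)
```

In this toy data, dropping the first two constraints is enough for all four rows to match. How many lines need to go in a real workflow depends on the page.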

abyvarghes · January 9, 2019, 9:34am

@loginerror Thanks a lot again.

system · January 12, 2019, 9:34am

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
