Data scraping is not working as explained

Hi,

Good Day.

I was checking data scraping on a specific web EDI portal. I was able to download the data; however, some of it was duplicated.
Datascraping.txt (2.9 KB)

I was supposed to get the data up to the row 25.10.2021,1,480,480,DI; however, the second sheet's data has been duplicated.

I have set MaxNumberOfResults to 100; however, if I want to extract exactly the available rows, I have to set it to 0 as per the documentation. When I set it to 0, the activity goes into an infinite loop. To check whether it works, I let it run for some time, and even after 10 minutes it did not stop, so I had to stop it manually.



Any hint or feedback would be helpful.

Regards,
Manjesh

Hi,

It seems the bot failed to click the next button. Can you try unchecking the SimulateClick property and/or reviewing your NextLinkSelector?

Regards,

Hi @Yoichi ,
I have unchecked SimulateClick, yet the result is the same.

This is the next link selector:
<webctrl name='main' tag='FRAME' /><webctrl aaname='>' tag='DIV' />

Hi,

Thank you for sharing.
May I know whether the bot clicks the next button and moves to the next page while data scraping is running?
We might need to set a large value for the DelayBetweenPageMS property (such as 5000, for now).

Regards,

Hi @Yoichi,

Yes, the next button is clicked by the bot. I will check the DelayBetweenPageMS property and let you know.

Regards,
Manjesh

Hi @Yoichi
The result is still the same; it just delays every click by the configured number of seconds.

Regards,
Manjesh

Hi,

Thank you for trying.

It seems that the next link selector needs to be reviewed. Is the site on the internet, and can I access it?

Regards,

Hello @Yoichi ,
I am very sorry, this is one of our customer portals, so I cannot share it.

Regards,
Manjesh

Hi,

All right. Another approach is to loop over the pages yourself and run data scraping on each page without a next link selector. Then we can check various conditions to exit the loop.
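The per-page loop could be sketched roughly as below. This is plain Python rather than a UiPath workflow, only to illustrate the control flow. The names `scrape_all_pages`, `scrape_page`, `click_next`, `max_pages` and the `FakePortal` class are all hypothetical, and the exit condition used here (the current page equals the previous one) is just one possible choice among the "various conditions".

```python
from typing import Callable, List, Tuple

Row = Tuple[str, ...]


def scrape_all_pages(
    scrape_page: Callable[[], List[Row]],
    click_next: Callable[[], None],
    max_pages: int = 1000,
) -> List[Row]:
    """Scrape page by page and stop when a page repeats the previous one."""
    all_rows: List[Row] = []
    previous_page: List[Row] = []
    for _ in range(max_pages):  # hard cap so the loop can never run forever
        current_page = scrape_page()
        # Exit condition: the click on "next" changed nothing (or the page is empty).
        if not current_page or current_page == previous_page:
            break
        all_rows.extend(current_page)
        previous_page = current_page
        click_next()
    return all_rows


class FakePortal:
    """Hypothetical stand-in for a site whose 'next' button stays clickable on the last page."""

    def __init__(self, pages: List[List[Row]]) -> None:
        self.pages = pages
        self.index = 0

    def scrape_page(self) -> List[Row]:
        return list(self.pages[self.index])

    def click_next(self) -> None:
        # On the last page the click has no effect.
        self.index = min(self.index + 1, len(self.pages) - 1)


# Made-up sample rows, two pages of two rows each.
portal = FakePortal([
    [("22.10.2021", "1"), ("23.10.2021", "1")],
    [("24.10.2021", "1"), ("25.10.2021", "1")],
])
rows = scrape_all_pages(portal.scrape_page, portal.click_next)
print(len(rows))  # 4
```

In a real workflow the same idea would be a loop around a single-page scrape plus a click on the next button, with the comparison done on the scraped tables.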

Regards,

Hi @manjesh_kumar,

It seems that the button used to navigate to next page is still available on your last page.

However, in order for UiPath to work properly, the button to navigate to the next page should not be available if you are actually on the last page.
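To make the effect concrete, here is a toy model in Python. It is not UiPath's actual implementation, only an assumption about the behaviour described in this thread: the scraper keeps clicking "next" until it has collected the requested number of rows, and the click still succeeds on the last page. The function `toy_scrape` and the sample rows are hypothetical.

```python
from typing import List, Tuple

Row = Tuple[str, ...]


def toy_scrape(pages: List[List[Row]], max_results: int) -> List[Row]:
    """Toy model: keep clicking 'next' until max_results rows are collected."""
    results: List[Row] = []
    index = 0
    while len(results) < max_results:
        results.extend(pages[index])
        # The 'next' button is still clickable on the last page, so the
        # click succeeds but the page does not change.
        index = min(index + 1, len(pages) - 1)
    return results[:max_results]


pages = [
    [("row-1",), ("row-2",)],
    [("row-3",), ("row-4",)],  # last page
]
rows = toy_scrape(pages, max_results=7)
print(rows)
```

In this model the rows of the last page are scraped again and again until the limit is reached, which matches the duplicated data. Without any limit the loop would have no reason to stop, which matches the run that never finished.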

But since you cannot modify the customer’s portal, you can keep your initial workflow and, as a corrective action, remove the duplicate rows afterwards.
The Remove Duplicate Rows activity https://docs.uipath.com/activities/docs/remove-duplicate-rows should help you to do this.
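For readers who want to see the logic, the clean-up step is roughly equivalent to the Python sketch below. The activity itself works on a DataTable inside the workflow. The function `remove_duplicate_rows` here is a hypothetical helper, and the sample rows are made up except for the last expected row quoted in the question.

```python
from typing import Iterable, List, Tuple

Row = Tuple[str, ...]


def remove_duplicate_rows(rows: Iterable[Row]) -> List[Row]:
    """Keep the first occurrence of each row and preserve the original order."""
    seen = set()
    unique: List[Row] = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Made-up scrape result in which the last page was extracted twice.
scraped = [
    ("23.10.2021", "1", "480", "480", "DI"),
    ("24.10.2021", "1", "480", "480", "DI"),
    ("25.10.2021", "1", "480", "480", "DI"),
    ("24.10.2021", "1", "480", "480", "DI"),
    ("25.10.2021", "1", "480", "480", "DI"),
]
cleaned = remove_duplicate_rows(scraped)
print(len(cleaned))  # 3
```

One caveat: if the portal can legitimately contain two identical rows, removing duplicates will drop those as well, so this only fits when every real row is unique.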

This could be an alternative to @Yoichi's solution.

Hope this helps
Best regards,
Marius

Dear @Marius_Puscasu ,

Thanks for the information, it is much easier than @Yoichi’s solution. Thanks @Yoichi for the solution too.

Could you please explain where the problem was, Manjesh? I am new here.
