Data Scraping Multiple Pages - Last page has fewer results

Hi,

So I am trying to scrape data from a site with multiple pages. I tried to do this through the data scraping wizard, but quickly ran into an issue: although the website displays 8 results per page, the total number of results does not divide evenly by 8, so the last page has fewer items than the previous pages. Because of this (I think), the scraping freezes on the last page and never finishes.

Initially I thought this was due to it not finding the “next link selector” on the last page, but that is an actual feature of the scraper, since that is how it knows when to stop, so that can’t be the issue. I think the uneven last page is the problem. Does anyone know how to fix it?
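For illustration only, here is a minimal Python sketch of how a paginated scrape is generally expected to end when the next link is missing. It is not UiPath's implementation; `split_into_pages` and `scrape_all` are hypothetical names, and the "next link" check is simulated with a list index rather than a real selector.

```python
from typing import List


def split_into_pages(items: List[str], page_size: int) -> List[List[str]]:
    """Split items into pages; the last page may be shorter than page_size."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


def scrape_all(pages: List[List[str]]) -> List[str]:
    """Collect rows page by page, stopping when there is no next page."""
    results: List[str] = []
    page_index = 0
    while page_index < len(pages):
        # Take however many rows this page has; no fixed count is assumed.
        results.extend(pages[page_index])
        # Stand-in for "is the next link present?" on a real site.
        has_next_link = page_index + 1 < len(pages)
        if not has_next_link:
            break  # Last page reached: stop instead of waiting for more rows.
        page_index += 1
    return results


if __name__ == "__main__":
    items = [f"item{n}" for n in range(19)]  # 19 does not divide evenly by 8
    pages = split_into_pages(items, page_size=8)
    print([len(p) for p in pages])  # [8, 8, 3]
    print(len(scrape_all(pages)))   # 19
```

In this simplified model a short last page is harmless, because the loop never assumes a fixed number of rows per page.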

@Kabir_Nagrecha,

That is not an issue at all. Data scraping can easily identify the number of rows present on each page, even if the number of rows differs from page to page, so that won’t cause any problem. If you are getting an error, please attach a screenshot.

There is no error. It just freezes on the last page, for several minutes, until I manually end execution. It works fine if I limit MaxResults to exactly the number of items present, but I’d prefer to be able to use 0 so I can generalize it to other sites better. However this doesn’t seem possible at least until I’ve solved this issue.

Also here is an image of what the last page results are like:

Don’t change anything while scraping the data. Leave the max results at 1000. Many of the Robot Masters and certification exams include data scraping, and we are all able to complete it successfully. If you have any other activities after the data scraping, can you try commenting those out and running only the data scraping, with no changes to the properties of the data scraping activity?

Hi, I just tried it with only the data scraping and nothing else and it still froze at the end. I’ve attached the workflow I used just now: Sequence1.xaml (5.8 KB)

It looks good. So what are you trying to do after this?

Well the issue is that this code itself does not work. The workflow I just sent does not work as it freezes on the last page. I assumed that it was due to the irregular number of items on the last page - could there be another issue?

That won’t be the issue, I guess.

Let’s do some testing with something else here.

Open Amazon or Flipkart and get all the URLs and names of a product from multiple pages. If that gives the same problem, then we will raise a bug with the UiPath team.

Can you please try and let me know?

Hi,

I had the same problem on Amazon with the query “cisco old phone”. It returned 3 pages of results, and it froze on the last page as before. I wonder if it perhaps has something to do with the lack of a “next” button on the last page? I thought that this was actually how it knew it had reached the end, however. Thanks for your help so far.

@Kabir_Nagrecha @HareeshMR

How can this issue be solved? I am also facing the same issue with Amazon data scraping.

Can you post the error you are getting? Also, make sure you have installed the extension required for the browser you are using.

  1. There is no error displayed on the screen. But I know my Amazon webpage table has 99 rows on 4 pages, that is, 25 rows per page.

The first 3 pages (75 rows) were scraped fine, and the last page has 24 rows. But the UiPath scraper only scrapes 4 or sometimes 5 rows, not all the rows from the last page.

Also, the last page freezes for 30 seconds while it looks for the next button.

  2. Yes, the browser extension is installed properly and works fine with other activities.
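As a quick sanity check on the row counts above, a small Python sketch (the helper name `page_sizes` is made up for this example) shows how 99 rows split across pages of 25:

```python
from typing import List


def page_sizes(total_rows: int, rows_per_page: int) -> List[int]:
    """Return the number of rows on each page for a paginated table."""
    full_pages, remainder = divmod(total_rows, rows_per_page)
    sizes = [rows_per_page] * full_pages
    if remainder:
        sizes.append(remainder)  # Short last page
    return sizes


sizes = page_sizes(99, 25)
print(sizes)           # [25, 25, 25, 24]
print(sum(sizes[:3]))  # 75 rows on the first three pages
```

So a complete scrape should return 24 rows from the last page, not 4 or 5.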

Yeah, got it. This is strange and a good point to discuss. Does this happen every time, or does it sometimes retrieve all the data?

@HareeshMR
I tried around 10 times, and it happens every time.

What Solution I Tried
Later I increased the delay between page loads by 30 seconds (the DelayBetweenPagesMS property in the Extract Table activity). Then it started to extract the whole table.
:heart_eyes: Problem Solved :heart_eyes: but a new problem started.

What Is The New Problem
The bot stops for 1 minute on the last page and then moves on to the next activity. It is annoying and wastes time. Why does it freeze on the last page?

What May Be The Reason
I am not sure, but I think it waits for 1 minute because:

  1. it waits 30 seconds because of the timeout property of the extract data table activity.
  2. it waits 30 seconds because of the DelayBetweenPagesMS property.

I think the bot also waits for a next page on the last page.

Please tell me what I can do so the bot does not wait so long on the last page.
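To make the guess above concrete, here is a tiny Python sketch of the arithmetic. It assumes the two waits happen one after the other on the last page, which nothing in this thread confirms; `last_page_wait_seconds` is a hypothetical helper, not a UiPath function.

```python
def last_page_wait_seconds(delay_between_pages_ms: int, timeout_ms: int) -> float:
    """Total wait on the last page if both waits run back to back (assumption)."""
    return (delay_between_pages_ms + timeout_ms) / 1000


# 30 s page delay plus 30 s timeout while looking for the next link
print(last_page_wait_seconds(30_000, 30_000))  # 60.0
```

If that assumption holds, lowering either value should shorten the pause on the last page by the same amount.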

@HareeshMR
:sleepy: Problem Not Solved :sleepy:
When I ran the bot just now, it again did not extract all the data. The solution I mentioned above does not work. I think the UiPath Extract Data activity has bugs.

I’m getting the same type of issue. No error is thrown in my workflow, but it freezes on the last page and ends up not putting any data at all in the target data table for the extracted data. Was a solution ever found?
The project I’m working on is actually one of the Practice with ReFramework projects, specifically the Calculate Client Security Hash project.
(Also just a note - I can’t upload my workflow at the moment because I’m a brand new forum user)