Web Scraping only scrapes first page

Hey All,

I’m trying to scrape a site. The workflow goes through all the pages and exports to Excel. However, it appears to capture data only from the first page. Nothing from the other pages it goes through is captured.

webpage: “Vacation Rentals
Metadata:

I’m just trying to extract the name and URL of each listing.

Can anyone help me with this? Thanks

Bonus if you know a way to make it export to Excel immediately, rather than waiting for it to time out and only then exporting.

Thanks in advance.

Hi @Thang_Nguyen
Did you try setting the extraction to span multiple pages? With that in place, you can extract the data from every page.

Thanks
Ashwin.S

Hi @Thang_Nguyen

Are you using the data scraping wizard to get this done? In the wizard there is an option that limits the extraction to the first 100 records. Make sure you set it to zero so that it captures all the records. Additionally, through the wizard you can set the next page navigation as well…
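
To make those two settings concrete, here is a minimal Python sketch of the idea. It is not UiPath's implementation. The function `scrape_all`, the `pages` list standing in for the website, and the `max_results` parameter are hypothetical names; the only thing carried over from the wizard is the convention that 0 means "no limit".

```python
from typing import List


def scrape_all(pages: List[List[str]], max_results: int = 100) -> List[str]:
    """Collect rows page by page; max_results == 0 means no limit (assumed)."""
    results: List[str] = []
    for page in pages:  # stepping to the next item stands in for clicking "Next"
        for row in page:
            # A non-zero limit stops the scrape as soon as it is reached.
            if max_results and len(results) >= max_results:
                return results
            results.append(row)
    return results


# Hypothetical data: 3 pages of 50 listings each.
pages = [[f"listing-{p}-{i}" for i in range(50)] for p in range(3)]
limited = scrape_all(pages, max_results=100)
everything = scrape_all(pages, max_results=0)
print(len(limited), len(everything))
```

With the default limit the third page is never reached, which is why the value has to be changed to 0 when every record is needed.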

Hi buddy @Thang_Nguyen

Welcome to UiPath Community
Fine
When you data scrape from a website, you will have an option in the properties of the data scraping activity, buddy.
Like this: change it from 100 to 0, buddy (in the attached workflow I have set it to 50).
image
Then here is your XAML, buddy:
level (2).zip (11.9 KB)

Kindly try this and let me know, buddy.
Cheers @Thang_Nguyen

I was using the scraping wizard. For some reason, the selector to click the Next button wasn’t working. I modified the selector; however, it only collects data from the first page despite being able to navigate through all the pages. I updated the maximum number of results to 0, which means everything.

Thanks! However, your workflow runs into the same problem I’m having. It navigates from page 1 to page 2, then back to page 1, in an infinite loop.

I modified the selector to select the Next button; however, when I do this, it only captures data from the first page.
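
For anyone debugging the same symptom, the page 1 to 2 to 1 loop can be pictured with a small Python sketch. This is only an illustration of the pattern, not what the UiPath activity does internally. The function `scrape_until_repeat` and the `broken_next` and `site_rows` dictionaries are hypothetical; the idea is simply to remember which pages were already visited and stop when the "Next" link leads back to one of them or is missing.

```python
from typing import Callable, Dict, List, Optional


def scrape_until_repeat(
    start: str,
    get_rows: Callable[[str], List[str]],
    get_next: Callable[[str], Optional[str]],
) -> List[str]:
    """Follow "Next" links, stopping on the last page or on a repeated page."""
    rows: List[str] = []
    seen = set()
    current: Optional[str] = start
    while current is not None and current not in seen:
        seen.add(current)               # remember the page so a loop is detected
        rows.extend(get_rows(current))  # capture this page's data before moving on
        current = get_next(current)     # None means there is no further page
    return rows


# Hypothetical site whose "Next" selector on page 2 points back to page 1.
broken_next: Dict[str, Optional[str]] = {"page1": "page2", "page2": "page1"}
site_rows: Dict[str, List[str]] = {"page1": ["a", "b"], "page2": ["c", "d"]}
collected = scrape_until_repeat("page1", site_rows.__getitem__, broken_next.get)
print(collected)
```

The same stop condition also ends the run on the last page without waiting for a timeout, because a missing "Next" link is returned as `None`.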

Buddy @Thang_Nguyen
It actually worked for me, buddy. I was able to extract the data…
May I know what errors you are getting now, buddy?
Cheers

I’m not really sure. It might be because I’m using a newer version? I was forced to update when I opened yours. I’ve attached my revised copy. I found that I needed to increase the delay between pages so that each one has time to load. After that, I was able to grab the data.
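
The timing fix can be sketched in Python as a wait with a timeout instead of a fixed pause. This is a generic pattern under stated assumptions, not the UiPath setting itself: `wait_until` is a hypothetical helper, and the lambda conditions below only simulate a page that becomes ready after a short delay.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.1,
) -> bool:
    """Poll `condition` until it is true or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)  # give the page time to load before checking again
    return condition()        # one final check at the deadline


# Simulated page that finishes loading about 0.3 seconds from now.
ready_at = time.monotonic() + 0.3
loaded = wait_until(lambda: time.monotonic() >= ready_at, timeout=2.0)

# Simulated page that never loads: the wait gives up after the timeout.
never = wait_until(lambda: False, timeout=0.3)
print(loaded, never)
```

Polling like this returns as soon as the page is ready, so a generous timeout does not slow down pages that load quickly.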

However, I’m now running into an issue where it only grabs 48 entries per page although there are 50 entries. Do you know why this might occur?

Also, do you have a method to end the scrape on the last page rather than waiting for it to time out? Thanks.

level.zip (21.3 KB)