Scraping next pages

Hi forum,

I’m trying to scrape a website with multiple pages.

The tricky thing is that the pages are numbered in batches of 10, like
1 2 3 4 5 6 7 8 9 10 > 60 where 60 is the last page.

When you click on “>” the next set will be presented:
11 12 13 14 15 16 17 18 19 20 > 60

Clicking again on the > will show 21 through 30, etc.

Does anyone have a clue how to loop through all the pages?

Thanks

@eparijs
You can define a counter variable that counts the pages processed so far across the iterations.

A bit of logic can then decide when to click on >. This can be implemented, for example, with the Mod operator: whenever counterVar Mod 10 = 1, click on > (11 Mod 10 = 1, 21 Mod 10 = 1, …).

The first page is the exception: 1 Mod 10 is also 1, but there you have to click on 1 instead of >, so handle that case separately. And of course the bot should stop clicking forward once the last page is reached.
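
The modulo rule above can be sketched outside UiPath as plain logic. This is only an illustration in Python, not a UiPath workflow: `pager_action` and `click_plan` are hypothetical helper names, the batch size of 10 and the last page of 60 come from the question, and the sketch assumes that clicking > lands directly on the first page of the next batch.

```python
def pager_action(page, batch_size=10):
    """Return the pager element to click in order to reach `page`."""
    if page == 1:
        # Special case: 1 Mod 10 is also 1, but there is no ">" to click yet.
        return "1"
    if page % batch_size == 1:
        # Pages 11, 21, 31, ... are reached by opening the next batch.
        return ">"
    # Every other page is reached by clicking its own number.
    return str(page)


def click_plan(last_page, batch_size=10):
    """List (page, element_to_click) for every page up to the last one."""
    return [(page, pager_action(page, batch_size))
            for page in range(1, last_page + 1)]


plan = click_plan(60)
for page, element in plan[8:12]:
    print(page, element)
```

The loop stops at `last_page`, which covers the requirement that the bot must not keep clicking forward after the last page.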

Hi @eparijs
While setting up the data scraping, keep the maximum number of results at 0.
Once you have finished the extraction, the wizard will ask “Is data spanning multiple pages?” If you are extracting data from multiple pages, click Yes and indicate the > element.
It will then automatically scrape all 60 pages.

Cool,
Regards,
Gulshiyaa

Hi Gulshiyaa,

Thanks for your reply.

What you describe is the normal way you would do this, but the problem is that when you click on “>” you’ll get the next series of 10 pages, not the next page.

So when you have this on your screen:

1 2 3 4 5 6 7 8 9 10 >

and you click on “>”, page number 11 will be shown, not page 2.

Any clue?

Thanks

Can you please send me the link to the site you are scraping?

@eparijs
Try this.

After you reach page 1:

Assign pageNo = 1 (Int32 variable)

(Use a Do While loop like the one below)

Do

    If pageNo > 1
    Then: Click - Target Selector: "<html app='chrome.exe' url='*172.16.105.80*' /> <webctrl aaname='" + pageNo.ToString + "' parentid='pagination' tag='A' />"
    (End If block)
    Data Scraping: scrape the current page and store / process the scraped data as your scenario requires
    Assign pageNo = pageNo + 1
    Element Exists - Target Selector: "<html app='chrome.exe' url='*172.16.105.80*' /> <webctrl aaname='" + pageNo.ToString + "' parentid='pagination' tag='A' />"
    Save the result in a Boolean variable nextPageExists

While nextPageExists = True

Hope this helps.
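
One thing to watch with the Do While loop above: if the pager really shows only ten page numbers at a time, as described earlier in the thread, the Element Exists check for page 11 would presumably fail while 1 to 10 are displayed, and the loop would end early. Below is a rough sketch of the same loop with a fallback to the > link, written in Python against a simulated pager so it can run without a browser. `FakePager` and all of its methods are invented for illustration; in UiPath the checks and clicks would be Element Exists and Click activities.

```python
class FakePager:
    """Stand-in for the site's pager: page links come in batches of 10."""

    def __init__(self, last_page, batch_size=10):
        self.last_page = last_page
        self.batch_size = batch_size
        self.current = 1

    def visible_pages(self):
        # Page numbers currently shown, e.g. 11..20 while on page 14.
        start = (self.current - 1) // self.batch_size * self.batch_size + 1
        end = min(start + self.batch_size - 1, self.last_page)
        return range(start, end + 1)

    def link_exists(self, page):
        # Plays the role of Element Exists on the page-number link.
        return page in self.visible_pages()

    def next_batch_exists(self):
        # Plays the role of Element Exists on the ">" link.
        return self.visible_pages()[-1] < self.last_page

    def click_page(self, page):
        self.current = page

    def click_next_batch(self):
        # Assumption from the thread: ">" jumps to the first page of the next batch.
        self.current = self.visible_pages()[-1] + 1


def scrape_all(pager):
    scraped = []
    page_no = 1
    while True:
        scraped.append(pager.current)  # placeholder for the Data Scraping step
        page_no += 1
        if pager.link_exists(page_no):
            pager.click_page(page_no)
        elif pager.next_batch_exists():
            pager.click_next_batch()
        else:
            break  # neither the next number nor ">" is available: last page
    return scraped


pages = scrape_all(FakePager(60))
print(len(pages), pages[:3], pages[-1])
```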

That looks great, Gulshiyaa, thanks!

I will try it later, but it all makes sense
:blush:

I’ll let you know tomorrow.

Thanks again

Hi Gulshiyaa,

Sorry, I missed your question for the link:

https://www.remax.pt/officeagentsearch.aspx#!mode=list&type=2&regionId=12&regionRowId=78&provinceId=&cityId=&localzoneId=&name=&location=Porto&spokenLanguageCode=&page=1&countryCode=PT&countryEnuName=Portugal&countryName=Portugal&selmode=residential&officeId=&selectedCountryID=&initialRegionId=12&defaultRegionRowId=&defaultProvinceId=&defaultLocation=

I’m going to give your solution a try later today.

Thanks

Note that the URL contains the page number (page=1). You could keep adding one to it and open each resulting URL with a Navigate To activity until all pages are done.
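
As a rough illustration of the URL idea, the sketch below generates one URL per page by rewriting the page= value. It is written in Python rather than as a UiPath workflow; `page_url` and `all_page_urls` are hypothetical helper names, and the last page of 60 is taken from the example in the question. Whether the site actually loads the right results when the page= value after #! changes would need to be checked in the browser.

```python
import re

# URL from the thread, split over several lines for readability.
BASE_URL = (
    "https://www.remax.pt/officeagentsearch.aspx#!mode=list&type=2"
    "&regionId=12&regionRowId=78&provinceId=&cityId=&localzoneId=&name="
    "&location=Porto&spokenLanguageCode=&page=1&countryCode=PT"
    "&countryEnuName=Portugal&countryName=Portugal&selmode=residential"
    "&officeId=&selectedCountryID=&initialRegionId=12&defaultRegionRowId="
    "&defaultProvinceId=&defaultLocation="
)


def page_url(url, page):
    """Return `url` with its page=<number> parameter set to `page`."""
    # The lookbehind makes sure we only match a parameter named exactly "page".
    new_url, count = re.subn(r"(?<=[#!?&])page=\d+", f"page={page}", url, count=1)
    if count == 0:
        raise ValueError("no page=<number> parameter found in URL")
    return new_url


def all_page_urls(url, last_page):
    """One URL per page, from page 1 up to and including `last_page`."""
    return [page_url(url, page) for page in range(1, last_page + 1)]


urls = all_page_urls(BASE_URL, 60)
print(urls[1])
```

Each entry in `urls` could then be passed to Navigate To inside a For Each loop, with the scraping step run after every navigation.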