Move to next page in data scraping

Hi,
I want to scrape multiple pages of a website, but there is no Next button. I tried

and many more topics, but I am still unable to find a solution. What I tried is:
First method:

  1. I used a Do While loop and scraped the data from the first page.
  2. Then I clicked on the next page using aaname=*
  3. I checked with Element Exists whether a next page is there.
  4. The While condition is next page exists = True.
    In this case, it scrapes the first page, moves to the second page, and then comes back to the first page. It stays stuck in that loop until I force quit it.

Second method:

  1. I used a Do While loop, scraped the first page, then added a counter named pagecount starting at 2.
  2. I clicked the next page with aaname=pagecount.ToString.
    In this case I get a “selector not found” error from the Click activity.
  3. Increment the counter.
  4. The While condition is the same as in the first method.

Third method:
NextLinkSelector

  1. Here I set the selector to aaname=*, but again it got stuck in a loop between the first and second pages.

Please suggest a general solution, if there is one.

Hi @Atul_Rai,
The selector needs to be tweaked, or the properties of the scraping activity need some changes.
Can you share the website here?

Sorry @reda,
I can’t share the website, but you can give me your suggestion using any dummy website.

OK, so when you said

I will assume that you mean it’s there but you can’t find a reliable selector for it.
So here is how you can tweak the selector:

  • Add the * wildcard to your properties one by one and see if this solves the problem
  • Eliminate properties one by one
  • Check the parent selectors and try them; if that doesn’t work, keep going up until you find one that does

This is all I can say, given that I don’t have the website. Every selector is different, so you have to find the one appropriate for your case.
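
To illustrate the wildcard tip, here is a rough Python sketch (not UiPath code). It uses `fnmatchcase` from the standard library as an approximation of how `*` behaves in a selector attribute; the helper name `matching_links` and the link labels are made up for the example. It shows that a bare `*` matches every pagination link, while a concrete value narrows the match to one. That may be why a selector with only aaname=* kept bouncing between pages, though that part is a guess.

```python
from fnmatch import fnmatchcase
from typing import List


def matching_links(pattern: str, link_names: List[str]) -> List[str]:
    """Return the link labels that a wildcard pattern would match.

    Approximation only: fnmatch also treats '[' specially, which
    selector wildcards do not.
    """
    return [name for name in link_names if fnmatchcase(name, pattern)]


# Hypothetical labels of the pagination links on a page.
links = ["1", "2", "3"]

print(matching_links("*", links))  # every link matches a bare wildcard
print(matching_links("2", links))  # only the link labelled "2" matches
```

So the wildcard is best added to one property at a time, keeping at least one property specific enough to single out the link you want.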

Thanks @reda,
I will try your options to get a stable selector.
What I mean is that my page does not have an arrow or a button that I can click to go to the next page. The page only has a list of page numbers (1 2 3 and so on), with no “>” to move to the next page.
Thanks again for your help.

Can you share a screenshot from UiExplorer pointing to the page 2 link and the page 3 link?
Something like below. Make sure the property explorer is pinned and displays the values.

sure

Awesome! Now you can create a dynamic selector with the page number.
Say your page number is stored in a variable pageNo. The selector would be:
"<html app='chrome.exe' url='*172.16.105.80*' /><webctrl aaname='" + pageNo.ToString + "' parentid='pagination' tag='A' />"

You can use this selector with the Element Exists activity. If it returns True, you can use the same selector to click on that page link. Then continue with your data scraping as you did on page 1! :+1:
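
If it helps to see the string building on its own, here is a small Python sketch of the same concatenation (in the workflow it would be the VB expression above). The function name `build_page_selector` and the `host_pattern` parameter are just names I picked for the example; the attribute values come from the selector shown above.

```python
def build_page_selector(page_no: int, host_pattern: str = "*172.16.105.80*") -> str:
    """Return the selector text for the pagination link labelled page_no."""
    return (
        f"<html app='chrome.exe' url='{host_pattern}' />"
        f"<webctrl aaname='{page_no}' parentid='pagination' tag='A' />"
    )


# The same template gives a different selector for each page number.
for page_no in (2, 3):
    print(build_page_selector(page_no))
```

Only the aaname value changes from page to page, which is what makes the selector dynamic.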

Thanks, @kaderms,
but I can’t understand this line

I meant you can store the page number in a variable, for example pageNo

The number of pages is not fixed; it always changes.
Where will I get the number of pages from?
Can you tell me the steps to design the workflow?

Hi there,

I faced a similar problem a few days ago. I spent a lot of time on it but finally solved it.

Here is how:

  1. I found a UI element whose parentid is based on the table’s id, for example table_id = asdf_1234-t, elements_parentid = asdf_1234-div …
  2. At first I tried to find a selector for the table itself that does not change, but could not find one.
  3. Then I noticed that this element carries the table’s id in its parentid, and that I could reach the element through its constant attributes.
  4. I used the “Find Element” activity with the constant selector of that UI element.
  5. Finally, with “Get Attribute” I got the parentid of the found UI element, changed it with string operations from asdf_1234-div to asdf_1234-t, and defined the selector outside as a string. It worked!
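
The string operation in step 5 can be sketched like this in Python (in the workflow it would be a VB string expression instead). The function names and the shape of the final table selector are assumptions for illustration; only the -div to -t suffix swap comes from my case.

```python
def table_id_from_parent_id(parent_id: str,
                            old_suffix: str = "-div",
                            new_suffix: str = "-t") -> str:
    """Swap the trailing suffix of the parentid to get the table's id."""
    if not parent_id.endswith(old_suffix):
        raise ValueError(f"unexpected parentid format: {parent_id!r}")
    return parent_id[: -len(old_suffix)] + new_suffix


def table_selector(table_id: str) -> str:
    # Hypothetical selector shape; adjust the attributes to your own page.
    return f"<webctrl id='{table_id}' tag='TABLE' />"


parent_id = "asdf_1234-div"  # value read with Get Attribute
print(table_selector(table_id_from_parent_id(parent_id)))
```

Checking the suffix first makes the workflow fail loudly if the page ever changes its id format, instead of silently building a wrong selector.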

You may use this approach. Trust me, I spent a lot of time on it :slight_smile:

Regards,

Oguzhan

Okay, here is your pseudocode, after you reach page 1:

  • Assign pageNo = 1 (int variable)

(Use a Do While loop like below)

Do

  • If pageNo > 1
    Then: Click - Target Selector: "<html app='chrome.exe' url='*172.16.105.80*' /><webctrl aaname='" + pageNo.ToString + "' parentid='pagination' tag='A' />"
    (End If block)

  • Data Scraping : Scrape from Page and store / process the scraped data as per your scenario

  • Assign pageNo = pageNo + 1

  • Element Exists - Target Selector: "<html app='chrome.exe' url='*172.16.105.80*' /><webctrl aaname='" + pageNo.ToString + "' parentid='pagination' tag='A' />"
    Save the result in a Boolean variable nextPageExists

While nextPageExists = True
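
To check the control flow without a browser, here is a minimal Python simulation of the loop above. `FakeSite` and `scrape_all` are made-up stand-ins for the Element Exists, Click and Data Scraping activities, not UiPath APIs, and the page contents are dummy data.

```python
from typing import Dict, List


class FakeSite:
    """Stand-in for the browser: page number -> rows shown on that page."""

    def __init__(self, pages: Dict[int, List[str]]) -> None:
        self.pages = pages
        self.current = 1

    def element_exists(self, page_no: int) -> bool:
        # Assumption: there is a link for every page except the current one.
        return page_no in self.pages and page_no != self.current

    def click(self, page_no: int) -> None:
        if not self.element_exists(page_no):
            raise LookupError(f"selector not found for page {page_no}")
        self.current = page_no

    def scrape(self) -> List[str]:
        return list(self.pages[self.current])


def scrape_all(site: FakeSite) -> List[str]:
    rows: List[str] = []
    page_no = 1
    while True:  # Python has no do-while, so break at the end instead
        if page_no > 1:
            site.click(page_no)          # Click with the dynamic selector
        rows.extend(site.scrape())       # Data Scraping on the current page
        page_no += 1
        next_page_exists = site.element_exists(page_no)
        if not next_page_exists:
            break
    return rows


site = FakeSite({1: ["a", "b"], 2: ["c"], 3: ["d"]})
print(scrape_all(site))
```

Because the existence check runs on the next page number before the next click, the number of pages never has to be known in advance; the loop simply ends when no link with the next number is found.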

Hope this helps.


Thanks @kaderms,
Your suggested code looks similar to my second method, with the addition of an If block. I will try your method; I think it should work.


Yup, it is. I was helping you resolve the “selector not found” issue. :slightly_smiling_face:
