I am trying to get a products table from a marketplace site with web scraping. I selected the table and the next page button successfully.
When I ran the robot, it was only able to move to the next page. Then I checked the “NextLinkSelector” property.
The page parameter in the href attribute increases for each next page. If I replace pg=2 with pg=*, the robot always goes to page 2 and page 3.
How can I increase this parameter for every page?
Yes… So you need to pass a parameter instead of the “*” in the selector, right? You can actually do that using a variable.
First create a variable of type Int32 and assign it 1, as the page numbers start from 1. Then, in your selector, add the counter variable at that location, like this:
"href='https://www.n11.com/arma/q=pil&pg=" + Counter.ToString + "'"
However, if you are using the data scraping wizard, just make sure you scrape one page at a time. So in the first iteration, scrape the first page's data. Then increment the counter variable. Then scrape the next page likewise…
Get the idea?
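To make the string concatenation above concrete, here is a minimal sketch in Python rather than a UiPath expression. The helper name build_next_selector and the three-page loop are made up for illustration; the URL is the one from the post, copied as-is.

```python
# URL prefix copied from the post above; only the page number is appended.
BASE_HREF = "https://www.n11.com/arma/q=pil&pg="


def build_next_selector(counter: int) -> str:
    """Hypothetical helper: mirrors "href='...pg=" + Counter.ToString + "'"."""
    return "href='" + BASE_HREF + str(counter) + "'"


# Page numbers start at 1, so the counter starts at 1 as well.
counter = 1
selectors = []
while counter <= 3:  # three pages, chosen only for the demo
    selectors.append(build_next_selector(counter))
    counter += 1  # increment once per scraped page

for selector in selectors:
    print(selector)
```

Note that the single quote is opened once before the URL and closed once after the page number, so the number ends up inside the attribute value.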
Is there a button like “>” on the website for the next page, instead of a page number?
Hi @Pablito,
This is the selector of the “>” button.
In that case, you could make a loop in which the selector contains a variable whose value increases by 1 on each iteration.
Hi @Pablito and @Lahiru.Fernando
Thank you for your responses. I had a similar idea. But the main problem is: when should I do this increment? My sequence is below.
Please correct me if I am mistaken, but I think the scraping is done automatically in the “Extract Structured Data” block. So I can't put an “Assign” block for counter = counter + 1 inside the “Extract Structured Data” block.
Instead of that, should I build the URL dynamically, load it, and scrape that URL for every page? In that scenario, I would first need to find out the site's page count to avoid a 404 Not Found error.
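The dynamic URL idea could look roughly like the following Python sketch. It assumes the page count is already known (here a made-up value of 4); how to read it from the site is not covered, and page_urls is a hypothetical helper name.

```python
from typing import List

# URL prefix copied from the thread; the page number is appended to it.
BASE_URL = "https://www.n11.com/arma/q=pil&pg="


def page_urls(base_url: str, page_count: int) -> List[str]:
    """Hypothetical helper: one URL per page, from page 1 up to page_count."""
    return [base_url + str(page) for page in range(1, page_count + 1)]


# Assumption: the site reports 4 pages. Stopping at the known page count
# means no URL beyond the last page is ever requested.
urls = page_urls(BASE_URL, 4)
for url in urls:
    print(url)
```

Each generated URL would then be opened and scraped on its own, one page per iteration.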
You can put Extract Data and Write CSV inside a While loop. Declare the variable with an Assign activity, like “Variable = 1”. Then, inside the loop, after all the other activities, add another Assign activity with “Variable = Variable + 1”.
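The loop structure described above can be sketched in Python as follows. This is only an outline of the control flow, not UiPath code: scrape_page stands in for the Extract Data step, the fake_site dictionary replaces the real website, and stopping on an empty page is an assumption about how the end of the results is detected.

```python
def scrape_all(scrape_page, max_pages=100):
    """Collect rows page by page until a page comes back empty."""
    rows = []
    page = 1  # "Variable = 1" before the loop
    while page <= max_pages:  # safety cap so the loop always ends
        page_rows = scrape_page(page)  # stands in for Extract Data
        if not page_rows:
            break  # assumed stop condition: no rows on this page
        rows.extend(page_rows)  # stands in for Write CSV (append)
        page += 1  # "Variable = Variable + 1" as the last step in the loop
    return rows


# Fake data so the sketch runs without a website.
fake_site = {1: ["battery A", "battery B"], 2: ["battery C"]}
all_rows = scrape_all(lambda page: fake_site.get(page, []))
print(all_rows)
```

The increment sits outside the extraction step but inside the loop, which is why it does not need to go inside “Extract Structured Data”.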