Unstructured Data Scraping

Hello everyone,
This is my workflow for scraping data from LinkedIn.
In the search bar, I search for any job, for example “Web developer in India”, and then click on People.
Then I extract the unstructured data.
I use Data Scraping and indicate the name. From that I also extract the URL, and I indicate the Next button so that multiple pages are covered.
The id of the Next button changes on every page.
When I run the robot, data is not scraped from all the pages. It is scraped from only two pages. I want to scrape data from 10 pages.

Can anyone suggest a solution for this?
Thanks & Regards
Amar

@amarekatpure

Can you check the properties of the Extract Data activity and see whether Number of Items is set to 10 or 0?
Also, check whether the selectors are still valid from the 2nd page onward.

Hope this helps.

Thanks,
Srini

Hi @amarekatpure
You can try this way also

Regards,

The property is set to 0.
After going to the 2nd page the bot stops. It does not go to the 3rd page, so it gets only 2 pages of data.
Here I am posting images of the id on the 2nd and 3rd pages, as shown in the HTML.

This is the id of the Next button: id=“ember8394”

Hi @amarekatpure

You can use a “While” loop to navigate through the pages and scrape the data.

  1. Use the Data Scraping wizard to extract the data from the first page. When you reach the Finish screen of the wizard, select the Extract URL checkbox and indicate the Next button to navigate to the next page.
  2. Drag and drop a While loop activity after the data scraping activity.
  3. In the Condition field of the While loop activity, enter the following condition (counter is an Int32 variable that you create and initialize before the loop, for example to 1):
    counter <= 10
  4. Inside the While loop, again use the Data Scraping wizard to extract the data from the current page. When you reach the Finish screen of the wizard, select the Extract URL checkbox and indicate the Next button to navigate to the next page.
  5. After the data scraping activity, use the Click activity to click on the Next button. To indicate the button, you can use the Find Element activity and select the appropriate selector that contains the changing ID of the “Next” button.
  6. Increment the counter variable by 1 in each iteration of the loop.
  7. Finally, close the browser after the loop.
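
The control flow of the steps above can be sketched in Python as follows. This is only an illustration of the loop logic, not UiPath code. The names `scrape_pages`, `extract_page`, and `click_next` are hypothetical stand-ins for the Data Scraping and Click activities, and the sketch assumes every page (including the first) is scraped inside the loop.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def scrape_pages(
    extract_page: Callable[[int], List[Row]],
    click_next: Callable[[int], bool],
    max_pages: int = 10,
) -> List[Row]:
    """Scrape up to max_pages pages, clicking Next after each one.

    extract_page and click_next are hypothetical stand-ins for the
    Data Scraping and Click activities; click_next returns False when
    the Next button could not be found or clicked.
    """
    rows: List[Row] = []
    counter = 1  # must be initialized before the loop
    while counter <= max_pages:
        rows.extend(extract_page(counter))  # scrape the current page
        if not click_next(counter):  # stop early if Next is missing
            break
        counter += 1  # increment once per iteration
    return rows


# Simulated pages so the sketch can run without a browser (assumption).
def fake_extract(page: int) -> List[Row]:
    return [
        {"name": f"Person {page}-{i}", "url": f"https://example.com/p{page}/{i}"}
        for i in range(3)
    ]


def fake_click_next(page: int) -> bool:
    return page < 12  # pretend the search result has 12 pages


result = scrape_pages(fake_extract, fake_click_next, max_pages=10)
print(len(result))
```

If `click_next` fails on page 2 because the selector no longer matches, the loop ends after two pages, which is the same symptom described in the question.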

Thanks!!

Hi @amarekatpure

Replace the number in the id attribute with a wildcard (*), e.g. id=“ember*”, and it should work.

Hope it works!

Hello @Nitya
It's not working.

Here, using the asterisk (*), it indicates a different element.
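
A possible explanation, sketched in Python: if several elements on the page have ids of the form “ember” followed by a number, the pattern `ember*` matches all of them, and the first match is not necessarily the Next button. Only the id `ember8394` comes from this thread. The other ids, the `text` field, and both helper functions are hypothetical and serve only to illustrate wildcard matching.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Hypothetical page elements; only "ember8394" is taken from the thread.
elements: List[Dict[str, str]] = [
    {"id": "ember8391", "text": "Previous"},
    {"id": "ember8394", "text": "Next"},
    {"id": "ember8402", "text": "Message"},
]


def match_by_id(pattern: str) -> List[Dict[str, str]]:
    """Return every element whose id matches the wildcard pattern."""
    return [e for e in elements if fnmatchcase(e["id"], pattern)]


def match_by_id_and_text(pattern: str, text: str) -> List[Dict[str, str]]:
    """Narrow the wildcard match with a second, stable attribute."""
    return [e for e in match_by_id(pattern) if e["text"] == text]


print(match_by_id("ember*"))                   # all three elements match
print(match_by_id_and_text("ember*", "Next"))  # only the Next button
```

Under that assumption, combining the wildcard id with an attribute that stays the same on every page (here the visible text) narrows the match to a single element.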
