Web Table extraction

Hi Guys,

Until very recently I was able to scrape web data tables from LinkedIn, a very popular social media site.

I could search for a keyword and then extract the names of all the people in the results, both the text and the URL, across all pages: thousands of names with their profile links.

Today I tried again and could not.

The metadata of page 1 of the results no longer matches the metadata of page 2, as it did in the past.

So now, even if you go to page 2, the Extract Data Table activity cannot extract the list of names from it. The table comes back empty, even with the new 23.10 version, which came with some upgrades to fuzzy selectors etc.

Any ideas how we can get this data again?

P.S. I realized this also happens with Google results. Google now has a never-ending scroll (no pages anymore), which seems hard to scrape (though I did not try).

I have now tried the same extraction in the Edge browser, but it just stops after page 2 for no apparent reason. At least page 2 also has data, which is good news.

Why does it stop after page 2?

@raool90

First try to scroll down and then extract.

cheers


Hi @raool90

Please try the steps below and check if it works.

  1. Extract the data table without spanning multiple pages, i.e. for one page only.
  2. Use Element Exists to check whether the next-page icon is available.
  3. If yes, click it and extract the data table.
  4. Write it to Excel.
  5. Repeat the same steps in a loop until the last page.

With this method you can get the data from all the pages. However, you have to make sure to store the extracted data from each page, for example by writing it to an Excel file.
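For anyone who prefers to see the loop written out, here is a rough Python sketch of the logic in the steps above. It is not UiPath code: `extract_all_pages`, `FakeSite` and the callback names are all hypothetical, and the "site" is just an in-memory list of pages so the loop can be run without a browser. In a real workflow the three callbacks would correspond to Extract Data Table (single page), Element Exists on the next-page icon, and Click.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]


def extract_all_pages(
    extract_page: Callable[[], List[Row]],
    has_next_page: Callable[[], bool],
    go_to_next_page: Callable[[], None],
    save_rows: Optional[Callable[[List[Row]], None]] = None,
    max_pages: int = 1000,
) -> List[Row]:
    """Extract one page at a time and follow the next-page control until it is gone."""
    all_rows: List[Row] = []
    for _ in range(max_pages):  # safety cap so a broken "next" check cannot loop forever
        rows = extract_page()  # step 1: extract the current page only
        if save_rows is not None:
            save_rows(rows)  # step 4: store this page's rows right away
        all_rows.extend(rows)
        if not has_next_page():  # step 2: is the next-page icon available?
            break
        go_to_next_page()  # step 3: click it, then loop back to extract
    return all_rows


class FakeSite:
    """Hypothetical stand-in for a paginated result list (assumption, for illustration)."""

    def __init__(self, pages: List[List[Row]]) -> None:
        self.pages = pages
        self.index = 0

    def extract_page(self) -> List[Row]:
        return list(self.pages[self.index])

    def has_next_page(self) -> bool:
        return self.index < len(self.pages) - 1

    def go_to_next_page(self) -> None:
        self.index += 1


if __name__ == "__main__":
    site = FakeSite(
        [
            [{"name": "Person A", "url": "https://example.com/a"}],
            [{"name": "Person B", "url": "https://example.com/b"}],
        ]
    )
    rows = extract_all_pages(site.extract_page, site.has_next_page, site.go_to_next_page)
    print(len(rows), "rows extracted")
```

Saving inside the loop means that if the run stops early (as it did after page 2 on Edge), the pages already extracted are not lost.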

Hope this helps! Cheers!


Exactly mate, I also found out just now before reading your message. Keep it up Anil! Happy Automation!

