Until very recently I was able to scrape data tables from LinkedIn, a very popular social media site.
I could search for a keyword and then extract the names of all the people in the results, both the text and the URL, from every page: thousands of names with their profile links.
Today I tried again and could not.
The metadata of page 1 of the results no longer matches the metadata of page 2, as it did in the past.
As a result, even if you go to page 2, the Extract Data Table activity cannot extract the list of names from it and the table comes back empty. This happens even with the new 23.10 version, which came with some upgrades to fuzzy selectors etc.
Any ideas on how we can get this data again?
P.S. I realized this also happens with Google results. Google now has a never-ending scroll (no pages anymore), which seems hard to scrape (although I have not tried).
I have now tried the same extraction in the Edge browser; however, it just stops after page 2 for no apparent reason. But at least page 2 also has data, which is good news.
Please try the steps below and check whether they work.
Use Extract Data Table without spanning multiple pages, so that it extracts only the current page.
Check with Element Exists whether the next-page icon is available.
If it is, click it and extract the data table again.
Write the result to Excel.
Repeat the same steps in a loop until there is no next page.
With this method you can get the data from all the pages. However, you must make sure to store the data extracted from each page, for example by writing it to Excel.
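The loop described in the steps can be sketched in Python to show the control flow. This is only an illustration, not UiPath code: `scrape_all_pages`, `FakeResults`, and `demo` are hypothetical names, the fake class stands in for the browser activities (extract one page, check for the next-page icon, click it), and a CSV buffer stands in for the Excel file.

```python
import csv
import io
from typing import Callable, List, Tuple

Row = Tuple[str, str]  # (name, profile URL)


def scrape_all_pages(
    extract_page: Callable[[], List[Row]],
    next_page_exists: Callable[[], bool],
    click_next: Callable[[], None],
    save_rows: Callable[[List[Row]], None],
    max_pages: int = 1000,
) -> int:
    """Extract one page at a time and save after every page.

    Returns the number of pages processed. max_pages is a safety
    limit so the loop cannot run forever if the check misbehaves.
    """
    pages = 0
    while pages < max_pages:
        save_rows(extract_page())   # current page only, stored immediately
        pages += 1
        if not next_page_exists():  # no next-page icon: last page reached
            break
        click_next()
    return pages


class FakeResults:
    """In-memory stand-in for the search results in the browser (assumption)."""

    def __init__(self, pages: List[List[Row]]) -> None:
        self.pages = pages
        self.index = 0

    def extract_page(self) -> List[Row]:
        return list(self.pages[self.index])

    def next_page_exists(self) -> bool:
        return self.index + 1 < len(self.pages)

    def click_next(self) -> None:
        self.index += 1


def demo() -> Tuple[int, str]:
    site = FakeResults([
        [("Ada", "https://example.com/in/ada"), ("Bob", "https://example.com/in/bob")],
        [("Cy", "https://example.com/in/cy")],
        [("Di", "https://example.com/in/di")],
    ])
    buffer = io.StringIO()          # stands in for the Excel file
    writer = csv.writer(buffer)
    writer.writerow(["name", "url"])
    pages = scrape_all_pages(
        site.extract_page, site.next_page_exists, site.click_next, writer.writerows
    )
    return pages, buffer.getvalue()


if __name__ == "__main__":
    page_count, text = demo()
    print(page_count)
    print(text)
```

Saving inside the loop, rather than once at the end, means the pages already extracted are kept even if the run stops partway through, as it did after page 2 in Edge.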