I want just basic table extraction. The web page has a Next Page button, but I need to scroll for every page. Any idea how to do it?
Hi,
If the Next button can be clicked with the Simulate or ChromiumAPI input method, it works even when the target is outside the visible screen (as long as it is loaded). Can you try this?
Regards,
Hello @Ali_Sir_Aydemir - Is the webpage accessible to the public? Wanted to check the nature of the table as some tables like https://datatables.net/ may not have all the data in the DOM (i.e. rows not visible get removed from the DOM as you scroll past them). Wanted to make sure this is not the issue you’re experiencing.
Thanks!
https://www.medifind.com/conditions/chronic-fatigue-syndrome/1135/doctors? You can check the website. Thank you
Thank you, but it didn't work. What I thought would be a great idea is a Do While (scroll until the Next Page element is visible, then click Next), but I had trouble setting up the workflow.
You don’t have to scroll for Table Extraction to work.
Please share the selector you configured for the Next button. Use the </>
button in the editor to share it. Thanks
<webctrl aaname='Next' parentid='mf-root' tag='SPAN' type='' class='Button_label__4FHaL Button_normal__4rQXo' />
test with:
<webctrl aaname='Next' tag='SPAN' />
It works, but only for the 2 visible elements; there are more than 2, so it doesn't quite work for me.
It works fine for the first page, but it didn't click Next once I was on the second page (with Simulate).
Tell us which search criteria you tested with, so we can try to replicate it.
Do you mean which data I am trying to scrape from that website? You can just try for Doctor Name.
We asked so we could test the same scenario you are testing; with any other criteria we would be looking at a different result set.
However, the Next button does work. The problem is that Extract Data is slower than the page and doesn't grab a complete page, so we end up with fewer rows.
You can check whether the out-of-the-box settings let you sync better.
Otherwise, you can remodel your approach and take more care of synchronization and extraction. Paging can also be done with a URL trick, since only the page number has to be increased until you reach the last page, as sketched below.
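Purely as an illustration of that trick, here is a minimal Python sketch. The page query parameter and the last page number are assumptions; check the real URL in the address bar after clicking Next once.

# Build the paged URLs directly instead of clicking Next.
# The 'page' parameter name and last_page value are assumptions for this sketch.
base_url = 'https://www.medifind.com/conditions/chronic-fatigue-syndrome/1135/doctors'
last_page = 10  # assumed; stop earlier if a page returns no rows

paged_urls = [f'{base_url}?page={n}' for n in range(1, last_page + 1)]

for url in paged_urls:
    print(url)  # each URL can then be opened and extracted one page at a time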
It got all 20 from the first page with no issues for me. I didn’t manually change anything. Just a few clicks in the Table Extraction wizard.
There does seem to be an issue where it doesn’t get the data from page 2, however, and I think that’s because the page doesn’t reload when you click Next - it just refreshes the list portion.
I will create a CSV file and open each link and try to scrape it. Thank you so much.
What I did when I had the same scenario was a couple of simple solutions:
Send the keyboard shortcut End inside the Use Browser activity, which scrolls to the bottom of the page and loads all the necessary data.
For some other cases I also found it useful to use the Inject JS Script activity and check whether the page is loaded with a function that returns 1 or 0, looping in a While until it reports loaded before starting to scrape (a sketch follows below).
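A minimal sketch of such a check, assuming the usual function shape the Inject JS Script activity expects (a function taking the target element and an input string and returning a string); the row selector here is my own assumption and has to be adapted to the actual page markup:

// Returns "1" once the DOM is ready and the result list has rendered, "0" otherwise.
function (element, input) {
    var domReady = document.readyState === "complete";
    // Assumed selector for rendered doctor rows; adjust to the real page.
    var rows = document.querySelectorAll("a[href*='/doctors/']");
    return (domReady && rows.length > 0) ? "1" : "0";
}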
I hope this helps
The issue is that the page doesn’t reload when Next is clicked, which I think is preventing the extract activity from realizing a new page of data is available. So you’ll have to do something like this where you just have Extract do one page, merge it into a final datatable, then check if next exists and click it…
So, I created a script to generate the links, then opened every page, scrolled down to the end of the page, waited 3 seconds, and extracted the data. The column names were repeated for every page and there were NBSP characters, so to deal with that:
import pandas as pd
# Read the CSV file into a DataFrame
df = pd.read_csv('doctor_info.csv')
# Remove unwanted characters from the 'Adress_Line_1' and 'Adress_Line_2' columns
df['Adress_Line_1'] = df['Adress_Line_1'].str.replace('\xa0', '')
df['Adress_Line_2'] = df['Adress_Line_2'].str.replace('\xa0', '')
# Drop rows that merely repeat the header values (one header row was scraped per page)
values_to_remove = ['Name', 'Url', 'Adress_Line_1', 'Adress_Line_2']
df = df[~df.isin(values_to_remove)].dropna()
# Save the modified DataFrame to a new CSV file
df.to_csv('filtered_doctor_info.csv', index=False)
edit: thank you!
Totally agree with your solution, but what works for me is to save all the HTML pages and deal with them in Python :D. But for this scenario, I did something similar. Thank you!
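In case it helps anyone, a minimal sketch of the "deal with the saved pages in Python" idea. It assumes the saved pages contain an HTML <table> and are named page_*.html, both of which are assumptions; a card-style layout would need an HTML parser such as BeautifulSoup instead.

import glob
import pandas as pd

# Stack the tables extracted from every saved page into one DataFrame.
# pandas.read_html only works when the saved pages contain <table> elements.
frames = []
for path in sorted(glob.glob('page_*.html')):
    frames.extend(pd.read_html(path))

all_doctors = pd.concat(frames, ignore_index=True)
all_doctors.to_csv('doctor_info.csv', index=False)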