Extract Content from Various News Sites Using the Screen Scraping Function

uiautomation
web

#1

Hello all,

I am trying to extract content from news articles on different sites with the screen scraping function, and then check whether the extracted text contains certain words such as “UI Path”.
However, I encountered the following problems:

  1. The Get Text function runs before the website has finished loading. Because the tab icon differs from one news website to another, I tried using the image vanish function on the loading icon that appears in the browser tab while the page loads, but it doesn’t work because the icon keeps changing during loading.

  2. Because the website changes from time to time, my selector is unstable, and the Get Text function returns the wrong text most of the time.
    e.g.
    https://www.scmp.com/news/world/united-states-canada/article/2155617/ex-employee-zhang-xiaolang-denies-stealing-apples
    https://www.cnbc.com/2018/07/12/stormy-daniels-arrested-in-columbus-ohio-while-performing-avenatti.html?recirc=taboolainternal

The selectors I am using for the two news links above are as follows:

Attach Browser selector:

Get full text selector:

I hope you can suggest some ideas to help me move this forward.
Thanks!!


#2

Hi @annalyy

Websites are always tricky to automate. Personally, I have had the most luck with a simple Delay activity that pauses the process for a few seconds while the website loads.

You could also try @bogdanripa's solution from this post, but you will need to wait for him to update his selector 🙂

It turns out the info is available in the documentation of the Get Attribute activity.

Therefore, you can try a Get Attribute activity with the selector "<webctrl>" and readystate as the attribute name 🙂
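
To illustrate the idea of polling readystate instead of relying on a fixed Delay, here is a minimal Python sketch of the wait loop. It is not UiPath code: `wait_until_ready` and `get_ready_state` are hypothetical names, and `get_ready_state` stands in for whatever reads the attribute (for example, the Get Attribute activity). The default `"complete"` is the value a browser's `document.readyState` reports for a fully loaded page; the value the activity returns may differ, so it is left as a parameter.

```python
import time
from typing import Callable


def wait_until_ready(
    get_ready_state: Callable[[], str],  # hypothetical reader for the readystate attribute
    ready_value: str = "complete",       # assumption: adjust to what your reader returns
    timeout_s: float = 30.0,
    poll_interval_s: float = 0.5,
) -> bool:
    """Poll get_ready_state() until it returns ready_value or the timeout expires.

    Returns True if the page reported ready in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # Check the state first so an already-loaded page returns immediately.
        if get_ready_state() == ready_value:
            return True
        # Give up once the overall timeout has passed.
        if time.monotonic() >= deadline:
            return False
        # Short pause between checks instead of one long fixed delay.
        time.sleep(poll_interval_s)
```

In a workflow, the same pattern would presumably be a loop around Get Attribute with a short delay between checks and an overall timeout, with Get Text running only after the loop reports the page as loaded.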


#3

Thanks!!