Table extraction patterns removal

Hey guys! I’ve run into a problem. When I extract data from a web page, the extraction automatically picks up unnecessary info, and I don’t know how to deselect it.

Is it possible to deselect it somehow and extract only the one block of info that I need?

Hello @Povilas_Jonikas - It looks like you are using Modern Design. Can you try the Data Scraping activity in classic design and see how it goes? To work with classic design, right-click the project name >> Project Settings >> disable Modern Experience.

Hello Usha,
Ok, thanks, I’ll try it and let you know!

@Povilas_Jonikas See the video below for reference.

Thanks for the video, Usha, but I’ve already watched it. The website I am trying to scrape is giving me some strange problems. I have an Excel list of over 10k URLs, all from the same website. I would like to go through them one by one and extract the data that I need. The first URL works perfectly: the robot opens the page, takes the info, pastes it into Excel, and moves on to the next URL. But on the second page, UiPath reports that it cannot find the UI element corresponding to this selector, even though the elements look the same when I inspect them; only the text in them changes. I would really appreciate your knowledge on this one, because I have been stuck on this problem for over a week now.
This is the first URL, with the element that contains the info I need highlighted:

This is the second URL, with the element that contains the info I need highlighted:

And this is what my data scraper looks like inside:

I’m looking forward to hearing from you!

Hi @Povilas_Jonikas ,

Could you provide us with a few of the URLs that you are trying to loop through in the Excel file?

Since it is a public website, we can analyse it from our side and provide you with a workflow if possible.

Hi Apran,

Yes sure.

This is the first URL, which my robot handles smoothly (mainly because I set up the data scraping steps on this URL):

This is the second URL:

The highlighted area is the info that I need to scrape from all URLs:

@Povilas_Jonikas ,

Do you want to extract only the highlighted info?

If so, you do not need Table Extraction for this. You could use the Get Text activity to get that particular info as the workflow loops through the different URLs.

Selector used for the Get Text activity:

"<webctrl parentid='product-*' tag='DIV' class='product_meta' />"

Since you are using Modern Design, you would need to use the Use Application/Browser activity, indicate the page, and change its selector to the one below:

"<html app='chrome.exe' url='*' />"

Check the workflow below : (12.4 KB)

Let us know if this is not the required output.
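
For anyone wondering why the Get Text selector above keeps working on every page: the `*` in `parentid='product-*'` is a wildcard, so the selector no longer depends on the exact product ID, which presumably differs from one URL to the next. Here is a minimal Python sketch of that matching idea. It is not UiPath code: it uses Python's `fnmatch` module as a rough stand-in for UiPath's wildcard matching, and the IDs in it are made up for illustration.

```python
from fnmatch import fnmatchcase


def attribute_matches(pattern: str, value: str) -> bool:
    """Return True if an attribute value fits a wildcard pattern.

    '*' stands for zero or more characters and '?' for exactly one.
    This only approximates how a selector attribute is compared.
    """
    return fnmatchcase(value, pattern)


# Hypothetical parent IDs as they might appear on different product pages.
page_ids = ["product-1012", "product-2047", "post-33"]

# A selector recorded on the first page with a fixed ID behaves like an
# exact comparison, so it only ever matches that one page.
recorded_id = "product-1012"
exact_hits = [pid for pid in page_ids if pid == recorded_id]

# The wildcard version matches every product page, but not unrelated IDs.
wildcard_hits = [pid for pid in page_ids if attribute_matches("product-*", pid)]

print("exact:", exact_hits)
print("wildcard:", wildcard_hits)
```

If the recorded selector contains a value that changes per page, replacing the changing part with `*` is usually the first thing to try.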

Thanks a lot! It works perfectly!
