Hey guys! So I've run into a problem. When I extract data from a web page, it automatically picks up unnecessary info, and I don't know how to deselect it.
Is it possible to deselect it somehow and extract only the one block of info that I need?
Hello @Povilas_Jonikas - It looks like you are using Modern Design. Can you try the Data Scraping activity in Classic Design and see how it goes? To work with Classic Design, right-click the project name >> Project Settings >> Disable modern experience.
Hello Usha,
OK, thanks, I'll try it and let you know!
@Povilas_Jonikas See the video below for reference.
Thanks for the video, Usha, but I've already watched it. You see, the website that I am trying to scrape gives me some strange problems. I have an Excel list of over 10k URLs from the same website. I would like to go through all of them one by one and take the data that I need. The first URL works perfectly: the workflow goes there, takes the info, pastes it into Excel, and moves on to the next one. But when it reaches the second page, it reports that it cannot find the UI element corresponding to this selector, even though when I inspect the elements they are the same and only the text in them changes. I would really appreciate your knowledge on this one, because I have been stuck on this problem for over a week now.
This is the first URL, with the element that contains the info I need highlighted:
Hi @Povilas_Jonikas ,
Could you provide us with a few of the URLs that you are trying to loop through in the Excel file?
Since it is a public website, we can analyse it from our side and provide you with a workflow if possible.
Hi Apran,
Yes sure.
This is the first URL, which my workflow handles smoothly (mainly because I set up the data scraping steps on this URL):
This is the second URL:
The highlighted area is the info that I need to scrape from all the URLs:
Do you want to extract only the highlighted info?
If so, you do not need Table Extraction for this. You could use the Get Text activity and get that particular info as the workflow loops through the different URLs.
Selector used for the Get Text activity:
"<webctrl parentid='product-*' tag='DIV' class='product_meta' />"
Since you are using Modern Design, you would need to use the Use Application/Browser activity, indicate the page, and change the selector to the one below:
"<html app='chrome.exe' url='https://www.remparta.lt/lt/product/*' />"
Check the workflow below:
Remparta_ExtractInfo.zip (12.4 KB)
Let us know if this is not the required output.
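For anyone wondering why the wildcard selectors keep working from page to page: the `*` in `product-*` and in the URL stands in for whatever part changes, so the selector no longer depends on the first page's exact values. Below is a rough sketch of that idea in Python, not UiPath code. It uses `fnmatchcase` from the standard library only as an approximation of selector wildcard matching, and the sample URLs and parent ids are hypothetical values made up for illustration.

```python
from fnmatch import fnmatchcase

# Patterns copied from the two selectors above.
# '*' stands for any run of characters, including none.
URL_PATTERN = "https://www.remparta.lt/lt/product/*"
PARENT_ID_PATTERN = "product-*"


def matches(value: str, pattern: str) -> bool:
    """Return True if value fits the wildcard pattern (case-sensitive)."""
    return fnmatchcase(value, pattern)


# Hypothetical values, invented for this sketch; not taken from the real site.
urls = [
    "https://www.remparta.lt/lt/product/example-item-one/",
    "https://www.remparta.lt/lt/product/example-item-two/",
    "https://www.remparta.lt/lt/kontaktai/",
]
parent_ids = ["product-101", "product-2057", "post-33"]

url_hits = [matches(u, URL_PATTERN) for u in urls]
id_hits = [matches(p, PARENT_ID_PATTERN) for p in parent_ids]

print(url_hits)  # [True, True, False]
print(id_hits)   # [True, True, False]
```

A selector recorded on the first page without wildcards would behave like an exact string comparison, which is one likely reason the second URL failed with "cannot find the UI element".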
Thanks a lot! It works perfectly!