How to scrape data from a webpage (html) that is locally saved on my computer?

Hi,
I have saved a webpage from Chrome, let's say “ABCD.html”, in a local folder.
I want to open it and scrape data from it (there is a table that I want).
I have tried web scraping and data scraping, but nothing seems to work.

Another problem is that when I open that page, it always opens in Edge. Could that be the cause of the issue?
Can you please help?

Hi @aishwarya.gupta

First, check whether the UiPath extension has been added to your Edge browser.

If not, add it first, because web activities won't work until the extension is installed.

To add it, run the following commands if you are using Community Edition:

cd C:\Users\USERNAME\AppData\Local\UiPath\app-20.4.2\UiPath
SetupExtensions.exe /Edge

To add it, run the following commands if you are using Enterprise Edition:

cd C:\Program Files (x86)\UiPath\Studio\UiPath
SetupExtensions.exe /Edge

Then try your steps again.

If the data you want to scrape is in a structured/patterned or tabular format, then you must use Data Scraping, and it should work with Edge too as long as the extension is present.
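As a cross-check outside UiPath, it can help to confirm that the saved HTML file really contains the table as static markup. Below is a minimal Python sketch using only the standard library. The class name `TableExtractor`, the helper `extract_rows`, and the sample rows are made up for illustration; they are not part of UiPath or of the attached page, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None when outside a <tr>
        self._cell = None  # text fragments of the current cell, or None

    def _close_cell(self):
        # Store the current cell (whitespace collapsed) in the current row.
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _close_row(self):
        # Finish any open cell, then keep the row if it has cells.
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()


def extract_rows(html_text):
    """Return all table rows found in html_text as lists of strings."""
    parser = TableExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Made-up stand-in for the saved page. For a real file, read it first, e.g.
# html_text = open("ABCD.html", encoding="utf-8").read()
sample = """
<table>
  <tr><th>Currency</th><th>Rate</th></tr>
  <tr><td>USD</td><td>1.25</td></tr>
  <tr><td>EUR</td><td>1.10</td></tr>
</table>
"""
rows = extract_rows(sample)
for row in rows:
    print(row)
```

If this finds no rows in the saved file, the table was probably not saved as plain HTML, and no browser-based scraper would find it either.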

And if you don't want the webpage to open in Edge, change the default browser from Edge to Chrome and add the extension to Chrome as well.

Mark as solution and like it if this helps you :slight_smile:

Happy Automation :raised_hands:

Best Regards
Er Pratik Wavhal :robot::man_technologist:t4: :computer:

Hi, thank you for your quick reply, but it is still not working.
Even if I open that webpage in Chrome (with the help of ‘Open with’), I am unable to scrape it.
However, data scraping works fine when scraping a normal page in the browser.

Hi @aishwarya.gupta

Can you provide a screenshot of the data present on that webpage, or the link to the webpage you are saving?

That would make it easier to look into.

Happy Automation :raised_hands:

Best Regards
Er Pratik Wavhal :robot::man_technologist:t4: :computer:

XE Currency Table_ GBP - British Pound.html (80.7 KB)

Hi @aishwarya.gupta

I am able to scrape the data.

Output:

Here is the XAML file:
MainPratik.xaml (9.3 KB)

Check whether your extensions for Edge and Chrome are enabled.

Mark as solution and like it if this helps you :slight_smile:

Happy Automation :raised_hands:

Best Regards
Er Pratik Wavhal :robot::man_technologist:t4: :computer:

Hi,
Which scraping did you use? Screen scraping?
Your XAML file is opening like this

And the Chrome extension is enabled, I am sure. I have now set up the Edge extension as you guided me.

Hi @aishwarya.gupta

It's just Data Scraping.

And as of now I am using the UiPath version shown in my screenshot.

Happy Automation :raised_hands:

Best Regards
Er Pratik Wavhal :robot::man_technologist:t4: :computer:

Mine is Studio 2020.4.3
Enterprise License
Windows Installer

Also, Edge extension is enabled now, confirmed.

And if you used Data scraping, how come your output is showing like this? Shouldn’t it be in tabular format?

Hi @aishwarya.gupta

It should also work in your Enterprise version.

After scraping the data from the webpage, I used the Output Data Table activity to show you the output in a message box.

The Output Data Table activity converts the data into string format, which is why it looks the way I showed previously in the message box.
I used that activity just to confirm that the data was extracted properly.
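To illustrate why tabular data stops looking like a table once it is turned into a string, here is a small Python analogy. It is only a sketch of the idea, not what the UiPath activity does internally, and the helper name `table_to_text` and the sample values are made up.

```python
import csv
import io


def table_to_text(headers, rows):
    """Flatten a table into one comma-separated string, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample values, only to show the shape of the output.
text = table_to_text(["Currency", "Rate"], [["USD", "1.25"], ["EUR", "1.10"]])
print(text)
```

The data is all still there, but it is shown as plain lines of text rather than as a grid.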

Happy Automation :raised_hands:

Best Regards
Er Pratik Wavhal :robot::man_technologist:t4: :computer:

Unfortunately this is not working on my system. I don't know why.
I know about the activities you mentioned; I completed the Advanced Developer certification in 2018 :wink:

It is just that scraping is not working today on a saved webpage; normal scraping on a browser page is working fine.

Anyway, thanks for your help.

One more thing: when you downloaded and opened the webpage I sent you, in which browser did it open? And what was the URL you were getting at the top?

When I open it, it says something like this:
“file:///C:/Users/aigupta/Documents/UiPath/POC5_CurrencyExchangeRates/2020-07-14/XE%20Currency%20Table_%20GBP%20-%20British%20Pound.html”

Hi @aishwarya.gupta

After downloading the webpage, I just clicked on it and it opened in Chrome by default on my system, because I have made Chrome my default browser.

Below is the URL I am getting:
file:///C:/Users/prani/Desktop/Practise%20Solutions%20for%20Forum/XE%20Currency%20Table_%20GBP%20-%20British%20Pound.html
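For comparing the two addresses in this thread, the following Python sketch decodes a `file:///` URL back into a Windows path (the `%20` sequences are just encoded spaces). The helper name `local_path_from_file_url` is made up for illustration, and the sketch assumes a local drive-letter URL like the ones above.

```python
from pathlib import PureWindowsPath
from urllib.parse import unquote, urlsplit


def local_path_from_file_url(url):
    """Decode a Windows file:/// URL into the path it points to."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError(f"not a file URL: {url}")
    # parts.path looks like "/C:/Users/..."; drop the leading slash and
    # turn percent-escapes such as %20 back into literal characters.
    return PureWindowsPath(unquote(parts.path.lstrip("/")))


url_a = ("file:///C:/Users/aigupta/Documents/UiPath/POC5_CurrencyExchangeRates/"
         "2020-07-14/XE%20Currency%20Table_%20GBP%20-%20British%20Pound.html")
url_b = ("file:///C:/Users/prani/Desktop/Practise%20Solutions%20for%20Forum/"
         "XE%20Currency%20Table_%20GBP%20-%20British%20Pound.html")

path_a = local_path_from_file_url(url_a)
path_b = local_path_from_file_url(url_b)
print(path_a.name)
print(path_a.name == path_b.name)
```

Both URLs decode to the same file name in different folders, so the address format itself does not look like the difference between the two machines.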

Happy Automation :raised_hands:

Best Regards
Er Pratik Wavhal :robot::man_technologist:t4: :computer:

Hi @aishwarya.gupta

I have now tried scraping the same data from the webpage you provided using the Edge browser, but it does not scrape the data.
It is taking the whole screen to scrape/spy, so on my system too Data Scraping does not work for that webpage in the Edge browser.

Mark as solution and like it if this helps you :slight_smile:

Happy Automation :raised_hands:

Best Regards
Er Pratik Wavhal :robot::man_technologist:t4: :computer:

Yes, it is taking the whole screen to scrape/spy for me as well, but in both Chrome and Edge.
Anyway, thanks a lot for your help. I will see what I can do about it.

Hi @aishwarya.gupta

Have a look at the Chrome extension and make sure all of the settings shown in my screenshot are enabled.

One more thing for the Chrome browser, from my experience:

  1. Clear the cache and all other browsing data.
  2. Close the browser and reopen it.

Then try your procedure again, i.e. scraping the data.

I faced this same issue once, did the same, and after that it started working fine for me.

Mark as solution and like it if this helps you :slight_smile:

Happy Automation :raised_hands:

Best Regards
Er Pratik Wavhal :robot::man_technologist:t4: :computer:

Yeah, I am now going to clear everything and restart the system itself.
And yes, I checked all these settings; everything is how it should be. Let's hope a system reboot will work.

Thank you.

Hi @aishwarya.gupta

Yup. Let me know if it worked for you too.

Happy Automation :raised_hands:

Best Regards
Er Pratik Wavhal :robot::man_technologist:t4: :computer:

It worked last night when I restarted the machine. :slight_smile:
