Reading an HTML file and extracting information

RaVillalobos · May 1, 2022, 1:38am

Hi community,

I have an issue with a project. I need to extract information from a Coupa file that comes in HTML format; the machine opens it in the browser. I am not sure whether there is an activity that lets me extract the text the way I can from a PDF.
What I did was convert the HTML files to PDFs by pressing CTRL+P and saving the file. However, sometimes the selectors or hotkeys do not work properly and the file is not saved.
Is there a tool, activity, or method I can use that avoids relying on selectors for this?

I appreciate any help or advice.

Nithinkrishna · May 1, 2022, 8:21am

Hey @RaVillalobos

This can be done programmatically. Also make sure you have defined the rules for which data to extract.

Install the HtmlAgilityPack .NET library, which will help you parse the HTML and get the value of the appropriate element, much as we do with selectors.
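
To illustrate the "parse the HTML, then pick out values by rule" idea, here is a minimal sketch in Python using only the standard library's `html.parser`. It is not HtmlAgilityPack and not code from this thread. The sample invoice markup, the field labels, and the names `TextExtractor`, `read_html_text`, and `field_after` are all hypothetical; a real Coupa file will have a different structure, so the extraction rule would need adjusting.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text chunks, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside <script> or <style>
        self.chunks = []       # non-empty text pieces in document order

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def read_html_text(html):
    """Return the visible text of an HTML string as a list of chunks."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def field_after(chunks, label):
    """Hypothetical rule: the value is the chunk right after its label."""
    for i, chunk in enumerate(chunks[:-1]):
        if chunk == label:
            return chunks[i + 1]
    return None


# Made-up sample markup standing in for an invoice file (an assumption).
SAMPLE_INVOICE = """
<html><head><style>td { color: red; }</style></head>
<body>
<table>
<tr><td>Invoice Number</td><td>INV-001</td></tr>
<tr><td>Total</td><td>150.00</td></tr>
</table>
</body></html>
"""

chunks = read_html_text(SAMPLE_INVOICE)
invoice_number = field_after(chunks, "Invoice Number")
total = field_after(chunks, "Total")
```

Because this reads the file directly, it needs no browser, hotkeys, or selectors. The same label-then-value rule could be expressed against element paths in an HTML parsing library instead of a flat list of text chunks.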

Thanks
#nK

Rahul_Unnikrishnan · May 1, 2022, 6:30pm

Hello @RaVillalobos ,

What data are you scraping from the web page? Is it table data or field data?
Are you facing any challenges with the web page automation?

I ask because sending hotkeys and converting to PDF is not reliable and can fail. If you can proceed with web page automation, that would be better.

Erick_Chavarria1 · May 2, 2022, 3:24pm

Open the HTML file in the Chrome browser, then use the Print option and save it as a PDF.

Randy_Villalobos · May 2, 2022, 6:45pm

Thanks for replying,

Is this the one?

I am trying to read the text from an invoice in HTML format, just like with a Read PDF activity.

Randy_Villalobos · May 2, 2022, 6:46pm

Thanks sir,

I am trying to extract the details from an invoice in HTML format and use it like Read PDF, without converting it to PDF first.

postwick · May 2, 2022, 7:10pm

Open it in the browser and use activities like you would for any normal web page.

Nithinkrishna · May 3, 2022, 1:09am

Hey @Randy_Villalobos

Yes, the first one.

Thanks
#nK
