How to extract HTML tags using UiPath?

Hi everyone,
Is it possible to extract HTML tags such as body, p, li, span, and table from an HTML file using UiPath?
I need to extract only the tag structure, from the opening tag to the end of the file. Please let me know any solutions or ideas for implementing this.

@check_account We can read the HTML file using the “Read Text File” activity and then use a regex to extract the tags from the output string.
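To illustrate the regex idea, here is a minimal sketch in Python rather than a UiPath workflow. The function name `extract_tags` and the pattern are my own assumptions, not something from a UiPath package; a similar pattern could be tried in a regex-matching activity, though .NET regex syntax may need small adjustments.

```python
import re

# Matches an opening or closing tag and captures the optional slash and the
# tag name. Comments and doctype declarations are skipped because the first
# character after "<" must be a letter (or "/" followed by a letter).
TAG_PATTERN = re.compile(r"<\s*(/?)\s*([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def extract_tags(html: str) -> list[str]:
    """Return the tags in document order with attributes and text removed."""
    return [f"<{slash}{name.lower()}>" for slash, name in TAG_PATTERN.findall(html)]


# Hypothetical sample standing in for the string read from the file.
sample = "<html><body><h1 class='t'>Title</h1><p>Text</p><ul><li>One</li></ul></body></html>"
print(extract_tags(sample))
```

This keeps only the tag names, so `<h1 class='t'>` comes back as `<h1>`. A regex like this can misfire on attribute values that contain `>` or on script contents, which may be why a plain regex feels unreliable on real pages.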

Sure, I will try your method. But how do I convert the HTML to text?

If you are reading from a local file, you can read the .html file directly with the “Read Text File” activity, which will give you the HTML as a string.

Yes, I tried this method. It works well for reading the HTML file, but the regex is still not working well :frowning:

Hello @check_account

It would help if you could share a screenshot of the HTML source and the output you want to extract.

Yes, I will share demo screenshots!



So I want to extract the structure of tags such as p, h1, and li.

Hey @check_account

So you need only the tags themselves, not the text inside them?

Thanks
#nK

Hi @check_account,

Please follow these links. They point to an activity package that extracts all the HTML tags and contents of a web page into a table.

Regards,
MY

Yes, you are right! I need that HTML tag structure.

Yes, I followed that link and installed the package, but I need to read HTML files saved in my folder, not a web page, dude :slight_smile:

Hey @check_account

Read your page into a string, then parse the string as HTML using the HtmlAgilityPack library.
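As a rough illustration of the parse-then-walk approach, here is a small sketch that uses Python's standard `html.parser` as a stand-in, not HtmlAgilityPack itself. The names `TagStructure` and `tag_outline` are made up for this example. In HtmlAgilityPack the comparable steps would be loading the string with `HtmlDocument.LoadHtml` and walking `DocumentNode`.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not increase the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagStructure(HTMLParser):
    """Collects (depth, tag) pairs for every opening tag and ignores text."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.structure = []

    def handle_starttag(self, tag, attrs):
        self.structure.append((self.depth, tag))
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1


def tag_outline(html):
    """Return the tag structure as an indented outline, one tag per line."""
    parser = TagStructure()
    parser.feed(html)
    parser.close()
    return "\n".join("  " * depth + tag for depth, tag in parser.structure)


# For a saved file, the string could come from something like
# pathlib.Path("page.html").read_text(encoding="utf-8") (hypothetical path).
print(tag_outline("<html><body><h1>Title</h1><ul><li>One</li><li>Two</li></ul></body></html>"))
```

The outline shows only the nesting of tags, which seems to match the structure being asked for. It assumes every non-void tag is explicitly closed; pages that omit optional closing tags such as `</li>` or `</p>` would come out with too much indentation.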

Hope that helps.

Thanks
#nK

Hey dude!
I installed HtmlAgilityPack. How do I parse the string with it?

Hey @check_account

Kindly refer to this once: https://html-agility-pack.net/

Try it, and if you still face any issues, please let me know.

Thanks
#nK