How to extract HTML tags using UiPath?

Hi everyone,
Is it possible to extract HTML tags,
like the body, p, li, span, and table tags, from an HTML file using UiPath?
I need to extract only the tag structure, from the first tag to the end of the file. Please let me know any solutions or ideas for implementing it.

@check_account We can read the HTML file using the “Read Text File” activity and then use a regex to extract the tags from the output string.
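
To make the regex idea concrete, here is a minimal sketch in Python (not UiPath itself; in a workflow the same pattern could presumably go into the Matches activity or a .NET `Regex.Matches` call). The pattern and the `extract_tag_names` helper are my own assumptions for illustration. It only picks up opening tags and will be fooled by things like `>` inside attribute values, so treat it as a rough first pass.

```python
import re

# Assumed pattern: "<", a tag name starting with a letter, then anything up to ">".
# Closing tags ("</p>") and comments ("<!-- -->") are skipped because the
# character right after "<" is not a letter.
TAG_PATTERN = re.compile(r"<\s*([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")

def extract_tag_names(html: str) -> list[str]:
    """Return the names of all opening tags, in document order."""
    return [name.lower() for name in TAG_PATTERN.findall(html)]

sample = (
    "<html><body><h1>Title</h1>"
    "<p>Text <span>inner</span></p>"
    "<ul><li>One</li></ul></body></html>"
)
tag_names = extract_tag_names(sample)
print(tag_names)
```

This gives a flat list of tag names only; it does not tell you how the tags are nested.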

Sure, I will try your method. But how do I convert the HTML to text?

If the file is stored locally, you can read the .html file directly with the Read Text File activity, which gives you the HTML as a string.
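
The same idea outside UiPath, as a small Python sketch: an .html file is already plain text, so reading it needs no conversion step. The `read_html_file` helper and the `demo.html` file are hypothetical, and UTF-8 encoding is an assumption about the file.

```python
import tempfile
from pathlib import Path

def read_html_file(path: str) -> str:
    """Return the full contents of a local HTML file as one string."""
    # Assumes the file is UTF-8 encoded; adjust if yours is not.
    return Path(path).read_text(encoding="utf-8")

# Demo: write a small hypothetical file, then read it back.
with tempfile.TemporaryDirectory() as folder:
    demo_path = Path(folder) / "demo.html"
    demo_path.write_text("<html><body><p>Hello</p></body></html>", encoding="utf-8")
    html_string = read_html_file(str(demo_path))

print(html_string)
```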

Yeah, I tried this method. It works well for reading the HTML file, but the regex is still not working well :frowning:

Hello @check_account

It would be better if you could share a screenshot of the HTML source and of the output you want to extract.

Yeah, I will share demo screenshots!

So I want to extract the structure of tags like p, h1, and li.

Hey @check_account

So you need only the tags, not the text inside them?


Hi @check_account,

Please follow these links. It is an activity package that lets you extract all the HTML tags and contents of a web page into a table.


Yes, you are right! I need that HTML tag structure.

Yeah, I followed that link and installed it, but I need to read HTML files that are saved in my folder, not from a web page, dude :slight_smile:

Hey @check_account

Read your page into a string, then parse the string as HTML using the HtmlAgilityPack library.

Hope that helps.
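
To show what "parse the string" buys you over a regex, here is a rough Python sketch that uses the standard library's `html.parser` as a stand-in. It is not HtmlAgilityPack's API (there you would load the string with `HtmlDocument.LoadHtml` and walk the nodes); the `TagOutline` class and `tag_outline` helper are hypothetical names. It prints each tag indented by its nesting depth, which is one way to get the tag structure without the text.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not increase the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

class TagOutline(HTMLParser):
    """Collects tag names, indented two spaces per nesting level."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        self.lines.append("  " * self.depth + tag)
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

def tag_outline(html: str) -> list[str]:
    parser = TagOutline()
    parser.feed(html)
    parser.close()
    return parser.lines

sample = ("<body><h1>Title</h1><p>Text<br>more</p>"
          "<ul><li>One</li><li>Two</li></ul></body>")
outline = tag_outline(sample)
print("\n".join(outline))
```

This assumes reasonably well-formed HTML; with unclosed tags the depth will drift, which is where a tolerant parser such as HtmlAgilityPack should do better.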


Hey dude!
I installed HtmlAgilityPack. How do I use it to parse that string?

Hey @check_account

Kindly refer to this once -

Try it, and if you still face any issues, please let me know.