Fetching some content from a web page (HTML page)

Hi,

I have an HTML web page and I want to fetch some content from it:

1. Warning letter (the number beside it)
2. Country name (which is in the address)
3. Some keywords to search for (like Data integrity, Computer system, Failure to validate)
4. Bullets (shown in bold letters)

Below are the hyperlinks to pages that have all the content mentioned above.

https://www.fda.gov//ICECI/EnforcementActions/WarningLetters/ucm607820.htm
https://www.fda.gov/ICECI/EnforcementActions/WarningLetters/ucm598585.htm

https://www.fda.gov//ICECI/EnforcementActions/WarningLetters/ucm617355.htm

The problem is that the HTML pages are not properly structured and do not even use tr/td tags. All the content is in div tags, so none of the options, such as selectors or UI Explorer, are working.

Can someone please guide me on this? What would be the simplest way to extract all this content?

Thanks in advance :slight_smile:

Hi,

I did that using screen scraping (the Full Text method), and it is working fine for me. The only thing you need to do is remove the extra whitespace and other irrelevant information using string manipulation, or you can use regex.
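
To illustrate the clean-up step outside of a workflow, here is a minimal Python sketch that works on the scraped full text. It is not my actual workflow. The function names, the sample string, and especially the pattern for the warning letter number are assumptions (I assume the number follows the words "Warning Letter" and is made of digit groups joined by hyphens), so adjust them to what the pages really contain.

```python
import re

# Keywords taken from the question; matching is case-insensitive.
KEYWORDS = ["Data integrity", "Computer system", "Failure to validate"]

# ASSUMPTION: the letter number follows the words "Warning Letter" and is made
# of digit groups joined by hyphens. Check the real pages and adjust.
WL_PATTERN = re.compile(r"Warning\s+Letter\s*:?\s*(\d+(?:-\d+)+)", re.IGNORECASE)


def clean_text(raw: str) -> str:
    """Collapse runs of spaces, tabs and newlines into single spaces."""
    return re.sub(r"\s+", " ", raw).strip()


def extract_fields(raw: str) -> dict:
    """Return the warning letter number (or None) and the keywords found."""
    text = clean_text(raw)
    match = WL_PATTERN.search(text)
    lowered = text.lower()
    return {
        "warning_letter": match.group(1) if match else None,
        "keywords_found": [k for k in KEYWORDS if k.lower() in lowered],
    }


if __name__ == "__main__":
    # Made-up sample standing in for the scraped full text of one page.
    sample = "  Warning Letter   320-18-58 \n\n  failure to validate\tcomputer   system"
    print(extract_fields(sample))
```

The same two ideas (collapse whitespace first, then apply a regex and simple substring checks) should carry over to the string and regex activities in a workflow. The country name and the bold bullets are not covered here, since plain full text does not keep the formatting.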

Hope it helps.

Thanks!
Anmol

Can you provide your example, please?

Give me some time. I will create the sample workflow and send it to you.