How to highlight found text in a PDF

I have a PDF document in which I have to search for some text and take a screenshot of the found text in the document. If the text is not found in the PDF, then move on to the next word.

This is my PDF doc:
Apple Inc_10KProof.pdf (1.3 MB)

In this PDF doc, I want to search for keywords like cheat, fraud and gambling. If any of the keywords is found in the doc (for example, fraud), then I have to take a screenshot like this:

Hi,
Here we go. :slight_smile:
Searching for word “Fraud”
taking the screenshot
saving the image.

PS: Make sure the PDF is open in the foreground before running the code.
PDFHighLight.zip (828.2 KB)


The Read PDF activity doesn’t require the PDF to be open. Just pass the PDF path with the file name, and it will read the text and give it as output.

Dear @rkelchuri, may I know how you are going to take a screenshot without opening the PDF? :roll_eyes:
The Read PDF activity is used to get the text, which you then work on with string manipulation.
In this case you can’t use that.

Oh yes, correct: to take a screenshot we need to activate the PDF window first. Read PDF only reads the text inside the PDF.

But still, it’s not a bad idea. Read the PDF text first, and only if it matches the criteria, open the PDF and take the screenshot. Depending on the ratio of hits, it could also save some time.

I was wondering: after matching, how would you find that word in the PDF and take the screenshot, other than by using Ctrl+F? Is there any other way around? :slight_smile:

thank you :slight_smile: @ddpadil

Hi @ddpadil, if you look at the screenshots, the last two (6 and 7) are the same. If I apply this to another PDF doc, the last two also come out the same. Why?

Because the 7th was the last match, and the next screenshot is taken before the PDF reader has shown the message saying there is no further match, so it captures the same view again.
You can add a delay in between, or set a delay in the WaitBefore property of the Take Screenshot activity.
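
Another way to avoid the duplicated last screenshot, offered only as a rough sketch, is to count the matches in the extracted text first and loop exactly that many times instead of waiting for the reader's message. The Python below is not part of the attached workflow; the function names `count_matches` and `screenshot_plan` are made up for illustration, and it assumes the text was already extracted from the PDF. Note that the count from extracted text may not always equal what the PDF reader's own search finds (for example with hyphenated or split words), so treat it as an upper bound to check against.

```python
import re

def count_matches(text, keyword):
    """Count case-insensitive, non-overlapping occurrences of keyword in text."""
    return len(re.findall(re.escape(keyword), text, flags=re.IGNORECASE))

def screenshot_plan(text, keywords):
    """Map each keyword that occurs at least once to its number of matches.

    Keywords with zero matches are left out, so the bot can skip them
    and move on to the next word.
    """
    plan = {}
    for kw in keywords:
        n = count_matches(text, kw)
        if n > 0:
            plan[kw] = n
    return plan

# Hypothetical text standing in for the output of a PDF text-extraction step.
extracted = "Fraud risk is reviewed yearly. No fraud was found. FRAUDulent claims are rare."
print(screenshot_plan(extracted, ["cheat", "fraud", "gambling"]))
```

With a plan like this, the screenshot loop for "fraud" would run three times and then stop, without depending on the timing of the reader's dialog.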


Read the entire PDF into text, then put that text into an array. Use an If condition to check whether the array contains cheat, fraud or gambling. If any of them is found, take the screenshots; otherwise don’t.

An array is also not required; you can use the If condition directly.

Hi @Bachali,

Yes, you are right.
The output of the Read Pdf text activity is a string.
On that string you can use the Contains method to check whether the keyword is present or not.

Regards,
Arivu
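
For anyone who wants to see the contains check written out, here is a minimal sketch in Python rather than in the workflow itself. It assumes the PDF text has already been read into a string; `find_keywords` and `sample_text` are hypothetical names used only for this example. The comparison is made case-insensitive so that "Fraud" in the document still matches the keyword "fraud".

```python
def find_keywords(text, keywords):
    """Return the keywords that occur somewhere in text, ignoring case."""
    lowered = text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]

# Hypothetical sample standing in for the string returned by the PDF read step.
sample_text = "The company maintains controls to detect Fraud and other misconduct."

hits = find_keywords(sample_text, ["cheat", "fraud", "gambling"])
print(hits)  # prints ['fraud']
```

This is a plain substring check, so "cheat" would also match "cheating". If only whole words should count, a word-boundary comparison would be needed instead.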

Hi @ddpadil,
But this only works if the PDF contains real text, right? If the page is an image, it can’t find the word?

Hi @dominik.breer,

Yes I tried with word

Hi guys,

I have built a sample bot which highlights a specific field in a scanned PDF and saves a picture of it.

highlighttext.xaml (6.8 KB)