Extracting specific elements from scanned PDFs

Hi, I would appreciate some help.

I need to extract specific text from scanned PDFs and paste the results into a single Excel sheet. There are 1000+ documents. Is there a way to loop through and open every PDF document using “get OCR text” and collect all the results in one Excel sheet?

Thanks in advance!!

Hi @Trisha, check this post below 🙂

Hope it helps.

~Diego Turati

Hi, thanks for the help.

I'm currently working on opening a PDF file (the PDFs vary: invoice, receipt, etc.) and then reading the heading of the file, which is a scanned image. If the heading says “Invoice”, I want to close that PDF and move it to a separate folder. Is this possible? If so, could I please have an example?

Thanks in advance!

I have a flow that opens each PDF, reads the heading, and stores the output in a string. However, my If condition is not working…

Hi, this problem happens because you’re using the String type, but the output is an array of strings (System.String[]). Check what the output of the For Each loop is.
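
To illustrate the type mismatch outside the workflow tool, here is a minimal Python sketch, not the flow from this thread. It assumes the OCR step returns a list of strings rather than a single string, so the lines are joined before the comparison. The names `is_invoice`, `move_invoices`, and `read_heading` are hypothetical; `read_heading` stands in for whatever OCR step you use.

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable, List


def is_invoice(heading_lines: Iterable[str]) -> bool:
    """Return True if the OCR'd heading mentions the word 'invoice'."""
    # The OCR output is a collection of strings, not one string, so join
    # it first. Comparing the collection itself to "Invoice" never matches.
    heading = " ".join(heading_lines).strip().lower()
    return "invoice" in heading


def move_invoices(source_dir: Path, target_dir: Path,
                  read_heading: Callable[[Path], List[str]]) -> List[Path]:
    """Move every PDF whose heading mentions 'invoice' into target_dir.

    read_heading is a hypothetical placeholder for the OCR step: it takes
    a PDF path and returns the heading as a list of strings.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() builds the full list first, so moving files while looping
    # does not disturb the iteration.
    for pdf in sorted(source_dir.glob("*.pdf")):
        if is_invoice(read_heading(pdf)):
            destination = target_dir / pdf.name
            shutil.move(str(pdf), str(destination))
            moved.append(destination)
    return moved
```

The same idea applies in the flow: either join the array into one string before the If condition, or test a single element of the array, instead of comparing the whole array to “Invoice”.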

Hi @Trisha, check this out:

~Diego Turati