How to read specific PDF text

Hello,

Please find the attached images.

From image 1, I want to extract Invoice Balance Due and Invoice Total.

From image 2, I want to extract Invoice Total USD and Payments Due.

I tried various methods but got stuck partway through.

Thanks in advance

Hi @rsr.chandu,

Please refer to the post below on the same topic.

As it’s a structured PDF, you can read the PDF text, save it in a variable, and use regex to extract the desired text. The post also has a link on how to create regex expressions.

Regards
Sonali

Hi @rsr.chandu

To read the text, you can use the Read PDF Text activity (or Read PDF With OCR if the PDF is scanned) and store the result in a variable, say textOutput.
Check this workflow:

InvoiceDue = System.Text.RegularExpressions.Regex.Match(textOutput, "(?<=Invoice Balance Due\|*\s*).[\d.]+").Value
InvoiceTotal = System.Text.RegularExpressions.Regex.Match(textOutput, "(?<=Invoice Total\s*).[\d.]+").Value
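For anyone who wants to try the idea outside UiPath, here is a minimal Python sketch of the same extraction. It is not the workflow above: Python's re module does not accept the variable-length lookbehind used in those expressions, so this version uses a capture group instead. The sample text, the helper name amount_after, and the handling of an optional currency symbol and thousands separators are all assumptions, since the real invoice layout is only visible in the screenshot.

```python
import re

# Hypothetical sample of what Read PDF Text might return; the real layout may differ.
text_output = """Invoice Total $1,250.00
Invoice Balance Due | $250.00"""


def amount_after(label: str, text: str) -> str:
    """Return the first amount that follows `label`, or "" if none is found."""
    # After the label: optional spaces or pipes, at most one non-digit symbol
    # (for example "$"), then the number itself in a capture group.
    pattern = re.escape(label) + r"[\s|]*[^\d\s]?\s*([\d,]+(?:\.\d+)?)"
    match = re.search(pattern, text)
    return match.group(1) if match else ""


invoice_due = amount_after("Invoice Balance Due", text_output)
invoice_total = amount_after("Invoice Total", text_output)
print(invoice_due, invoice_total)
```

Because the number is in a capture group, the result excludes any currency symbol or separator between the label and the amount.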

You should be able to do the same for the second image.
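As a rough illustration for the second image, the sketch below applies the same label-then-amount approach to the two fields named in the question, Invoice Total USD and Payments Due. It is written in Python only so it can be run standalone. The sample text and the function name extract_amounts are hypothetical, and the pattern assumes each amount sits on the same line as its label.

```python
import re
from decimal import Decimal

# Hypothetical text for the second invoice; only the two labels come from the question.
text_output = """Invoice Total USD 3,400.75
Payments Due 1,200.00"""

labels = ["Invoice Total USD", "Payments Due"]


def extract_amounts(text, wanted_labels):
    """Map each label to the Decimal amount on the same line, or None if absent."""
    result = {}
    for label in wanted_labels:
        # Skip any non-digit characters on the same line, then capture the number.
        pattern = re.escape(label) + r"[^\d\n]*?(\d[\d,]*(?:\.\d+)?)"
        match = re.search(pattern, text)
        # Remove thousands separators before converting to Decimal.
        result[label] = Decimal(match.group(1).replace(",", "")) if match else None
    return result


amounts = extract_amounts(text_output, labels)
print(amounts)
```

Converting to Decimal rather than keeping the string makes later comparisons or sums exact, which matters for currency values.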

If you are working with multiple PDF structures, you should consider Document Understanding instead.
Hope this helps!
