Extraction in Invoice Problem

Megahertz_Payment_Recpt_ST250_SRVR_01-03-2021.pdf (60.5 KB)

I can’t extract data from this PDF file. I used the Get Text activity, but it extracted the whole PDF, and I can’t extract specific text. Please help me out.

Give me the list of fields you want out of this document

Beneficiary Account, Amount, Payment Date, Remarks

@badal_patel - Convert it to text using the “Read PDF Text” activity with Preserve Format set to True, and then use Regex to extract the details…

Note: The attached PDF is not an invoice…
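
For readers following along outside UiPath, the "read the text, then apply Regex" idea can be sketched in Python. This is only a rough illustration: the sample text and its label layout are made up (the real receipt may look different), and `extract_fields` is a hypothetical helper, not something from the workflows attached in this thread.

```python
import re

# Hypothetical text standing in for the output of "Read PDF Text"
# with Preserve Format enabled. The real receipt layout may differ.
SAMPLE_TEXT = """\
Beneficiary Account : 1234567890
Amount              : INR 25,000.00
Payment Date        : 01-03-2021
Remarks             : ST250 SRVR
"""

# One pattern per requested field. Each captures the rest of the line
# after the label and an optional colon.
FIELD_PATTERNS = {
    "Beneficiary Account": r"Beneficiary Account\s*:?\s*(.+)",
    "Amount": r"Amount\s*:?\s*(.+)",
    "Payment Date": r"Payment Date\s*:?\s*(.+)",
    "Remarks": r"Remarks\s*:?\s*(.+)",
}


def extract_fields(text):
    """Return a dict of field name -> captured value (None if not found)."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text)
        result[name] = match.group(1).strip() if match else None
    return result


fields = extract_fields(SAMPLE_TEXT)
```

Each value lands in its own dictionary entry, which mirrors keeping one variable per field in the workflow.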

Try using Document Understanding.

Can you please send me a flow?

@badal_patel, try this workflow:

Test_.zip (73.5 KB)

I want to extract data from the PDF and save it to an Excel file, so I need each extracted item in its own variable.

The fields are saved in variables in the process I sent you:

_Beneficiary
_Amount
_Payment
_Remarks

Just add each variable to a data table and use Write Range or Append Range to get your data into Excel.
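
As a rough Python analogue of "one row per document, then write the table out", the sketch below builds rows from the four variables and writes them as CSV, which Excel can open. The column names follow the variable names above; the row values are placeholders, and `rows_to_csv` is a hypothetical helper rather than anything in the attached workflow.

```python
import csv
import io

# Column names follow the variables listed above.
COLUMNS = ["Beneficiary", "Amount", "Payment", "Remarks"]


def rows_to_csv(rows):
    """Write a header plus one line per document and return the CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder values only. In the workflow these would come from the
# _Beneficiary, _Amount, _Payment and _Remarks variables.
rows = [
    {
        "Beneficiary": "1234567890",
        "Amount": "25,000.00",
        "Payment": "01-03-2021",
        "Remarks": "ST250 SRVR",
    },
]

csv_text = rows_to_csv(rows)
```

Appending one dictionary per PDF to `rows` before writing is the equivalent of adding a data row for each document.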

@badal_patel - Please find the starter help here… Regex_BP.zip (103.4 KB)

Final Output:

The flow is working, but it shows an extra “Beneficiary” in the output, and I want to run it on multiple PDFs.

Typpa_Payment_Recpt_101220.pdf (58.6 KB)

I want to extract this PDF as well, along with Megahertz_Payment_Recpt_ST250_SRVR_01-03-2021.pdf (60.5 KB).

@badal_patel - Did you test my code? If yes, just pass in the second PDF and let me know the result.

Yes, the second code is working fine. Typpa_Payment_Recpt_101220.pdf (58.6 KB)

@badal_patel - My code is failing on the second PDF… because your first PDF has the word “Network” on the same line as Remarks, which is missing in the second one.

First PDF

Second PDF
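
One way to make a pattern tolerate both layouts is to treat the trailing “Network” part as optional. The Python sketch below assumes a layout where the extra column is separated from the Remarks value by two or more spaces. Both sample lines are invented for illustration, and the real workflow may handle this differently.

```python
import re

# Capture the Remarks value lazily, then allow (but do not require) a
# trailing "Network ..." column separated by two or more spaces.
REMARKS_PATTERN = re.compile(
    r"Remarks\s*:?\s*(.*?)(?:\s{2,}Network\b.*)?$",
    re.MULTILINE,
)


def extract_remarks(text):
    """Return the Remarks value, or None if the label is absent."""
    match = REMARKS_PATTERN.search(text)
    return match.group(1).strip() if match else None


# Hypothetical lines: one with the extra column, one without.
with_network = "Remarks : ST250 SRVR        Network : NEFT"
without_network = "Remarks : Inv 101220"

first_remarks = extract_remarks(with_network)
second_remarks = extract_remarks(without_network)
```

Because the group after the capture is optional, the same pattern works whether or not the second column is present.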

Reliable Impex_Inv_1688_Payment Recpt_10-02-2021.pdf (60.6 KB)

Try this one, and also solve the INR problem.

@badal_patel - What is the INR problem?

I don’t want “INR” in the amount in the Excel file.

Fixed…Here you go… Regex_BP.zip (221.3 KB)
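
For reference, dropping a leading currency code from an extracted amount can be sketched in Python as below. The attached workflow may well do this differently. `strip_currency` is a hypothetical helper and the sample amounts are placeholders.

```python
import re


def strip_currency(amount, code="INR"):
    """Remove a leading currency code and any surrounding whitespace."""
    return re.sub(rf"^\s*{re.escape(code)}\s*", "", amount).strip()


cleaned = strip_currency("INR 25,000.00")
unchanged = strip_currency("25,000.00")
```

An amount that has no currency prefix passes through unchanged, so the same step can be applied to every document.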