PDF - Find and save the invoice number

Dear all,

I have received some invoices in PDF format, each of which includes a Vendor Code and an Invoice No.

May I know how I can find these values and save them in a data table?

eg:
Vendor Code: 00001
Invoice No.: 12345

Thanks so much

Hello,

First read the PDF, which gives you a String variable, and then use the Substring() method to pick out the values.
Or
can you attach your PDF?

Cheers,
Pankaj

Dear Pankaj,

Sorry, as a new user I am not allowed to upload files.
The invoice looks like this:
image

@shing3322 First read the PDF file with the Read PDF Text activity. Then try the statements below and let me know.

String Vendor = System.Text.RegularExpressions.Regex.Match(StrVariable,"(?<=Vendor Code: ).+").ToString.Trim

String Invoice = System.Text.RegularExpressions.Regex.Match(StrVariable,"(?<=Invoice No\.: ).+").ToString.Trim

StrVariable is a String variable that holds the output of the Read PDF Text activity.
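
For anyone who wants to try the lookbehind pattern outside UiPath first, here is a minimal Python sketch of the same idea. The sample text and the helper name value_after are assumptions made for illustration; the labels are copied from the example in the first post, and the real output of Read PDF Text may be laid out differently.

```python
import re

# Hypothetical sample of what Read PDF Text might return (an assumption).
pdf_text = "Invoice\nVendor Code: 00001\nInvoice No.: 12345\nPage 1 of 1\n"


def value_after(label, text):
    """Return the rest of the line that follows `label`, or None if absent."""
    # (?<=...) is a lookbehind: the label must precede the match,
    # but it is not included in the matched text.
    match = re.search(r"(?<=" + re.escape(label) + r").+", text)
    return match.group(0).strip() if match else None


vendor = value_after("Vendor Code: ", pdf_text)
invoice = value_after("Invoice No.: ", pdf_text)
print(vendor, invoice)
```

Note that this only works when the value sits on the same line as its label, directly after it.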

No problem,

Please find an example below. You will have to modify a few things in my workflow:

  1. PDF file path
  2. Selectors of the activities
  3. Excel file path

FindInvoiceNoSave.xaml (24.7 KB)
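
For the "save it in a data table" part of the question, a rough Python sketch of the idea is shown here. It is not the content of the attached workflow; the column names, the sample row, and the helper rows_to_csv are assumptions for illustration only.

```python
import csv
import io


def rows_to_csv(rows):
    """Write extracted values as CSV text, one row per invoice."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["Vendor Code", "Invoice No."])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Values are kept as strings so leading zeros such as "00001" are preserved.
rows = [{"Vendor Code": "00001", "Invoice No.": "12345"}]
table = rows_to_csv(rows)
print(table)
```

In UiPath the equivalent would be adding one row per invoice to a DataTable and then writing it to the Excel file.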

Cheers,
Pankaj

Hello.

  1. Use the ‘Find OCR Text’ activity to locate “Vendor Code” and store the output element in a variable such as “vendorUi”.
  2. Use the ‘Set clipping region’ activity with Direction set to ‘Right’ and “vendorUi” as the input element.
  3. Use the ‘Get OCR Text’ activity with “vendorUi” as the input element and store the output text in a variable such as “vendorCode”.
  4. The output message is “Vendor Code: 00001”.

I think this is the best way to read text that sits at a fixed position relative to a label when using OCR.

New users can’t upload files on this site, so I can’t attach a sample XAML file.

Thanks.

@Manjuts90, thanks so much. I can get the Vendor Code correctly, but not the Invoice No.

I copied the PDF text into Notepad and found that the actual layout of the text is as follows:

Invoice
Vendor Code: 00001
: 11111
of 1
Invoice No.
Page 1
.
.
.
.

How can I get the Invoice No. correctly in this situation?

Thanks so much.

@shing3322 For the invoice number, try the statement below.

String invoiceNo = Split(StrVariable,"Vendor Code:")(1).Split({Environment.NewLine},StringSplitOptions.None)(1).Replace(":","").Trim

This will work for the current input as shown in the screenshot. Try it and let me know.
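
To show what the split statement does step by step, here is a small Python sketch of the same logic. The sample text is taken from the Notepad paste earlier in the thread; the function name invoice_after_vendor is made up for illustration, and the approach assumes the invoice number is always on the line right after the Vendor Code line.

```python
# Text layout as pasted from Notepad: the invoice number is on the line
# after "Vendor Code:", not next to the "Invoice No." label.
pdf_text = "Invoice\nVendor Code: 00001\n: 11111\nof 1\nInvoice No.\nPage 1\n"


def invoice_after_vendor(text):
    """Return the value on the line following the 'Vendor Code:' line."""
    if "Vendor Code:" not in text:
        return None
    # Keep everything after the label, then break it into lines.
    after_label = text.split("Vendor Code:", 1)[1]
    lines = after_label.splitlines()
    if len(lines) < 2:
        return None
    # lines[0] is the vendor code; lines[1] is ": 11111" in the sample.
    return lines[1].replace(":", "").strip()


invoice_no = invoice_after_vendor(pdf_text)
print(invoice_no)
```

If the PDF layout changes, for example an extra line between the two values, the index would need adjusting.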

@Manjuts90, thanks so much! It works perfectly.

@shing3322 If you got your answer, please close the thread by marking the solution.