Convert invoice PDF to Excel sheet

I have an invoice PDF and I need to extract several fields, including the invoice number, date, and total invoice amount, and populate them into an Excel sheet. Is OCR an option for this?
If yes, how can I use OCR in this case?
If no, what is the alternative?

Please provide a sample pdf.

@arun_vignesh,

Hi,

This is a readable PDF, so please use the Read PDF Text activity. It returns the whole PDF as text (a String). You can then use regex to extract the data.
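To illustrate the regex step, here is a minimal sketch in Python rather than UiPath. It assumes the text returned by the PDF read looks like the made-up sample below. The field labels ("Invoice No:", "Total Amount:"), the sample values, and the patterns are all assumptions for illustration, so they would need adjusting to the real invoice layout.

```python
import re

# Made-up text standing in for the string returned by reading the PDF.
sample_text = """Invoice No: INV-1001
Invoice Date 01-06-2019
DUE DATE 15-06-2019
Total Amount: 1,250.00
"""

# Hypothetical patterns: each one captures the value that follows a label.
patterns = {
    "invoice_number": r"Invoice No:\s*(\S+)",
    "invoice_date": r"Invoice Date\s*(\d{2}-\d{2}-\d{4})",
    "total_amount": r"Total Amount:\s*([\d,]+\.\d{2})",
}


def extract_fields(text):
    """Return a dict of field name -> captured value (None if not found)."""
    fields = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        fields[name] = match.group(1) if match else None
    return fields


fields = extract_fields(sample_text)
print(fields)
```

The same idea should carry over to .NET regular expressions, which is what UiPath expressions use, although the exact pattern syntax may need small changes.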

If you still run into difficulties, please share the sample in PDF format and I'll extract the desired data from it.

Thanks
Arun
Automation makes life easy :)

@arun_vignesh,
Can you illustrate with an example how to apply regex to a field? I don't know much about it.

Hi @Gopalakrishnan_K,

For invoice PDF extraction, please go through this video for a better understanding. It will help you extract the details from invoices.

Watch the link below:

If the solution works for you, please mark it as the solution.

Regards,
Neelima.

Please provide me a sample in PDF format. I'll extract the details and share them with you.

Thanks

@arun_vignesh, STLINV_HHIN214553-0105.pdf (12.0 KB)

Hi Gopal,

Please follow the steps below:

  1. I have attached a sample XAML for data extraction: Main.xaml (5.4 KB)
  2. Open the attached XAML.
  3. Go to Manage Packages and install UiPath.PDF.Activities.
  4. For example, suppose you want the invoice date. First find the two pieces of text between which the invoice date appears.
    In this case, “01-06-2019” appears between “Invoice Date” and “DUE DATE”.
    So set the initial word to “Invoice date” and the final word to “DUE DATE” in the arguments.

    You will get the result.
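The "initial word / final word" idea above can be sketched in Python as follows. This is only an illustration of the technique. The attached XAML is not reproduced in this thread, so the pattern here is not necessarily the one it uses, and the helper name `text_between` is hypothetical. Only the date and the two labels come from the post above; the rest of the sample text is made up.

```python
import re


def text_between(text, start_word, end_word):
    """Return the trimmed text between start_word and end_word, or None."""
    # re.escape keeps any special characters in the labels literal.
    pattern = re.escape(start_word) + r"(.*?)" + re.escape(end_word)
    # IGNORECASE lets "Invoice date" match "Invoice Date";
    # DOTALL lets the value span a line break.
    match = re.search(pattern, text, flags=re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


# Made-up fragment of the text read from the PDF.
sample_text = "Invoice Date 01-06-2019\nDUE DATE 15-06-2019"

invoice_date = text_between(sample_text, "Invoice date", "DUE DATE")
print(invoice_date)
```

The non-greedy `(.*?)` stops at the first occurrence of the final word, which matters when that word appears more than once in the document.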

I hope this clears up your doubts. If so, please mark this issue as resolved. Let me know if you still have any problems.

Thanks,
Arun
Automation makes life easier :)

@arun_vignesh,
Can you post a screenshot of the XAML file? For me, the first activity is shown as an unresolved activity.

@arun_vignesh, how do I apply regex to the PDF? Please reply.

To apply regex, you first need to convert the PDF to a string using the Read PDF Text activity. Could you please confirm whether you were able to install the PDF activities package from Manage Packages in UiPath?

Regards,
Arun

Hi @arun_vignesh,
Yes, the PDF activities package is installed, and I have saved the complete PDF data to a text output. Now, from this text, I need to extract only specific patterns (like the invoice number, invoice date, etc.). What should I do next?

Hi bro,

The XAML contains a regex pattern. Please use the same pattern and create two variables. Follow the steps in my fourth reply above (the one with the screenshots and the XAML).

I am attaching a screenshot of the output.
Please check the XAML code.

@arun_vignesh,
please reply

Please send the xaml which you have created.

@arun_vignesh,
Main (2).xaml (5.5 KB)
This is the file.

@arun_vignesh,
please reply

Hi @Gopalakrishnan_K,

Ar_Result is an array of strings, and we need the string at index 0. So assign it as data = Ar_Result(0).ToString
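As a rough illustration of the same point outside UiPath, here is a short Python sketch. A regex search that returns all matches gives back a collection, and the single value wanted is its first element. The sample text and the date pattern are assumptions, and the variable names simply mirror the ones above.

```python
import re

# Made-up text; assume it came from reading the PDF.
sample_text = "Invoice Date 01-06-2019 DUE DATE 15-06-2019"

# findall returns a list of every match, comparable to an array of strings.
ar_result = re.findall(r"\d{2}-\d{2}-\d{4}", sample_text)

# Take index 0 only after checking that something matched,
# otherwise indexing an empty list would raise an error.
data = ar_result[0] if ar_result else ""
print(data)
```

The emptiness check is a design choice worth keeping in any language: if the pattern finds nothing, indexing position 0 directly fails.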

Thanks a lot, and apologies for the delay,
Arun Vignesh S
He who serves the most, reaps the most.