Invoice comparison

Hi Team,
I have a problem and need your help.
Objective - Compare several invoices with each other and report which ones are similar.
Can you please let me know how to automate this?

Thanks,
Sri

Hi @srinivas_pradeep, do you need to compare only certain values or the whole PDF?

Hi @NIVED_NAMBIAR,

Thank you for the reply. Only certain fields, like invoice number, invoice date, reference number, and amount.

Cheers,

Hi @srinivas_pradeep

Can you try it this way:

Extract the details from the PDF using string manipulation and compare the data.

May I also know whether it is a scanned PDF or a normal PDF?

Hi @NIVED_NAMBIAR.

Thank you. To answer your question, the PDFs are in my local folders. How can I automate this process using UiPath?
Cheers,

Hi @srinivas_pradeep, is it a scanned PDF?

Hi @NIVED_NAMBIAR

Something like this.
gWvIqkT5oDJkZKBb6WMdTipw9xpoxTiCrZg[0].PDF (4.3 KB)

Hi @NIVED_NAMBIAR, could we duplicate the same PDF and extract the information from both copies to see whether they match?

Hi, I saw your PDF.

What are the reference number and invoice number here?

Hi @NIVED_NAMBIAR,

Thanks. In this case we can consider the amount, date of expense and city.

Hi @srinivas_pradeep, thanks for this question.
I will explain how to extract the above data using regex.

First, take the input file.

  1. To read the PDF file, I used the Read PDF Text activity from the UiPath.PDF.Activities package, providing the path of the PDF and storing the output in a string variable called input.

  2. Use the Matches activity to extract the date, amount, and city (three separate Matches activities).

a. Date extraction

b. Amount extraction

c. City extraction

This is how you can extract the values. Then you can use a validation condition to check whether the values from the two PDFs are the same.
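
For anyone who wants to see the same extract-and-compare idea outside UiPath, here is a minimal Python sketch. It assumes the PDF text has already been read into a string. The labels ("Amount", "City"), the dd/mm/yyyy date format, and the function names are assumptions for illustration only; the real patterns depend on how your invoice text is laid out.

```python
import re

# Hypothetical patterns: adjust them to the actual text of your invoices.
PATTERNS = {
    "date": re.compile(r"\b(\d{2}/\d{2}/\d{4})\b"),
    "amount": re.compile(r"Amount\s*:?\s*([\d,]+\.\d{2})"),
    "city": re.compile(r"City\s*:?\s*([A-Za-z ]+)"),
}

def extract_fields(text):
    """Return the first match for each field, or None when nothing matches."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1).strip() if match else None
    return fields

def same_invoice(text_a, text_b):
    """True only when every field was found in both texts and the values agree."""
    a = extract_fields(text_a)
    b = extract_fields(text_b)
    return all(a[k] is not None and a[k] == b[k] for k in PATTERNS)
```

Treating a missing field as a mismatch is a deliberate choice here, so that two invoices where nothing could be extracted are not reported as identical.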

Hope it helps you

Regards

Nived N

Happy Automation

Hi @NIVED_NAMBIAR
Thank you so much. This is great and really helpful. I have one more question.
What if the formats of the PDFs/invoices are different? That is, the same field appears at different positions in different PDFs.

Cheers,
Sri

The regex pattern will still identify it, because it matches on the text itself and not on the position in the document.

A small whitespace issue may arise; apart from that, the regex pattern works fine.
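
To illustrate the whitespace point, here is a small Python sketch. The "Amount" label and the function names are hypothetical. It shows two ideas: allowing optional whitespace in the pattern, and normalizing extracted values before comparing them.

```python
import re

# Hypothetical label; \s* tolerates spaces or line breaks around the colon.
AMOUNT = re.compile(r"Amount\s*:?\s*([\d,]+\.\d{2})")

def find_amount(text):
    """Search the whole text, so the position of the field does not matter."""
    match = AMOUNT.search(text)
    return match.group(1) if match else None

def normalize(value):
    """Collapse whitespace runs and ignore case before comparing values."""
    return re.sub(r"\s+", " ", value).strip().lower()
```

With this, a city read as "New  Delhi" from one PDF and "new delhi" from another would compare as equal after normalization.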

Great, thank you @NIVED_NAMBIAR. This covers the case of two PDFs, but we generally have close to 1000 PDFs. So ideally the bot has to identify the invoices that are similar by comparing them against each other. Is this feasible with UiPath?
Also, a few of the invoices are in JPG or PNG format.
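
One possible way to think about the 1000-invoice case, sketched in Python under assumptions: instead of comparing every invoice against every other one, group the files by their extracted values and report the groups with more than one member. The function name and the (date, amount, city) tuple are hypothetical, and the field values are assumed to be extracted already. Image invoices (JPG/PNG) would need an OCR step first to produce text; that step is not shown here.

```python
from collections import defaultdict

def group_similar(invoices):
    """Group file names by extracted values.

    `invoices` maps a file name to its (date, amount, city) tuple.
    Returns only the groups that contain two or more files.
    """
    groups = defaultdict(list)
    for name, key in invoices.items():
        groups[key].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Each file is looked at once, so the work grows with the number of invoices and not with the number of pairs.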