Extract data generically from each invoice

The robot will need to go through all the invoices in the folder and extract data generically from each invoice.
Upon completion of the data extraction, it is required to send the invoice by email including the following details in the email content:
  1. Invoice number
  2. Invoice date
  3. The amount of the invoice
Recommended functions to use:

  1. Analyze the digital PDF file
  2. Find - regular expression (REGEX)

Can someone help me with this and send me an example?
Thank you!!!

To read the PDF, use the Read PDF Text activity.

Then use the Matches activity with regular expressions to get the data.

Hey,
True, I know I need to use these activities.
Can you send me an example of exactly how to do this? It has to retrieve the data even though the values are somewhere else on the page each time.
It’s urgent for me, thanks!

Anyone???

please help me

Hi @tamar.m

  1. Create a string array variable, let’s say strarr = Directory.GetFiles(folder path of invoices). This array stores the file paths of all the invoice PDFs.
  2. Use a For Each activity with the argument type set to String to loop through each item in strarr.

  3. Inside the For Each, use the Read PDF Text activity to read each PDF: provide the file path in the Properties panel and store the output in a String variable, let’s say result.

  4. After that, still inside the For Each and after Read PDF Text, use the Matches activity with a regular expression to extract the invoice number, invoice date and invoice amount from result, each one separately.
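
For illustration only, here is a rough sketch of the same flow in Python rather than as a UiPath workflow. The `read_text` callable is a hypothetical stand-in for the Read PDF Text activity, and the `Invoice No` label in the pattern is an assumption about how the invoices are worded, so adjust both to your files.

```python
import re
from pathlib import Path
from typing import Callable, Dict, Optional

# Assumed label wording; change it to whatever your invoices actually print.
INVOICE_NO = re.compile(r"Invoice\s+No\.?\s*:?\s*(\S+)", re.IGNORECASE)


def extract_invoice_numbers(
    folder: str, read_text: Callable[[Path], str]
) -> Dict[str, Optional[str]]:
    """Loop over every PDF in `folder` and pull the invoice number from its text.

    `read_text` is a hypothetical stand-in for the Read PDF Text activity:
    it takes a file path and returns the document text as one string.
    """
    results: Dict[str, Optional[str]] = {}
    # Rough equivalent of Directory.GetFiles(folder) followed by For Each.
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        result = read_text(pdf_path)        # read the text of this PDF
        match = INVOICE_NO.search(result)   # apply the regex to that text
        # Store None when the keyword is missing so the file can be reviewed.
        results[pdf_path.name] = match.group(1) if match else None
    return results
```

The date and the amount would each get their own pattern in the same loop, just as they would each get their own Matches activity.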

That is how you can do it.

Hope it helps you

Regards

Nived N :robot:

Happy Automation :slight_smile::slight_smile:

Thank you!!!
I did the first steps,
but I did not understand the last one.
Can you send me a sample, please? How do I write the regex?
Thanks

hi @tamar.m,

The UiPath Academy module for Document Understanding explains the process and has practical examples that you can download and follow. It gives the exact details that @NIVED_NAMBIAR posted above.

regards
Sats

I looked on the UiPath website but did not find a regex example that does this.
Can you send me an example, please, of retrieving invoice data when the value is somewhere else on the page each time?
Thanks

Hi @tamar.m

Based on the format of the invoice number, you need to create a regex pattern and extract the values.

The Matches activity finds text in a string based on the regex you provide as input.

For your reference, check this workflow.

It extracts three values from each text file using regex, with a separate Matches activity for each of the three values.

Hope you find it useful

Mark it as the solution if it works for you

Nived N :robot:

Happy Automation :slight_smile::slight_smile:

I saw the exercise you sent me,
but let me explain. I have several invoice files in the folder, and the values I need to retrieve are somewhere else on the page each time. How can I catch them each time and find their values?
How do I find the words "invoice number" on the page itself and then find the value next to them?
I hope I explained myself.
Thanks

  1. Invoice number
  2. Invoice date
  3. The amount of the invoice

May I know whether the invoice number pattern remains the same in every file?

If yes, then create a regex for that invoice number pattern and extract it.
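
To show what that means, here is a small Python sketch where the regex describes the shape of the number itself. The `INV-` prefix and the digit count are a made-up format, not taken from your invoices, so replace them with the real one. The pattern uses only basic syntax, so the same pattern text should also be usable in the Matches activity.

```python
import re

# Hypothetical format: "INV-" followed by 4 to 8 digits. Replace with your real format.
INVOICE_NO_SHAPE = re.compile(r"\bINV-\d{4,8}\b")


def find_invoice_number(text):
    """Return the first substring shaped like an invoice number, or None."""
    match = INVOICE_NO_SHAPE.search(text)
    return match.group(0) if match else None


# The position in the text does not matter, only the shape of the value.
top = "INV-20431 issued 03/05/2021\nTotal: 1,250.00"
bottom = "Total: 1,250.00\nissued 03/05/2021\nRef INV-20431"
print(find_invoice_number(top))     # INV-20431
print(find_invoice_number(bottom))  # INV-20431
```

This only works when the value has a recognizable format of its own. Dates and amounts usually need a keyword in front of them to tell them apart from other dates and numbers on the page.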

The files have the same data:
  1. Invoice number
  2. Invoice date
  3. The amount of the invoice

They look the same, but the values are in a different location each time.
Can you send me a regex that extracts the data in the best way?
Thank you

Can you share a sample of the invoice number and amount data?

This is an example of an invoice. What is marked in yellow should be extracted. These values are somewhere else in the invoice every time, and I need to extract them.

Anyone?

Hi @tamar.m

I would suggest going with Document Understanding if you are allowed to use it. It uses an ML extractor to read any type of invoice and gets the data with high accuracy.

But if that is not allowed:
Before getting to the steps, we need to establish what the common keyword before the invoice number is in all files.
If the keyword you highlighted in yellow appears in only a few of your files, then the regex won’t capture the value from the other files.

So please clarify whether the keyword before the invoice number is the same in all files.
If so, we can surely resolve this with regex and the Matches activity.
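
Assuming the keywords really are the same in every file, the idea is to anchor each regex on its keyword and capture whatever follows it, so the position on the page stops mattering. Here is a Python sketch of that idea. The labels `Invoice Number`, `Invoice Date` and `Total Amount` and the value formats are guesses, not taken from your invoice image, so swap in the keywords your files actually use.

```python
import re

# Assumed labels and value formats; adjust to the keywords in your invoices.
FIELD_PATTERNS = {
    # keyword, optional ":" or "#", then letters/digits/dashes/slashes
    "invoice_number": r"Invoice\s+Number\s*[:#]?\s*([A-Za-z0-9\-/]+)",
    # keyword, optional ":", then a date such as 12/03/2021 or 1.4.21
    "invoice_date": r"Invoice\s+Date\s*:?\s*(\d{1,2}[./-]\d{1,2}[./-]\d{2,4})",
    # keyword, optional ":", optional currency symbol, then a number
    "amount": r"Total\s+Amount\s*:?\s*[^\d\s]?\s*(\d[\d,]*(?:\.\d+)?)",
}


def extract_fields(text):
    """Capture the value that follows each keyword, wherever it appears."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        # group(1) is the part in parentheses, i.e. the value after the keyword.
        fields[name] = match.group(1) if match else None
    return fields
```

In UiPath, each pattern would go into its own Matches activity on the text returned by Read PDF Text, and the value would be read from the first capture group of the first match. A field that comes back empty means the keyword was not found in that file.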

Cheers @tamar.m

Hey,
Yes, the same keyword appears in all the documents, so I want to use regex.
Can you send me a sample based on what I sent in the picture?
Thanks