Read amount of Invoices in Hebrew

I have many PDF invoices from different vendors. Most of them are written in Hebrew (read from right to left).
I need to extract the data from all of them somehow.
Are there any options other than writing a regex for each type of invoice?
Many thanks

Can I have a look at one, a screenshot if possible?
Usually a PDF comes with encoded details.

Cheers @Slavich

@Palaniyappan,
Sure.
Here is an example of the required part: Invoice no. - ***

If we want to extract them as they are, it's possible with PDF to Word conversion.
We had a case like this in our daily process, where we converted the PDF to a Word doc file and then exported the details one by one
with Microsoft.Office.Interop.Word.

Cheers @Slavich

Well, I don't need to get it as it is. I just need to extract the invoice number as a string or integer…
Maybe there is another option to do that?
Thanks @Palaniyappan

Fine.
May I know how this is done manually,
so that we can repeat the same in UiPath with the relevant activities?

Cheers @Slavich

There is a vendor list in Excel.
Whenever I receive a new mail:

  • I check whether the mail was received from a vendor that is in the list
  • If true
    - then I download the attachments (PDF invoices)
  • For each attachment: open it and extract the invoice number.
  • Move the file and rename it as DateTime.Now + vendor + invoice number.
    That's it.
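
To make the steps above concrete, here is a rough Python sketch of the vendor check and the renaming step only. It is not the actual UiPath workflow: the function names are hypothetical, matching vendors by e-mail address is an assumption (the thread only says the list is in Excel), and the timestamp format is just one reasonable choice.

```python
import re
from datetime import datetime


def is_known_vendor(sender, vendor_emails):
    """Return True if the sender address is in the vendor list (case-insensitive).

    Assumption: vendors are identified by e-mail address.
    """
    known = {v.strip().lower() for v in vendor_emails}
    return sender.strip().lower() in known


def build_filename(vendor, invoice_number, now):
    """Build '<timestamp>_<vendor>_<invoice number>.pdf' from file-name-safe parts."""
    def safe(part):
        # Replace characters that are not allowed in Windows file names.
        return re.sub(r'[\\/:*?"<>|]', "-", part).strip()

    return f"{now:%Y%m%d_%H%M%S}_{safe(vendor)}_{safe(invoice_number)}.pdf"


if __name__ == "__main__":
    vendors = ["billing@acme.example", "invoices@other.example"]
    if is_known_vendor("Billing@Acme.example", vendors):
        print(build_filename("Acme Ltd", "2020/114", datetime.now()))
```

Replacing the slash in the invoice number matters because numbers such as 2020/114 would otherwise be read as a folder path when the file is moved.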

I have built the whole process already. It works through regex matching, but it is not stable enough because of the Hebrew. I also have about 300 vendors, so I am looking for a way not to write 300 regexes manually…

@Palaniyappan, any suggestions on how to avoid the regex approach?

Hmm, OK.
That would be tough, yes.
May I know which regex is used?
One more thing: does the value we want have any fixed terms around it, like a heading or title?

@Slavich

@Palaniyappan,
Yes, currently regexes are used to identify the invoice numbers of several vendors.
No fixed terms around it. Just some regular patterns like: invoice: || invoice number || invoice # || invoice number - || (translated from Hebrew).
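
Since the labels follow a few regular patterns, one generic pattern built from a list of label variants might cover many vendors instead of one regex per vendor. The Python sketch below only illustrates that idea. The Hebrew label spellings are assumptions and should be replaced with the wording that really appears on the invoices, and the function name is hypothetical. It also strips Unicode bidirectional control characters, which text extracted from right-to-left PDFs may contain; whether that is the cause of the instability here is not confirmed.

```python
import re

# Hypothetical label variants. The Hebrew spellings are assumptions and
# must be checked against the real invoices.
LABELS = [
    "invoice number",
    "invoice no.",
    "invoice",
    # Assumed Hebrew for "invoice number"
    "מספר חשבונית",
    # Assumed Hebrew for "tax invoice"
    "חשבונית מס",
    # Assumed Hebrew for "invoice"
    "חשבונית",
]

# Unicode bidirectional control characters (marks, embeddings, isolates).
BIDI_CONTROLS = re.compile("[\u200e\u200f\u202a-\u202e\u2066-\u2069]")

# Try the longest labels first so "invoice number" wins over "invoice".
_alternation = "|".join(
    re.escape(label) for label in sorted(LABELS, key=len, reverse=True)
)

# Label, then optional separators (#, :, -, whitespace), then the number.
INVOICE_RE = re.compile(
    rf"(?:{_alternation})\s*(?:#|:|-|\s)*\s*(\d[\d/-]*\d|\d)",
    re.IGNORECASE,
)


def extract_invoice_number(text):
    """Return the first number that follows a known label, or None."""
    cleaned = BIDI_CONTROLS.sub("", text)
    match = INVOICE_RE.search(cleaned)
    return match.group(1) if match else None


if __name__ == "__main__":
    for sample in ["Invoice number - 12345", "Invoice # 987", "חשבונית מס 4571"]:
        print(extract_invoice_number(sample))
```

This assumes the extracted text is in logical order, with the label before the number. If the PDF extraction returns Hebrew in visual (reversed) order, or puts the number before the label, the pattern will not match and the text would need to be normalized first.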

Is machine learning automation applicable here?

Fine.
Machine learning automation, yeah, that might work. Let's try that.
I saw something about this quite some time back.
Maybe this is that thread

Cheers @Slavich

Thank you @Palaniyappan,
I’ll try this and update you.

@Palaniyappan, as far as I have learned, machine learning extraction works only with English and Spanish invoices.
Other solutions are needed…

Hmm… I think that is the only option we have if the Word doc approach doesn't work.
Maybe in the near future machine learning extraction will support Hebrew as well.

Cheers @Slavich