Pick elements from multiple invoices with different formats

Try using a regular expression.
There are a lot of options available in UiPath.
Here I'm giving an example using regex.
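A minimal sketch of the regex idea, written in Python rather than as a UiPath workflow. The label variants are the ones mentioned in this thread; the `label: value` line layout and the sample text are assumptions for illustration, so a real invoice will likely need a different pattern.

```python
import re

# Label variants taken from this thread; longer variants come first so that
# "Description of work" is not cut short by the plain "Description" variant.
# Assumed layout: "<label>: <value>" or "<label> - <value>" on one line.
LABEL_PATTERN = re.compile(
    r"^\s*(?:Description of Product|Description of work|Description|DESC|Product)"
    r"\s*[:\-]\s*(?P<value>.+?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)

def extract_description(text):
    """Return the text following the first matching label, or None."""
    match = LABEL_PATTERN.search(text)
    return match.group("value") if match else None

# Hypothetical text, as it might come out of a PDF read step.
sample = "Invoice No: 1001\nDescription of work: Roof repair\nTotal: 250.00"
print(extract_description(sample))
```

In UiPath the same pattern could probably be tried in a regex-based activity or with `System.Text.RegularExpressions.Regex`, but the syntax should be checked there, since .NET and Python regex differ in some details.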


@bagishojha Thank you … I will try

Hi @Sailaja_Chikkam,

Are the invoices in PDF files or in Excel files?


PDF file @balupad14

You have 2 options:

1. Read the text from the PDF and manipulate it to get the data.

2. Search the text in the PDF (I am working on it, I will let you know once it is finished.)


Yes… here I am searching for text in the PDF (the text can be Product/DESC/Description/Description of Product/Description of work)… It can be anything, because each invoice is different from the others, but the meaning is the same in all of them. @balupad14

Yes @Sailaja_Chikkam, you are right. We need to create a configuration object (maybe in XML or Excel) and search the text in the PDF based on that. One important thing: the PDF quality should be good, even if it is a scanned one, because a badly scanned PDF will return junk characters.
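To make the configuration idea concrete, here is a rough Python sketch. It assumes the configuration is a two-column sheet (`field`, `label`) exported to CSV. The column names, the labels for the invoice number, and the `label: value` line layout are all made up for illustration.

```python
import csv
import io
import re

# Hypothetical configuration, as an Excel sheet might look when saved as CSV:
# one row per label variant, grouped by the canonical field it stands for.
CONFIG_CSV = """field,label
description,Description of Product
description,Description of work
description,Description
description,DESC
description,Product
invoice_number,Invoice No
invoice_number,Invoice Number
"""

def load_config(csv_text):
    """Group the label variants by canonical field name."""
    config = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        config.setdefault(row["field"], []).append(row["label"])
    return config

def find_fields(text, config):
    """For each field, return the value after the first label variant found."""
    result = {}
    for field, labels in config.items():
        # Try longer labels first so a short variant cannot shadow a long one.
        for label in sorted(labels, key=len, reverse=True):
            pattern = r"^\s*" + re.escape(label) + r"\s*[:\-]\s*(.+?)\s*$"
            match = re.search(pattern, text, re.IGNORECASE | re.MULTILINE)
            if match:
                result[field] = match.group(1)
                break
    return result

config = load_config(CONFIG_CSV)
text = "Invoice Number: A-17\nProduct: Steel bolts\nAmount: 40.00"
print(find_fields(text, config))
```

The point of keeping the labels in a sheet is that a new invoice format only needs a new row, not a change to the workflow.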


How do I create a configuration object in Excel? @balupad14

Hi @Sailaja_Chikkam

Like this


How do you search the text in the pdf?


Find Text Exists? Right? @balupad14

When you have distinct PDF formats, you should use a Switch activity on the relevant title (the invoice title) and do the data manipulation for each format in a separate workflow. That makes it easy and feasible to extract the required information from the different PDF files.
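The switch-on-title idea can be sketched in Python as a lookup from title to handler. This is only an illustration of the control flow, not a UiPath workflow: the two titles, the labels and the handler names are hypothetical, and it assumes the title is the first non-empty line of the extracted text.

```python
def parse_acme(text):
    """Hypothetical layout A: the description is labelled 'DESC:'."""
    return text.split("DESC:", 1)[1].splitlines()[0].strip()

def parse_globex(text):
    """Hypothetical layout B: the description is labelled 'Description of work:'."""
    return text.split("Description of work:", 1)[1].splitlines()[0].strip()

# Plays the role of the Switch cases: one entry per known invoice title,
# each pointing to the routine (separate workflow) for that format.
HANDLERS = {
    "ACME INVOICE": parse_acme,
    "GLOBEX TAX INVOICE": parse_globex,
}

def route_invoice(text):
    """Pick the handler from the first non-empty line (assumed to be the title)."""
    title = next((line.strip() for line in text.splitlines() if line.strip()), "")
    handler = HANDLERS.get(title.upper())
    if handler is None:
        # Plays the role of the Switch default branch.
        raise ValueError(f"Unknown invoice format: {title!r}")
    return handler(text)

print(route_invoice("Acme Invoice\nDESC: Copper wire\nTotal: 12.50"))
```

An unknown title raises an error here; in a real process it would probably be better to route such invoices to manual review.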

How do we use the Switch activity in UiPath? I tried, but I am not able to get it working. Can you please guide me on this?

Hi @Sailaja_Chikkam,

Please check this example…


Hi Sailaja - what is your experience with the success rate of the regular expressions so far?

In case you have a variety of formats, another approach is to plug in a cognitive API component; an AI-based solution has the advantage that it does not have to be manually configured and is template-agnostic. You could then use regular expressions only for invoices that you receive frequently and that the AI gets wrong.

There are a couple of such services available - e.g. ABBYY FlexiCapture for Invoices, infrrd.ai or smacc.
I think the best option though is Rossum Elis, which integrates with UiPath directly and quite easily - for an example, see https://rossum.ai/blog/2018/07/30/automating-data-extraction-from-invoices-using-rossum-api-and-uipath/ . (In your case, it also depends on whether you actually need the line items or that was just an example - Rossum still has line items on the roadmap as of writing this, and the current API automatically extracts just header fields.)

(Disclaimer: I’m affiliated with Rossum.)

Hi @PetrBaudis

Does Rossum Elis upload invoices to their server and store them there for processing? The reason I ask is that there might be security issues. A company might not want its invoices to be stored on another server.

You are right, @SR.H - the invoices are processed server-side in the cloud - of course securely and in a GDPR-compliant way. Invoice processing is commonly outsourced even when it is done by humans, and the security tradeoffs here are the same (or security is actually higher with services like Rossum, because typically no human outside the client company sees the invoice). It then becomes a tradeoff between accuracy, cost savings and confidentiality.

Try to work from a constant string or value in your invoice. You need to pick a particular constant value that always stays the same in every invoice format.

Then you need to do some string manipulation, like IndexOf and Substring, and after that you can process all your invoices with a single piece of code.
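Something like this, sketched in Python, where `str.find` plays the role of IndexOf and slicing plays the role of Substring. The marker strings and the sample text are assumptions; use whatever constant text really appears in all of your invoices.

```python
def text_between(text, start_marker, end_marker):
    """Return the text between two constant markers, or None if one is missing."""
    start = text.find(start_marker)        # like IndexOf: -1 when not found
    if start == -1:
        return None
    start += len(start_marker)             # skip past the marker itself
    end = text.find(end_marker, start)     # look for the end marker after it
    if end == -1:
        return None
    return text[start:end].strip()         # like Substring(start, length)

# Hypothetical invoice text with two constant markers around the value.
invoice = "Invoice No: 1001 Description: Roof repair Total: 250.00"
print(text_between(invoice, "Description:", "Total:"))
```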

Hope you understand what I mean.