Automatic (Sales) Order Entry received via PDF

Dear UiPath Community,

We are currently looking into automating our sales order entry. That means we receive requests from our customers as PDFs that include information such as:

  • Sold To (Address)
  • Ship To (Address)
  • Purchase Order No (Number)
  • Quantity (Number)
  • Material (Text)

As we have various customers, these PDFs are not all the same. Some are pure (text-based) PDFs and some are image-based PDFs, and the position of the text also differs from customer to customer.

I was curious, as I have already seen invoice automation, which I thought was much the same, but I haven’t seen anything for sales orders.

What would be your approach?

Work with Read PDF / OCR to capture the information? If yes, how do you deal with languages such as English, German & Dutch? And is there any possibility to “teach” the robot over time?

PS: I am looking for a solution that can be built within UiPath, but I am open to other suggestions.

Many thanks in advance & keep on rockin’,
Freddy

You could look at using ABBYY FlexiCapture and its template-based OCR, but this limits you in that you would need to define a template for every existing customer, and for every new customer as well.

To deal with non-standard input types like this, you’re generally best off looking at some other service to interpret the inputs and then using its output as a standardised input for UiPath.

An example of this would be Invoice OCR Software: Invoice Scanning & Data Extraction API | Xtracta
This was literally just the first result I got from a search, so I’m not endorsing this company but something like them will likely help you.

Thank you for that idea! ABBYY would require additional licenses, correct?

Has anybody tried defining templates within UiPath, just for text-readable PDFs, and would like to share some experiences or challenges?

Well, first off you would probably need to capture and define all possible outcomes/vendors that you are receiving invoices from. I don’t know enough about machine learning to talk about it, but it is an interesting topic.

What I have been doing is this: I open the PDF, search for some image that is unique to that kind of invoice, and once it is recognized I mostly rely on string manipulation and regex to get the results that I want.
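
To make the regex idea a bit more concrete, here is a rough sketch in Python rather than a UiPath workflow. It assumes the PDF text has already been extracted (via Read PDF Text or OCR) and it looks for a unique text marker instead of a unique image. The customer names, markers, and label variants (English, German, Dutch) are all made up for illustration, so treat it as a starting point and not a finished solution.

```python
import re

# Hypothetical markers: a string that only appears on one customer's PDFs.
CUSTOMER_MARKERS = {
    "acme": "ACME GmbH",
    "globex": "Globex B.V.",
}

# Hypothetical label variants in English, German, and Dutch.
FIELD_PATTERNS = {
    "purchase_order_no": re.compile(
        r"(?:Purchase Order No|PO No|Bestellnummer|Inkoopordernummer)"
        r"\.?\s*[:#]?\s*([A-Z0-9-]+)",
        re.IGNORECASE,
    ),
    "quantity": re.compile(
        r"(?:Quantity|Qty|Menge|Aantal)\.?\s*:?\s*(\d+)",
        re.IGNORECASE,
    ),
    "material": re.compile(
        r"(?:Material|Materiaal)\.?\s*:?\s*(.+)",
        re.IGNORECASE,
    ),
}


def identify_customer(text):
    """Return the key of the first customer whose marker occurs in the text."""
    lowered = text.lower()
    for customer, marker in CUSTOMER_MARKERS.items():
        if marker.lower() in lowered:
            return customer
    return None


def extract_order_fields(text):
    """Return a dict of field name -> captured value, or None if not found."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1).strip() if match else None
    return result


sample = "ACME GmbH\nBestellnummer: DE-778\nMenge: 12\nMaterial: Stahlwinkel M8\n"
print(identify_customer(sample))
print(extract_order_fields(sample))
```

Anything that comes back as None can be routed to a person for manual entry, and new label variants can be added to the patterns as new customers show up. That is not real "teaching", but it does let the rules grow over time.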