We are in the process of implementing AP invoice automation using OCR and need clarification on the following:
How should we deal with dynamic invoice PDFs, i.e. a varying number of pages, a varying number of invoice lines on each page, varying space occupied by each invoice line, etc.?
Hi Divyashree - thanks for the update.
Yes, the PDFs are scanned images. Could you please clarify what you meant by an individual template?
Thanks!
You can use ABBYY FlexiCapture, a paid tool in which you create a template for each invoice format. It also has ICR options, which help you capture handwritten data.
Divyashreem is exactly right - ABBYY FlexiCapture is a great solution if it’s fine to create a template for each format, and it is a good fit if you also need to process handwritten data.
One alternative to consider is an AI-based OCR API - some of these cloud services have the big advantage that they are template-agnostic and do not have to be configured manually for each invoice format.
There are a few such cloud services available - like Rossum Elis, infrrd.ai, SMACC or ABBYY FlexiCapture for Invoices (a bit different from normal FlexiCapture). The one I would personally suggest (in terms of accuracy, traction and ease of setup) is Rossum Elis, which integrates with UiPath directly and quite easily - for an example, see Data extraction from invoices - Rossum and UiPath | Rossum. If you also have receipts to process, infrrd.ai may do a better job at the moment, though! SMACC might be worth a try if you need line items (though Rossum is introducing line item automation too, soon after writing this).
(Disclaimer: I’m affiliated with Rossum.)
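To make the original question a bit more concrete: once a template-agnostic service returns structured data, a varying number of pages or lines per page mostly stops mattering, because you just loop over whatever came back. Below is a minimal Python sketch of that idea. The `extracted` dictionary and its keys (`pages`, `line_items`, `total`) are a made-up shape for illustration only, not the actual response format of Rossum, infrrd.ai, SMACC or ABBYY, and `collect_line_items` and `totals_match` are hypothetical helper names.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class LineItem:
    description: str
    quantity: Decimal
    unit_price: Decimal

    @property
    def amount(self) -> Decimal:
        return self.quantity * self.unit_price


def collect_line_items(pages):
    """Flatten per-page line items into one list, whatever the page count."""
    items = []
    for page in pages:
        # A page may carry any number of lines, including none at all.
        for raw in page.get("line_items", []):
            items.append(
                LineItem(
                    description=raw["description"],
                    quantity=Decimal(raw["quantity"]),
                    unit_price=Decimal(raw["unit_price"]),
                )
            )
    return items


def totals_match(items, stated_total):
    """Sanity check: do the extracted lines add up to the invoice total?"""
    computed = sum((item.amount for item in items), Decimal("0"))
    return computed == Decimal(stated_total)


# Hypothetical extraction result (assumed shape, not a real vendor format).
extracted = {
    "total": "135.50",
    "pages": [
        {"line_items": [
            {"description": "Paper A4", "quantity": "10", "unit_price": "4.50"},
            {"description": "Toner", "quantity": "1", "unit_price": "80.00"},
        ]},
        {"line_items": []},  # e.g. a page with only terms and conditions
        {"line_items": [
            {"description": "Delivery", "quantity": "1", "unit_price": "10.50"},
        ]},
    ],
}

items = collect_line_items(extracted["pages"])
```

The totals check is optional, but it is a cheap way to flag invoices where a line was probably missed, so they can be routed to manual review instead of being posted automatically.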