Is there a way to extract invoice information such as Invoice No, PO No, Address, Amount, etc. from multiple vendors with multiple invoice patterns? The data is not standardized, and the position of each field varies from one invoice to another.
What is the best way to solve this problem?
If anyone has dealt with this kind of real-world scenario, please advise.
Answer to your question:
If the number of invoice types is small enough to count (say, 10 different layouts), then Regex is the better fit, with some if conditions to decide which Regex expression to use for each layout.
If there are many invoice types and the volume of invoices is also large (more than 100), then I would advise going with a tool that extracts values based on machine vision.
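To make the Regex idea concrete, here is a minimal sketch in Python (in UiPath you would do the same with If/Switch activities and Regex matches). The vendor names, marker strings, and field patterns are all made up for illustration; you would replace them with patterns written against your own invoices, after the PDF text has been read or OCR'd.

```python
import re

# Hypothetical layouts: each vendor has a "marker" pattern that identifies
# the layout and one pattern per field. All names and patterns are assumptions.
VENDOR_PATTERNS = {
    "acme": {
        "marker": re.compile(r"ACME Corp", re.IGNORECASE),
        "fields": {
            "invoice_no": re.compile(r"Invoice\s*No[.:]?\s*([A-Z0-9-]+)", re.IGNORECASE),
            "po_no": re.compile(r"\bPO\s*No[.:]?\s*([A-Z0-9-]+)", re.IGNORECASE),
            "amount": re.compile(r"Total\s*Due:?\s*\$?([\d,]+\.\d{2})", re.IGNORECASE),
        },
    },
    "globex": {
        "marker": re.compile(r"Globex", re.IGNORECASE),
        "fields": {
            "invoice_no": re.compile(r"Inv\s*#\s*(\d+)"),
            "po_no": re.compile(r"Purchase Order\s*:\s*(\S+)"),
            "amount": re.compile(r"Amount Payable\s*:\s*EUR\s*([\d.,]+)"),
        },
    },
}


def extract_invoice_fields(text):
    """Pick the layout whose marker appears in the text, then apply its patterns.

    Returns a dict of field values (None for fields that did not match),
    or None when no known layout is recognised.
    """
    for vendor, layout in VENDOR_PATTERNS.items():
        if layout["marker"].search(text):  # the "if condition" choosing the Regex set
            result = {"vendor": vendor}
            for name, pattern in layout["fields"].items():
                match = pattern.search(text)
                result[name] = match.group(1) if match else None
            return result
    return None  # unknown layout: route to manual review


acme_text = "ACME Corp\nInvoice No: INV-1001\nPO No: PO-77\nTotal Due: $1,250.00"
globex_text = "Globex GmbH\nInv # 5582\nPurchase Order: 4500012\nAmount Payable: EUR 980,50"

print(extract_invoice_fields(acme_text))
print(extract_invoice_fields(globex_text))
```

Each new vendor means another entry in the table, which is why this only stays manageable while the number of layouts is small.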
First, I suggest you try out Rossum (https://rossum.ai/). They are the leaders in this space, and currently the ones every PDF extractor wants to beat.
Second, I would try AI Builder from Microsoft as well.
UiPath was quite late in implementing Document Understanding and is still catching up in this space, but the way it integrates with RPA robots makes it interesting. There is a course on UiPath Academy and plenty of YouTube videos on how to get started with UiPath's Document Understanding. I am new to this UiPath offering myself.
Nonetheless, every intelligent document parser today has very good API documentation, which you can use to build custom integrations in your UiPath Robot.