I need help getting on the right path toward the best solution for this:
We need to extract per-item data (item name/description/number, amount) from hundreds of different scanned invoices (images) with varying structures.
Even some advice on the general workflow and the UiPath technologies to use would be a great help. If there are external solutions that can be purchased, we are open to that option as well (although I would like to try doing it on my own first to understand it better).
For this kind of scenario, you will likely want Document Understanding, a UiPath capability for reading documents and extracting information from them.
There is a framework template for it that makes the automation easier to build and includes transaction management, human in the loop, and exception handling.
For your scenario, I think Document Understanding is the right choice, since it can extract data from different kinds of documents (scanned images, .png, .tiff, etc.) using AI capabilities. DU also offers several extractors:
Intelligent Form Extractor: Suitable as long as the layout of the document and the alignment of the data stay the same. It can also identify signatures and handwritten fields.
Form Extractor: Similar to the one above, but it cannot identify handwritten or signature fields.
Regex Based Extractor: Suitable for small use cases.
ML Extractor: The most advanced one, and the best fit for high document volumes or largely unstructured formats.
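To give a feel for what a regex-based approach involves, and why it only suits small, predictable cases, here is a minimal Python sketch. It is not UiPath's Regex Based Extractor or any UiPath API. The sample text, the pattern, and the function name `extract_line_items` are all hypothetical, and it assumes OCR has already turned the scanned invoice into plain text with one line item per row.

```python
import re
from decimal import Decimal

# Hypothetical OCR output for a single invoice (illustrative data only).
SAMPLE_TEXT = """\
Invoice No: INV-2041
1  Widget A      12.50
2  Gadget B   1,300.00
Total         1,312.50
"""

# Assumed row layout: <item number> <item name> <amount with two decimals>.
LINE_ITEM = re.compile(
    r"^\s*(?P<number>\d+)\s+(?P<name>.+?)\s+(?P<amount>\d[\d,]*\.\d{2})\s*$",
    re.MULTILINE,
)


def extract_line_items(text):
    """Return one dict per row that matches the assumed line-item layout."""
    items = []
    for match in LINE_ITEM.finditer(text):
        items.append(
            {
                "number": int(match.group("number")),
                "name": match.group("name").strip(),
                # Drop thousands separators before parsing the amount.
                "amount": Decimal(match.group("amount").replace(",", "")),
            }
        )
    return items


if __name__ == "__main__":
    for item in extract_line_items(SAMPLE_TEXT):
        print(item)
```

A pattern like this breaks as soon as a supplier moves a column or formats amounts differently, which is why it does not scale to hundreds of invoice layouts.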
Hi @kaze, as already mentioned, we would suggest using the Document Understanding capabilities available in UiPath.
The scenario you describe, with documents of varying structure, should be handled more accurately by the Machine Learning models available in AI Center.
Narrowing it down, the Document Understanding model should be the best suited to your case.
Unless you have a fixed number of document templates to process, there is probably no better option.
If you do have a fixed set of templates, however, you could also integrate either ABBYY FlexiCapture or the Form Extractor into your solution.
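The selection advice in this thread can be summarised as a small decision helper. This is only a sketch of the reasoning in plain Python, not a UiPath feature. The function name `suggest_extractor` and its parameters are hypothetical, and real projects often combine several extractors.

```python
def suggest_extractor(
    fixed_layout: bool,
    handwriting_or_signatures: bool = False,
    small_use_case: bool = False,
) -> str:
    """Map the rules of thumb from this thread to an extractor name."""
    if not fixed_layout:
        # Varying structures or high volume: the ML-based option.
        return "ML Extractor"
    if handwriting_or_signatures:
        # Fixed layout, but handwritten or signature fields are present.
        return "Intelligent Form Extractor"
    if small_use_case:
        # Small, predictable inputs can be handled with patterns.
        return "Regex Based Extractor"
    # Fixed layout, printed fields only.
    return "Form Extractor"


if __name__ == "__main__":
    # Hundreds of invoices with varying structures, as in the question.
    print(suggest_extractor(fixed_layout=False))
```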