Extract the data from pdf

himanshur · July 20, 2021, 7:20am

Is there a way to extract invoice information such as Invoice No, PO No, Address, Amount, etc. from multiple vendors with multiple invoice patterns? The data is not standardized, and the field positions also vary from one invoice to another.

Which is the best way to solve this problem?

Regex
AI & Machine Learning

If anyone has gone through this kind of real-world scenario, please advise.

Palaniyappan · July 20, 2021, 7:25am

Hi

Since the field positions are not stable, I would suggest going with both of them, that is
- AI/ML
- Regex
as a combined approach, which is possible with Document Understanding.

So please go ahead with Document Understanding.
For more details

Cheers @himanshur

jeevith · July 20, 2021, 7:33am

Hi @himanshur,

Answer to your question:
If the number of invoice types is small enough to count (say, 10 different layouts), then Regex is better suited, with some if conditions to decide which Regex expression to use.
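To make the Regex-plus-if-conditions idea concrete, here is a minimal Python sketch. It assumes the PDF text has already been read into a string (for example by a PDF or OCR step). The vendor names, marker keywords, and patterns are all hypothetical, so real invoices would need their own.

```python
import re

# Hypothetical vendor templates: a marker that identifies the layout,
# plus one regex per field. These patterns are illustrative assumptions.
TEMPLATES = {
    "acme": {
        "marker": re.compile(r"ACME Corp", re.IGNORECASE),
        "fields": {
            "invoice_no": re.compile(r"Invoice\s*No[.:]?\s*(\S+)", re.IGNORECASE),
            "po_no": re.compile(r"PO\s*No[.:]?\s*(\S+)", re.IGNORECASE),
            "amount": re.compile(
                r"Total\s*Amount[.:]?\s*\$?([\d,]+\.\d{2})", re.IGNORECASE
            ),
        },
    },
    "globex": {
        "marker": re.compile(r"Globex", re.IGNORECASE),
        "fields": {
            "invoice_no": re.compile(r"Inv\s*#\s*(\S+)"),
            "po_no": re.compile(r"Purchase\s+Order\s*:\s*(\S+)"),
            "amount": re.compile(r"Amount\s+Due\s*:\s*(?:USD\s*)?([\d,]+\.\d{2})"),
        },
    },
}


def extract_invoice(text):
    """Pick the first template whose marker matches, then apply its field regexes.

    Returns a dict of extracted fields (None for a field that did not match),
    or None when no known vendor layout is recognised.
    """
    for vendor, template in TEMPLATES.items():
        if template["marker"].search(text):
            result = {"vendor": vendor}
            for name, pattern in template["fields"].items():
                match = pattern.search(text)
                result[name] = match.group(1) if match else None
            return result
    return None


if __name__ == "__main__":
    sample = "ACME Corp\nInvoice No: INV-1001\nPO No: PO-77\nTotal Amount: $1,250.00\n"
    print(extract_invoice(sample))
```

Each new invoice layout means one more entry in the template table, which is why this only stays manageable while the number of layouts is small.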

If there are many invoice types and the volume of invoices is also large (more than 100), then I would advise going with a tool that extracts values based on machine vision.

My suggestion:
First, I suggest you try out Rossum (https://rossum.ai/). They are the leaders in this space and the ones every PDF extractor currently wants to beat.
Second, I would try AI Builder from Microsoft as well.

UiPath was quite late in implementing Document Understanding and is still catching up in this space, but the way it integrates with RPA robots makes it interesting. There is a course on UiPath Academy and an ample number of YouTube videos on how to get started with UiPath's Document Understanding. I myself am new to this offering from UiPath.

Nonetheless, every intelligent document parser today has very good API documentation, which you can use to build custom integrations in your UiPath robot.

Hope this helps you brainstorm more.
