Need help in Document Understanding

akhils · June 24, 2022, 12:35pm

I am using Document Understanding to extract data from several invoices that have different structures. For this we used a mix of the Form Based Extractor and the Machine Learning Extractor, and we made a separate template for each invoice structure.
Problem 1: We are finding it difficult to extract the table items properly (Description of goods and Amount). We are getting very low confidence scores, even after using the Machine Learning Extractor. In one PDF, it extracted only 3 out of 6 fields.

We have used a Throw activity to throw an exception if the confidence score is less than 0.9, with a message saying that manual intervention is needed. We then validate the data in the Present Validation Station.
We used the Train Classifiers Scope activity to capture the human-validated data (with the Keyword Based Classifier Trainer) and then exported the validated extraction results to get a DataSet automatically. We are then able to write the data to Excel using Append Range.
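
For readers who want to see the confidence check spelled out, here is a minimal sketch of the logic in Python. It is not UiPath code: the `ExtractedField` class, the `needs_manual_review` helper, and the sample values are hypothetical and only illustrate the "below 0.9 means manual intervention" rule described above.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.9  # threshold quoted in the post


@dataclass
class ExtractedField:
    """Hypothetical stand-in for one extracted field and its score."""
    name: str
    value: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def needs_manual_review(fields, threshold=CONFIDENCE_THRESHOLD):
    """Return the names of fields whose confidence is below the threshold."""
    return [f.name for f in fields if f.confidence < threshold]


# Illustrative values only, not taken from a real invoice.
fields = [
    ExtractedField("Description of goods", "Steel bolts", 0.95),
    ExtractedField("Amount", "1200.00", 0.62),
]

low = needs_manual_review(fields)
if low:
    print("Manual intervention needed for:", ", ".join(low))
```

Checking each field separately, rather than one score for the whole document, makes it easier to see which table columns are pulling the confidence down.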

Problem 2: Here we used the Present Validation Station, which is not practical in production, so we will probably remove it once we are confident in the extraction. We want to extract the data accurately.
Problem 3: I want to extract data only from the first page of every invoice. How is that possible when the pages are not labeled (there is no "page 1" or "page 2" written on them)?
Please help me solve these.

Dr_Anand_Upadhyay · April 3, 2024, 7:01am

@akhils,
Solution 1: Use AI Center for table extraction. Define the table fields carefully there and provide proper training data. Try to use a good amount of data for training.

Solution 2: Performance depends entirely on the training data and on the ML pipeline you create, so set those up properly. The taxonomy is also important.

Solution 3: In this case, take the training data and fields from the first page only. Alternatively, split the PDF in UiPath and supply only the first page to your ML or form-based extractors.
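
One way to read the first option in Solution 3 is as a filter on the extracted values by page. The sketch below shows that idea in plain Python, assuming each extracted value carries a zero-based page index. The record layout (`field`, `value`, `page`) and the `first_page_only` helper are hypothetical and are not the UiPath extraction result format.

```python
def first_page_only(results, first_page_index=0):
    """Keep only the values that were found on the first page.

    Assumes every record has a zero-based "page" key (hypothetical layout).
    """
    return [r for r in results if r["page"] == first_page_index]


# Illustrative records only, not real extraction output.
results = [
    {"field": "Invoice Number", "value": "INV-001", "page": 0},
    {"field": "Amount", "value": "1200.00", "page": 0},
    {"field": "Amount", "value": "300.00", "page": 1},
]

kept = first_page_only(results)
for record in kept:
    print(record["field"], "=", record["value"])
```

This does not need the page to be labeled in the document itself, since the page index comes from where the value was found rather than from any printed page number.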
