UiPath DU process each document

Ritaman_Baral · July 25, 2024, 8:03pm

In the UiPath DU Framework, I can see that it loops through each classification result.
What is the significance of this?
Is there a good tutorial for an end-to-end understanding of the Document Understanding framework?

ashokkarale · July 25, 2024, 8:18pm

@Ritaman_Baral,

This video tutorial should be enough for learning.

Thanks,
Ashok

postwick · July 25, 2024, 8:33pm

It’s looping through the classification result because that’s how you get to the information about how the document was split up into separate documents, the pages each document is on, etc.

Here is what the classification result looks like:

ClassificationResult[3] 
{ ClassificationResult 
	{ ClassifierName="", Confidence=1, DocumentBounds=ResultsDocumentBounds 
		{ PageCount=1, StartPage=0, TextLength=746, TextStartIndex=0 }, 
		DocumentId="Realty Trust", 
		DocumentTypeId="CreditOps.APPRAISAL-ENVIRON-COLVAL.Acknowlegement", 
		OcrConfidence=0, Reference=null }, 
ClassificationResult 
	{ ClassifierName="", Confidence=1, DocumentBounds=ResultsDocumentBounds 
		{ PageCount=6, StartPage=1, TextLength=26057, TextStartIndex=748 }, 
		DocumentId="Realty Trust", 
		DocumentTypeId="CreditOps.APPRAISAL-ENVIRON-COLVAL.AppraisalReview", 
		OcrConfidence=0, Reference=null }, 
ClassificationResult 
	{ ClassifierName="", Confidence=1, DocumentBounds=ResultsDocumentBounds 
		{ PageCount=38, StartPage=7, TextLength=273902, TextStartIndex=26807 }, 
		DocumentId="Realty Trust", 
		DocumentTypeId="CreditOps.APPRAISAL-ENVIRON-COLVAL.AVM", 
		OcrConfidence=0, Reference=null } }

And here is an example of where I’ve looped through it to actually split the original PDF into separate PDFs:
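
For readers outside UiPath Studio, the same idea can be sketched in plain Python. This is only an illustration, not the workflow from this post and not the UiPath API: the `DocumentBounds` and `ClassificationResult` classes and the `page_ranges` helper are hypothetical stand-ins that mirror the `StartPage`, `PageCount`, and `DocumentTypeId` fields in the dump above. It assumes `StartPage` is zero-based, as the dump suggests.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DocumentBounds:
    """Hypothetical stand-in for ResultsDocumentBounds."""
    start_page: int  # zero-based index of the first page (StartPage)
    page_count: int  # number of pages in this document (PageCount)


@dataclass
class ClassificationResult:
    """Hypothetical stand-in holding only the fields used here."""
    document_type_id: str
    bounds: DocumentBounds


def page_ranges(results: List[ClassificationResult]) -> List[Tuple[str, int, int]]:
    """Return (document type, first page, last page) for each result.

    Page numbers are zero-based and inclusive.
    """
    ranges = []
    for result in results:
        first = result.bounds.start_page
        last = first + result.bounds.page_count - 1
        ranges.append((result.document_type_id, first, last))
    return ranges


# Values copied from the classification result shown above.
results = [
    ClassificationResult(
        "CreditOps.APPRAISAL-ENVIRON-COLVAL.Acknowlegement", DocumentBounds(0, 1)
    ),
    ClassificationResult(
        "CreditOps.APPRAISAL-ENVIRON-COLVAL.AppraisalReview", DocumentBounds(1, 6)
    ),
    ClassificationResult(
        "CreditOps.APPRAISAL-ENVIRON-COLVAL.AVM", DocumentBounds(7, 38)
    ),
]

for doc_type, first, last in page_ranges(results):
    print(f"{doc_type}: pages {first} to {last}")
```

Each tuple gives the page range you would hand to whatever PDF-splitting step you use. If classification returns a single result, the one range simply covers every page of the input.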

Ritaman_Baral · July 25, 2024, 8:39pm

Awesome! Thanks for the explanation. However, I would like to know two things:

I have a PDF of 42 pages, and it is failing in the Digitize step.
Suppose I am able to digitize a PDF of 20 pages: if I use classificationresult(0), I still get the full PDF in the document.

postwick · July 25, 2024, 9:39pm

I have a PDF of 42 pages, and it is failing in the Digitize step.

Try replacing the OCR activity with the OmniPage one. You’ll have to install the OmniPage package.

Suppose I am able to digitize a PDF of 20 pages: if I use classificationresult(0), I still get the full PDF in the document.

That’s to be expected. If classification fails to identify individual documents, of course you’ll just end up with the whole original PDF.

Ritaman_Baral · July 25, 2024, 9:46pm

In my case, all 20 pages of the PDF are a single document type only!

Ritaman_Baral · July 25, 2024, 9:50pm

My scenario is that I will receive one PDF with 40 pages, and I need to extract data from a single page. However, that single page can't be extracted with a keyword check. I am using the DU framework. I have removed the For Each and am assigning classification(0), but I am getting the correct page. How is that possible? I am dividing the pages into batches of 15 pages.

postwick · July 26, 2024, 1:55pm

I’m not sure what you’re asking. If it’s just one document, then it will either classify it successfully or it won’t. If it doesn’t then you need to fix your classification so it does.

postwick · July 26, 2024, 1:55pm

Why not? You have to give more detail or nobody can help.
