UiPath DU process each document

In the UiPath DU Framework, I can see that it loops through each classification result.
What is the significance of this?
Is there a good tutorial for an end-to-end understanding of the Document Understanding framework?

@Ritaman_Baral,

This video tutorial should be enough for learning.

Thanks,
Ashok

It’s looping through the classification result because that’s how you get to the information about how the document was split up into separate documents, the pages each document is on, etc.

Here is what the classification result looks like:

ClassificationResult[3] 
{ ClassificationResult 
	{ ClassifierName="", Confidence=1, DocumentBounds=ResultsDocumentBounds 
		{ PageCount=1, StartPage=0, TextLength=746, TextStartIndex=0 }, 
		DocumentId="Realty Trust", 
		DocumentTypeId="CreditOps.APPRAISAL-ENVIRON-COLVAL.Acknowlegement", 
		OcrConfidence=0, Reference=null }, 
ClassificationResult 
	{ ClassifierName="", Confidence=1, DocumentBounds=ResultsDocumentBounds 
		{ PageCount=6, StartPage=1, TextLength=26057, TextStartIndex=748 }, 
		DocumentId="Realty Trust", 
		DocumentTypeId="CreditOps.APPRAISAL-ENVIRON-COLVAL.AppraisalReview", 
		OcrConfidence=0, Reference=null }, 
ClassificationResult 
	{ ClassifierName="", Confidence=1, DocumentBounds=ResultsDocumentBounds 
		{ PageCount=38, StartPage=7, TextLength=273902, TextStartIndex=26807 }, 
		DocumentId="Realty Trust", 
		DocumentTypeId="CreditOps.APPRAISAL-ENVIRON-COLVAL.AVM", 
		OcrConfidence=0, Reference=null } }

And here is an example of where I’ve looped through it to actually split the original PDF into separate PDFs:

image
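
For readers who cannot see the screenshot, the page arithmetic behind such a loop can be sketched in Python. This is a minimal sketch, not the DU framework's own code: the `DocumentBounds` and `ClassificationResult` dataclasses are hypothetical stand-ins that mirror only the fields visible in the dump, and `page_ranges` is an illustrative helper. It converts each result's `StartPage` and `PageCount` into the zero-based page range that a split step would copy into its own PDF.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DocumentBounds:
    # Hypothetical stand-in mirroring two fields from the dump.
    start_page: int   # zero-based index of the first page
    page_count: int   # number of pages in this sub-document


@dataclass
class ClassificationResult:
    # Hypothetical stand-in; the real UiPath type has more fields.
    document_type_id: str
    bounds: DocumentBounds


def page_ranges(results: List[ClassificationResult]) -> List[Tuple[str, int, int]]:
    """Return (document type, first page, last page) per result, zero-based inclusive."""
    ranges = []
    for result in results:
        first = result.bounds.start_page
        last = first + result.bounds.page_count - 1
        ranges.append((result.document_type_id, first, last))
    return ranges


# Values copied from the classification result in this thread.
results = [
    ClassificationResult("CreditOps.APPRAISAL-ENVIRON-COLVAL.Acknowlegement", DocumentBounds(0, 1)),
    ClassificationResult("CreditOps.APPRAISAL-ENVIRON-COLVAL.AppraisalReview", DocumentBounds(1, 6)),
    ClassificationResult("CreditOps.APPRAISAL-ENVIRON-COLVAL.AVM", DocumentBounds(7, 38)),
]

for doc_type, first, last in page_ranges(results):
    print(f"{doc_type}: pages {first}-{last}")
```

Each tuple gives the page span for one output file, so the split itself only needs a PDF tool that can copy a page range.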


Awesome! Thanks for the explanation. However, I would like to know two things:

  1. I have a 42-page PDF, and it fails in the Digitize step.
  2. Suppose I am able to digitize a 20-page PDF. If I take classificationresult(0), I still get the full PDF in the document.
  1. I have a 42-page PDF, and it fails in the Digitize step.

Try replacing the OCR activity with the OmniPage one. You’ll have to install the OmniPage package.

  2. Suppose I am able to digitize a 20-page PDF. If I take classificationresult(0), I still get the full PDF in the document.

That’s to be expected. If classification fails to identify individual documents, of course you’ll just end up with the whole original PDF.
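
To see why index 0 returns everything in that case, here is a small Python sketch under assumed values. If the classifier returns a single result for a 20-page PDF, that result starts at page 0 and covers all 20 pages, so element 0 is the whole file. The tuple layout, the `"SomeType"` label, and the `pages_for` helper are hypothetical and are not part of UiPath.

```python
from typing import List, Tuple

# Each result is modeled as (document_type_id, start_page, page_count).
# This tuple layout is an assumption for illustration only.
Result = Tuple[str, int, int]


def pages_for(results: List[Result], index: int) -> List[int]:
    """Return the zero-based page numbers covered by results[index]."""
    _, start_page, page_count = results[index]
    return list(range(start_page, start_page + page_count))


# One result for a 20-page PDF: nothing was split off.
unsplit: List[Result] = [("SomeType", 0, 20)]
print(len(pages_for(unsplit, 0)))
```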

In my case, all 20 pages of the PDF are a single document type only.

My scenario is that I will receive one PDF with 40 pages, and I need to extract data from a single page. However, that single page can't be extracted with a keyword check. I am using the DU framework. I have removed the For Each and am assigning classification result (0), but I am getting the correct page. How is that possible? I am dividing the pages into batches of 15.

I’m not sure what you’re asking. If it’s just one document, then it will either classify it successfully or it won’t. If it doesn’t then you need to fix your classification so it does.

Why not? You have to give more detail or nobody can help.
