Generative Extraction & Classification using Document Understanding in cross-platform projects - Public Preview

@balaraman.ramiya updated the note to make it clearer, hope it’s better now :slight_smile:

@islam.spaho very strange, thanks for reporting, looks like a small bug - we will try to reproduce it and check what’s wrong - if you can think of any tips for us, do let us know :slight_smile:

@islam.spaho we cannot reproduce the issue :frowning: Would you mind scheduling a call with me to go through it? Please forward me a meeting invite at monica.secelean@uipath.com

May I know if there is a way to find out the question (i.e., value) in the Classify Document Activity?
What is the name of the dictionary holding the Key-Value pair in the Classify Document Activity?

Hi @Monica_Secelean ,

May I ask what is the suggested prompt for table processing?

I used the prompts below, and all of them returned an empty value.

1.) Extract all data in table including header in json format
2.) Extract all data in table including header in json format (Qty, Description, Unit Price, Amount)
3.) What are those value in table

Hi,

This is an exciting package, and I’m having some real fun with it in Studio. However, when trying to test it out unattended on some data, with the older DU ML activities pack also installed on the latest preview, I reliably get this error:

Could not load file or assembly ‘UiPath.DocumentUnderstanding.Persistence, Version=6.12.0.0, Culture=neutral, PublicKeyToken=null’. The system cannot find the file specified.

System.IO.FileNotFoundException: Could not load file or assembly ‘UiPath.DocumentUnderstanding.Persistence, Version=6.12.0.0, Culture=neutral, PublicKeyToken=null’. The system cannot find the file specified. at UiPath.IntelligentOCR.Activities.CreateDocumentClassificationAction…ctor()

Unless I remove the new DU 2.3.1 package, in which case it works fine.

Is this intended?

@ababab2828 not for the moment - what would be your use case/why would you need it?

We don’t yet have a way for handling tables - for this particular use case, we currently recommend using specialized models.

@mlellison maybe try rephrasing the prompts as questions, like: Can you extract the table data in a json format? I can’t say for sure that it will work, but it’s worth a try (truthfully, the solution works for some tables, while it fails for others)

@g.ward we recommend using the DocumentUnderstanding package on its own, not in combination with the IntelligentOCR & ML packages; that said, it shouldn’t actually break the workflow if they are used together - it’s just that the frameworks of the two differ a lot, and we plan to invest mostly in the DocumentUnderstanding package.
Maybe try a workflow using only the DocumentUnderstanding package? Or is there anything stopping you from doing so (a missing feature, probably, as we are working towards adding those!)?
Looking forward to hearing from you! :slight_smile:

Perhaps to find a way to document the question that produces the classification for a specific Document Type. That would make it easier to look back from the Output panel and see how to improve the question when the result does not meet expectations while processing a stream of different document types. Otherwise, one has to go back to the Classify activity to check the prompt.

Thanks @Monica_Secelean

May I ask where we can find updates about the fix for table processing?

For those who are interested in table extraction, I followed the video below and was able to extract table data (the prompt gives good accuracy, but sometimes misses the last data row)

Hi Monica,

Thanks for the reply.

You are correct, it’s to augment an existing old process. I’ll look into how easy it would be to convert the old stuff to the new package.

In terms of other features, will the classifier ever return multiple results, e.g. if it was 80% sure of one category and 60% sure of another? Alternatively, could it return page-by-page results for when we have multiple documents stitched together? Currently we have a nasty PDF split algorithm to deal with such things.

Thanks,

Gareth
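
For the stitched-PDF case described above, here is a minimal Python sketch of how page-by-page results could be turned into document boundaries. It assumes a per-page list of (label, confidence) pairs is already available from somewhere; the classifier does not return this today, and `split_by_label` is a hypothetical helper, not a UiPath function.

```python
from itertools import groupby
from typing import List, Tuple

# (label, confidence) for one page; how these are obtained is an assumption.
PageResult = Tuple[str, float]
# (label, first_page, last_page, lowest_confidence), pages are 1-based.
Segment = Tuple[str, int, int, float]


def split_by_label(pages: List[PageResult]) -> List[Segment]:
    """Group consecutive pages that share a label into one segment each."""
    segments: List[Segment] = []
    page_no = 1
    for label, run in groupby(pages, key=lambda page: page[0]):
        run_pages = list(run)
        first = page_no
        last = page_no + len(run_pages) - 1
        lowest = min(conf for _, conf in run_pages)
        segments.append((label, first, last, lowest))
        page_no = last + 1
    return segments


if __name__ == "__main__":
    sample = [("Invoice", 0.8), ("Invoice", 0.9), ("Receipt", 0.6)]
    for segment in split_by_label(sample):
        print(segment)
```

One limitation of this approach: two documents of the same type placed back to back would be merged into a single segment, so it only helps when neighbouring documents differ in type.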

@ababab2828 you are right in the sense that you would need to test and modify your prompts until the proper results are achieved. If you want some reference to the prompts, would it help to save them as variables and provide those variables as the prompt input to the Classify Document Activity?
Something like:

Document Type Name: Invoice
Prompt: invoicePrompt
where invoicePrompt is a variable where the prompt is persisted

While you cannot do this yet, we can think about enabling it if it helps your use case - what do you think?
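
A rough sketch of the idea above, written in Python purely for illustration. Nothing here is a UiPath API: the prompt texts, `PROMPTS`, and `describe_result` are hypothetical names. The point is only that keeping each prompt under a name keyed by document type lets the prompt be logged next to the classification result.

```python
# Hypothetical illustration: none of these names belong to any UiPath API.
invoice_prompt = "Is this document an invoice issued by a supplier?"
receipt_prompt = "Is this document a receipt for a completed payment?"

# Document type name -> prompt used to classify it.
PROMPTS = {
    "Invoice": invoice_prompt,
    "Receipt": receipt_prompt,
}


def describe_result(document_type: str) -> str:
    """Return a log line pairing a classification result with its prompt."""
    prompt = PROMPTS.get(document_type, "<no prompt recorded>")
    return f"Classified as {document_type!r} using prompt: {prompt}"


if __name__ == "__main__":
    print(describe_result("Invoice"))
```

With something like this, a poor classification seen in the output can be traced back to the exact prompt wording without reopening the activity.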

@mlellison there are no updates with regard to the activities - I was just suggesting that you try different prompts and see how they work, because officially we don’t yet provide table support for generative extraction, although we are looking into it :slight_smile: For the moment, we recommend using specialized models for table extraction.

Hi @g.ward - we are looking into how we can provide a migration path from the old to the new package - but as the new package is still catching up on feature parity, it will take us a while until we can do so :slight_smile:

For the moment, the classifier returns one result only - but we are looking at providing splitting capabilities to it soon :crossed_fingers:
I’m sorry I don’t have better news - we are working on cool stuff though and will be happy to update you once we’re ready to release! :rocket: Till then, please keep the feedback coming - it helps shape the product :star:

@islam.spaho one question: are you using a Community or an Enterprise account?

This sounds good and will definitely help!

This is good. However, organizations are concerned about the security aspects of using Gen AI. Is it safe for organization data?

I’m trying to use Extract Document Data, and when running in debug I get the following error:

Extract Document Data: Unable to cast object of type ‘UiPath.IntelligentOCR.StudioWeb.Activities.DataExtraction.DocumentData`1[UiPath.IntelligentOCR.StudioWeb.Activities.SWEntities.CustomGptDocumentTypeC737D34CB07B4D25AceeB7E566F41F35.Bundle.CustomGptDocumentTypeC737D34CB07B4D25AceeB7E566F41F35]’ to type ‘UiPath.IntelligentOCR.StudioWeb.Activities.DataExtraction.IDocumentData`1[UiPath.IntelligentOCR.StudioWeb.Activities.SWEntities.CustomGptDocumentType7D89Cc4075Fc4A929517Fdb1E3Bce64D.Bundle.CustomGptDocumentType7D89Cc4075Fc4A929517Fdb1E3Bce64D]’.

This was developed in Studio (Desktop), and I’m not sure where I am going wrong with this.