Generative extraction & classification in Document Understanding Cloud APIs

We’re happy to announce that our Document Understanding Cloud APIs now support generative extraction & classification!
Do you want to process unstructured documents? Extract information based on a prompt? Process documents without setting up a dedicated specialized model?
Leverage our generative capabilities!
With our latest release, a generative extractor and a generative classifier are available, so you can perform prompt-based classification and extraction via simple API calls.

Generative Classifier
To leverage the Generative Classifier, simply discover it as part of the Predefined project and consume it by providing the input prompts that identify the required Document Types (either synchronously or asynchronously).
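
As a rough illustration of what "providing input prompts for Document Types" could look like on the client side, here is a minimal Python sketch that only builds a request body and sends nothing. The field names (`documentId`, `prompts`, `name`, `description`) and the sample document ID are assumptions made for this example, not confirmed in this post; check the API reference for the actual schema.

```python
import json


def build_classification_request(document_id, document_types):
    """Build a request body for prompt-based classification.

    NOTE: the keys used here are assumptions for illustration only.
    `document_types` maps a Document Type name to the prompt describing it.
    """
    if not document_types:
        raise ValueError("at least one Document Type prompt is required")
    prompts = [
        {"name": name, "description": description}
        for name, description in document_types.items()
    ]
    return {"documentId": document_id, "prompts": prompts}


# Hypothetical document ID and Document Type prompts.
classification_body = build_classification_request(
    "doc-123",
    {
        "Invoice": "A bill issued by a vendor requesting payment",
        "Receipt": "Proof of a payment that has already been made",
    },
)
print(json.dumps(classification_body, indent=2))
```

The same body could then be posted to either the synchronous or the asynchronous endpoint of the classifier.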

Generative Extractor
Similarly, to leverage the Generative Extractor, simply discover it as part of the Predefined project and consume it by providing the input prompts for the fields to be extracted (either synchronously or asynchronously).
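
A minimal Python sketch of assembling field prompts for extraction follows. It assumes each prompt carries an `id` (the field name) and a `question` (the prompt text), which matches the "prompt/question id" wording used in this thread, but the exact key names, the `documentId` key, and the sample values are assumptions. No request is sent.

```python
import json


def build_extraction_request(document_id, fields):
    """Build a request body for prompt-based extraction.

    NOTE: the keys used here are assumptions for illustration only.
    `fields` maps a field id to the question used to extract that field.
    """
    prompts = []
    for field_id, question in fields.items():
        if not question.strip():
            raise ValueError(f"prompt for field {field_id!r} is empty")
        prompts.append({"id": field_id, "question": question})
    return {"documentId": document_id, "prompts": prompts}


# Hypothetical document ID and field prompts.
extraction_body = build_extraction_request(
    "doc-123",
    {
        "invoice_number": "What is the invoice number?",
        "total_amount": "What is the total amount due?",
    },
)
print(json.dumps(extraction_body, indent=2))
```

Keeping the field ids stable across calls makes it easier to map the answers back to your own data model later.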

So, what do you think? :slight_smile: Make sure to give our new features a try & let us know your thoughts!

How do we define the taxonomy while using the Generative Extractor?

The prompt/question IDs will themselves be the taxonomy fields, I believe.

@zell12 is right, the prompts will be the fields indeed.

@nisargkadam23 if you need a specific taxonomy, you can use the Generative Extractor that comes with the IntelligentOCR and DocumentUnderstanding.ML packages: define the taxonomy, configure the Generative Extractor within the IntelligentOCR Data Extraction Scope, and define prompts for the fields you need to extract.

With the APIs, you call the endpoint with specific prompts and get back the answers to those prompts.
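
To make the "prompts in, answers out" idea concrete, here is a small Python sketch that maps each requested prompt ID to its answer. The response shape (a `results` list of objects with `id` and `answer` keys) is purely hypothetical and used only for illustration; the real response structure should be taken from the API reference.

```python
def answers_by_prompt(response, prompt_ids):
    """Map each requested prompt id to its answer, or None if it is missing.

    NOTE: the `results` / `id` / `answer` keys are hypothetical.
    """
    found = {
        item["id"]: item.get("answer")
        for item in response.get("results", [])
    }
    return {prompt_id: found.get(prompt_id) for prompt_id in prompt_ids}


# Hypothetical response for three prompts, one of which got no answer.
sample_response = {
    "results": [
        {"id": "invoice_number", "answer": "INV-001"},
        {"id": "total_amount", "answer": "120.50"},
    ]
}
fields = answers_by_prompt(
    sample_response, ["invoice_number", "total_amount", "due_date"]
)
print(fields)
```

Fields that come back as `None` are natural candidates for human validation.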

Does it also have Action Center in the workflow for human validation, if required?

With the APIs, we can get the response and pass the data to Action Center through the Create Validation Task activities.

@nisargkadam23 simply provide the prompt as input in the request :slight_smile:

What is the metering & charging for these API calls?

@AI_GPT find details here: Document Understanding - Metering & Charging Logic

@Monica_Secelean these are great offerings by UiPath DU, and I have a question based on the features I have tried. The scenario I am particularly interested in is asynchronous generative extraction, and I would love to know if there are more examples of how this flavor of DU extraction is best handled through this new offering.

I am interested in the configurations because the asynchronous route is a complex one, and even with well-trained models the confidence levels of asynchronous data extraction are quite low.

Has anyone else tried this route and played around with the features? Insights are much appreciated, as we are in the process of building a POV for a use case that can use this method, but only if it’s viable.

@Lahiru.Fernando or @Syed_Pasha or @zell12, do you or any other community member have insights or pointers to share here? Much appreciated!

@Sandeep_Alexander_Goni I don’t have insights about the feature usage I can share, but do reach out if you face issues :slight_smile:

Thank you for the response @Monica_Secelean

@Monica_Secelean, I see the licensing model has been updated for the Modern and Classic experiences.

Can you outline the key differences between the Classic and Modern licensing models for DU?

Is migrating to the Modern model recommended for all users? If so, what are the benefits and any potential considerations?

If migration is recommended, could you please provide information on the process for users to transition from the Classic to the Modern model?

Since this update impacts user access to the DU, any additional documentation outlining these changes would be extremely helpful.