Overview
Note: As with all preview features in Studio Web, this functionality is only available for Community accounts and is not accessible for Enterprise. To try it out without a Community account, feel free to download the DocumentUnderstanding.Activities 2.3.1-preview package in Studio.
We understand that classifying and extracting information from diverse document types can be challenging and time-consuming, especially when dealing with custom or unstructured use cases. Therefore, we are excited to introduce generative capabilities in our Document Understanding Activities package. The Classify Document and Extract Document Data activities now let users enter prompts that are used to either classify the document or extract data from it, making the process more intuitive, efficient, and flexible!
This way, when working with either activity, users can select the Predefined project and then the "Generative Classifier" or "Generative Extractor" as the model to work with. Both models take key-value pairs as input, and this is where users provide their prompts.
The prompt is then sent to a large language model, together with data from the document, to classify the document or extract the required information, which can then be consumed in the workflow.
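To make the idea of "prompt plus document data" a bit more concrete, here is a minimal Python sketch of how field descriptions and document text could be combined into a single request for a language model. This is purely illustrative: the function name `build_extraction_prompt`, the prompt wording, and the sample document text are all assumptions made for this example and do not reflect how the activities build their requests internally.

```python
# Illustrative only: NOT the actual Document Understanding implementation.
# The function name and prompt layout below are hypothetical.

def build_extraction_prompt(fields: dict[str, str], document_text: str) -> str:
    """Combine field descriptions and document text into one prompt string."""
    lines = ["Extract the following fields from the document."]
    for name, description in fields.items():
        # Each key-value pair becomes one instruction line.
        lines.append(f"- {name}: {description}")
    lines.append("")
    lines.append("Document:")
    lines.append(document_text)
    return "\n".join(lines)


# Sample key-value pair taken from the examples in this post.
fields = {"email": "email address of the document"}
prompt = build_extraction_prompt(fields, "Jane Doe\njane.doe@example.com")
print(prompt)
```

In the activities themselves you only supply the key-value pairs; the rest happens behind the scenes.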
Note: At this time, the generative model is not retrainable and, as always, all data handling adheres to our standard terms of service.
Documentation
How to Get Started
Prerequisite: DocumentUnderstanding.Activities package, minimum version 2.3
Simply create your cross-platform workflow in your preferred Studio environment and, when using either Classify Document or Extract Document Data:
- Select the Predefined project
- Select the Generative Classifier or Generative Extractor
- Provide your prompt as key-value pairs, where:
  - the key is the Document Type (e.g. CV) or the Field Name (e.g. email)
  - the value is a description used to determine it (e.g. "CV containing candidate skills and experience" or "email address of the document")
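For anyone who likes to draft their prompts before entering them in the activity, the key-value pairs can be pictured as simple dictionaries. The Python sketch below is only a way of thinking about them: the variable names and the `find_blank_entries` helper are hypothetical and not part of the package, and in Studio the pairs are entered directly in the activity.

```python
# Illustrative only: these names are hypothetical and not part of the
# DocumentUnderstanding.Activities package.

# Generative Classifier: key = Document Type, value = description.
classifier_prompts = {
    "CV": "CV containing candidate skills and experience",
}

# Generative Extractor: key = Field Name, value = description.
extractor_prompts = {
    "email": "email address of the document",
}


def find_blank_entries(prompts: dict[str, str]) -> list[str]:
    """Return the keys of pairs whose key or description is blank."""
    return [
        key
        for key, value in prompts.items()
        if not key.strip() or not value.strip()
    ]


# A quick sanity check before copying the pairs into the activity.
print(find_blank_entries(classifier_prompts))
print(find_blank_entries(extractor_prompts))
```

A check like this is just a convenience for catching empty descriptions early, since a blank description gives the model nothing to work with.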
When you run your workflow with the Validation Station, you can see why the extractor selected a particular answer for a field.
Limitations
Table processing may not always lead to the best results - we're working on fixing this, so if you encounter issues, please give us a shout.
Charging
Charging will be based on AI Units - we don't have this finalized yet, but we will update you here once we have all the details in place.
Do reach out if you give the Generative Classification or Extraction capabilities a try and let us know how it's going! What are we missing? What would you like to see next? Looking forward to your thoughts!