Active Learning in Document Understanding Public Preview

We’re excited to announce the Public Preview of our new Active Learning experience.

What is Active Learning?

Active Learning is our modern approach to creating models for Document Understanding. It is a collaborative process between annotators and the model being trained, and it can reduce the time and data required to train a machine-learning model by up to 80%. AI guides the process, including automatic annotation, which is typically the most time-consuming task. The model also provides expert recommendations that improve accuracy by focusing on the most informative datasets.

Additionally, Active Learning offers analytical capabilities for monitoring automations. The new experience is organized into four components, which help guide you in building and consuming a high-performing model:

  • Build: Automatically classify and annotate your documents, build AI models, and achieve high accuracy with active learning recommendations.
  • Measure: Optimize your model performance with efficient training and guidance.
  • Publish: Publish the projects and use them in either RPA or API automations.
  • Monitor: Monitor the performance of your automation and easily drill down into fine-grained document-level details.
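
To make the idea of "most informative" documents concrete, here is a minimal Python sketch of uncertainty sampling, a common generic active learning strategy. It is an illustration only and is not a description of how Document Understanding ranks its recommendations. The function names, file names, and probabilities are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a predicted class distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def recommend_for_annotation(predictions, k=2):
    """Return the k document names whose predictions are most uncertain.

    `predictions` maps a document name to the model's class probabilities.
    Higher entropy means the model is less sure, so a human label for that
    document is likely to teach the model more.
    """
    ranked = sorted(
        predictions,
        key=lambda name: entropy(predictions[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical model output: probabilities over (invoice, receipt, contract).
predictions = {
    "doc_a.pdf": [0.98, 0.01, 0.01],
    "doc_b.pdf": [0.40, 0.35, 0.25],
    "doc_c.pdf": [0.70, 0.20, 0.10],
    "doc_d.pdf": [0.34, 0.33, 0.33],
}

print(recommend_for_annotation(predictions))
```

In this example the two documents with nearly uniform probabilities are ranked first, while the confidently classified document is left for last.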

Pre-requisites

  1. Automation Cloud account with the following products enabled:
  • Document Understanding
  • AI Center
  • Insights (required only if you want to test the Project Performance part of Monitor)
  2. To consume Active Learning projects in Studio Web:
  • Enable Public Preview feeds for Studio Web. Go to Automation Ops → Governance → Policies. If a Studio Web policy is present, turn on the Library feed. If not, create a new Studio Web policy and ensure preview packages are active. For the changes to take effect, log out and then log back into your Automation Cloud account.

To make sure the policy has been applied, follow the steps below:

How to Get Started

To join the Public Preview, navigate to Document Understanding in your Automation Cloud account and create a Modern Project.

To provide feedback, join our Insider Portal Public Preview and submit a ticket. This ensures we track and act on your feedback by pushing your tickets into our internal system.

Pre-requisites

  1. Automation Cloud account with the following products enabled:
  • Document Understanding
  • AI Center
  • Insights (required only if you want to test the Project Performance part of Monitor)
  2. A sample set of documents

User guide

Create a new project

Navigate to Document Understanding in your Automation Cloud account. If Document Understanding is not enabled, visit your Admin page and enable both Document Understanding and AI Center for your tenant. Upon accessing Document Understanding, select “Create Project” and choose the “Modern Preview” experience.

After creating your “Modern” project, you have two options:

  1. Upload your documents by dragging and dropping them for automatic classification. Document Understanding will attempt to organize your documents into respective types.
  2. Manually create your document type using the “Add Document Type” button, then proceed to upload your documents into the specified category.

Build your classification and extraction models

Once your document types have been created and all of your documents have been uploaded, you can:

  1. Build a classifier: Follow the Classification Recommendations to train a document classifier
  2. Build an extractor: Choose a document type to work on, click on Annotate, and follow the Recommendations to train an extraction model

If your uploaded documents belong to one of the 30+ out-of-the-box (OOB) document types, they are automatically pre-annotated when you upload them. All you need to do is validate the suggested annotations that are correct. You can do this for each field individually or for all fields at once using the “Confirm” button. If any of the suggested annotations are incorrect, you can correct them manually. If you want to use a custom document type, create it, upload a sample document, add all the required fields, and then upload the rest of the documents. Once the rest of the documents are uploaded, they should all be pre-annotated.

To build an extraction model, follow the Recommendations shown in the upper right-hand corner of the screen.

Measure the performance of your classification and extraction models

Navigate to the Measure module to:

  1. Evaluate your classification and extraction models
  2. See which Recommendations you should perform to improve your models
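
As a rough illustration of what evaluating a classifier involves, the sketch below computes precision, recall, and F1 per document type from a small labeled sample. It is a generic calculation under assumed inputs, not the scoring used by the Measure module. The function name and the sample labels are hypothetical.

```python
from collections import Counter


def per_type_scores(true_types, predicted_types):
    """Compute precision, recall, and F1 for each document type."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_types, predicted_types):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted type received a wrong document
            fn[truth] += 1  # true type missed one of its documents

    scores = {}
    for doc_type in sorted(set(true_types) | set(predicted_types)):
        p_den = tp[doc_type] + fp[doc_type]
        r_den = tp[doc_type] + fn[doc_type]
        precision = tp[doc_type] / p_den if p_den else 0.0
        recall = tp[doc_type] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[doc_type] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Hypothetical validation sample: annotated types vs. model predictions.
true_types = ["invoice", "invoice", "receipt", "receipt", "contract"]
predicted_types = ["invoice", "receipt", "receipt", "receipt", "contract"]

scores = per_type_scores(true_types, predicted_types)
for doc_type, s in scores.items():
    print(doc_type, {name: round(value, 2) for name, value in s.items()})
```

A document type with low recall is missing documents that belong to it, which usually suggests that more annotated examples of that type would help.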

Publish your project

Navigate to Publish and click the “Create project version” button to create a project version. Once a version has been created, click Deploy to deploy your project.

Consume your project

Currently, there are two ways to consume projects built using the Active Learning experience:

  1. Using the new cross-platform Activities in Studio Web or Studio Desktop
  2. By consuming our Document Understanding Cloud APIs

For Studio Web and Studio Desktop, click the “Open Studio” button in the bottom right-hand corner. This option is only available for Community Accounts - we are working towards adding this experience for Enterprise Accounts soon!

After clicking on the button, you will have a cross-platform workflow created using our latest set of Document Understanding Activities. You can easily run and customize the workflow as per your requirements. The workflow will refer to your newly created Document Understanding project, and it is equipped with classification capabilities (if available). It is all set to be executed for the sample files that were uploaded earlier.

For Document Understanding Cloud APIs, click the “Rest APIs” button, select Framework, and follow the public documentation on how to consume Document Understanding models using APIs.
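
The snippet below is not taken from the public documentation. It is a minimal Python sketch of the general shape of an authenticated REST call, assuming bearer-token authentication and a JSON request body. The base URL, path, token, and payload are hypothetical placeholders, so take the real endpoints, parameters, and authentication flow from the public documentation.

```python
import json
import urllib.request


def build_du_request(base_url, path, access_token, payload):
    """Build (but do not send) an authenticated JSON POST request."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumption: the API accepts an OAuth bearer token.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# All values below are placeholders, not real endpoints or identifiers.
request = build_du_request(
    base_url="https://example.invalid/du-api",
    path="/projects/PROJECT_ID/placeholder-operation",
    access_token="ACCESS_TOKEN",
    payload={"documentId": "DOCUMENT_ID"},
)

print(request.get_method(), request.full_url)
```

Once the placeholders are replaced with real values, the request could be sent with `urllib.request.urlopen(request)`. Building the request separately from sending it makes it easy to inspect the URL and headers first.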

Monitor the performance of your automations

  1. After you consume the published project via RPA workflows or APIs, data is generated for the Project Performance dashboard.
  2. After you consume the published project via RPA workflows or APIs, data is also generated for the Document Audit section.

Known limitations

@andras.palfi - It’s been very informative!

Very insightful, @andras.palfi. Can’t wait to start using this on my current project!

Hi @andras.palfi,
Is there any framework available to write UiPath code for this Modern Active Learning DU approach?

Hi @andras.palfi,
1. How can we train the dataset in Active Learning DU?
2. In the DU UiPath framework provided, we have Train Extractor and Train Classifier, which point to local files and dataset endpoints on AI Center or the local system and re-train the dataset.
3. Can you please guide us on how to re-train the dataset in modern (Active Learning) DU?
