Document Understanding Cloud APIs (previously referred to as "as a service") - GA

Overview

We’re happy to announce that you can now consume Document Understanding not only via robots using RPA, but also via APIs hosted in the cloud :partly_sunny: These APIs let you consume every skill available in Document Understanding, whether pre-trained or built for custom Document Types via labelling sessions, at runtime from a variety of programming languages.

To that end, we are launching Document Understanding Cloud APIs, which let you consume the framework the same way you would via RPA. They provide:

  • Discovery APIs - allowing consumers to access the available resources (projects, document types, classifiers, extractors) used by the Document Understanding Framework, as displayed below:

  • Digitization APIs - providing a digitization method, called as the first step, which responds with a documentId that the other operations reference, and a method for retrieving the corresponding digitization result, if required.

  • Classification APIs - allowing you to consume classification models that identify the Document Type of the input document (similar to how the Machine Learning Classifier enables classification via RPA)

  • Extraction APIs - allowing you to consume extraction models that retrieve the fields of the Document Type processed by the extractor (similar to how the Machine Learning Extractor provides this capability via RPA)

  • Validation APIs - allowing you to create Validation Tasks in Action Center, leveraging either the Classification Station or the Validation Station, depending on users’ needs.
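
To make the relationship between these API families concrete, here is a minimal sketch of one possible call order. Everything in it is an assumption for illustration: `FakeDuClient`, its method names, and the returned values are hypothetical stand-ins, not the real endpoints (check Swagger for those). The only thing taken from the description above is that digitization runs first and its documentId is passed to the later operations.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeDuClient:
    """Hypothetical in-memory stand-in; it only records the order of calls."""
    calls: List[str] = field(default_factory=list)

    def digitize(self, file_name: str) -> str:
        self.calls.append("digitize")
        return "doc-001"  # placeholder documentId

    def classify(self, document_id: str) -> str:
        self.calls.append("classify")
        return "invoice"  # placeholder Document Type

    def extract(self, document_id: str, document_type: str) -> Dict[str, str]:
        self.calls.append("extract")
        return {"total": "100.00"}  # placeholder extracted field

    def create_validation_task(self, document_id: str, fields: Dict[str, str]) -> str:
        self.calls.append("validate")
        return "task-001"  # placeholder Validation Task id


def process_document(client: FakeDuClient, file_name: str) -> Dict[str, object]:
    # Digitization comes first; its documentId is reused by every later step.
    document_id = client.digitize(file_name)
    document_type = client.classify(document_id)
    fields = client.extract(document_id, document_type)
    task_id = client.create_validation_task(document_id, fields)
    return {
        "documentId": document_id,
        "documentType": document_type,
        "fields": fields,
        "taskId": task_id,
    }
```

Swapping `FakeDuClient` for a thin wrapper around real HTTP calls would keep `process_document` unchanged, which is the main reason for routing every step through a client object.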

Classification and Extraction APIs are available for both synchronous consumption (for documents of up to 5 pages) and asynchronous consumption (posting the request via a start method and retrieving the result via polling). This supports different use cases, whether you are optimizing for performance or processing large documents.
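
A rough sketch of how a client might choose between the two modes and poll for an asynchronous result. The 5-page limit comes from the paragraph above; the function names, the `get_result` callable (assumed to return `None` until the result is ready), and the interval and attempt defaults are assumptions made for illustration only.

```python
import time
from typing import Callable, Optional

SYNC_PAGE_LIMIT = 5  # synchronous calls cover documents of up to 5 pages


def choose_mode(page_count: int) -> str:
    """Return "sync" for small documents and "async" for larger ones."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    return "sync" if page_count <= SYNC_PAGE_LIMIT else "async"


def poll_until_done(
    get_result: Callable[[], Optional[dict]],
    interval_s: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call get_result until it returns something other than None."""
    for attempt in range(max_attempts):
        result = get_result()
        if result is not None:
            return result
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"no result after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays; in production code a backoff strategy may be preferable to a fixed interval.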

The service is discoverable via a Swagger interface which can be accessed from Document Understanding in Automation Cloud.

Official Documentation

Demo⭐️

How to Get Started

To consume the APIs, we recommend starting from the Swagger specification and trying them out before implementing anything. In the future, we also plan to offer SDKs for various programming languages; nevertheless, Swagger should provide all the required information. Trying out the APIs is as easy as 1-2-3 :relieved:

1. Access Swagger

Within your Automation Cloud account, open the Document Understanding center, click the REST APIs button in the top right, and select Framework to open the Swagger interface.

2a. Generate App ID and App Secret

(steps valid for the current UI)

Before consuming the APIs, you need to create an External Application in your Automation Cloud account:

  1. Within your Automation Cloud account, access Admin in the left navigation
  2. Select External Applications
  3. Click Add Application.
  4. Application name: choose any name you like (e.g. “du”)
  5. Click Add Scopes and you’ll see an “Edit Resource” menu expand from the right
    1. Select Document Understanding from the Resource drop-down
    2. Click the “Application Scope(s)” tab and select all checkboxes
    3. Click the Save button
  6. Leave the Redirect URL blank
  7. Click the Add button
  8. A pop-up appears; copy the App ID and App Secret
  9. You will use these two values to authenticate in Swagger.

2b. Authorize your Document Understanding app

  1. Return to the Swagger page you opened in step 1
  2. Click the Authorize button
  3. In the pop-up, provide your App ID and App Secret and click the Authorize button
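
Outside Swagger, the same App ID and App Secret would typically be exchanged for an access token in code. The sketch below only builds the request and does not send it. It assumes the external application uses the standard OAuth 2.0 client-credentials grant; the token URL and the scope string are assumptions and should be checked against your tenant and the scopes you selected in step 2a.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: identity endpoint for Automation Cloud; verify for your tenant.
TOKEN_URL = "https://cloud.uipath.com/identity_/connect/token"


def build_token_request(
    app_id: str,
    app_secret: str,
    scope: str,
    token_url: str = TOKEN_URL,
) -> Request:
    """Build (but do not send) a client-credentials token request."""
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": app_id,
            "client_secret": app_secret,
            "scope": scope,  # space-separated scopes chosen in step 2a
        }
    ).encode("ascii")
    return Request(
        token_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` should return a JSON body; in a standard OAuth 2.0 response the token is in the `access_token` field and is sent on later calls as an `Authorization: Bearer` header.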

3. Consume

Once authorized, you are ready to consume! :rocket:

We propose the following flow; however, you are free to implement your own.

Limitations :warning:

  1. Single Document Type per file - support for multiple Document Types and splitting capabilities will be added.
  2. Business Rules - currently, you cannot define Business Rules on a Document Type defined in Document Understanding center; we are working on this.
  3. When discovering resources, some information you see in Document Understanding in Automation Cloud may not be available yet - we are working on adding it to reach parity between the two.
  4. Training - as of now, we do not automatically submit the data from the Classification or Validation Station for training; this is in our backlog and we plan to work on it soon.
  5. Document Data availability - we retain the Document Data for 7 days after the digitization request is submitted. Afterwards, the data corresponding to the documentId is removed and can no longer be used in further operations (a new digitization request is required).
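
Because of the 7-day retention in limitation 5, a client that stores documentIds may want to track when each one stops being usable. This is a minimal sketch under the assumption that the window starts at the moment you record the digitization request; the function names are illustrative and the exact cut-off on the service side may differ.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)  # retention period stated in the limitations


def expires_at(digitized_at: datetime) -> datetime:
    """Moment after which the documentId should be treated as gone."""
    return digitized_at + RETENTION


def is_still_usable(digitized_at: datetime, now: datetime) -> bool:
    """True while the document data is expected to be retained."""
    return now < expires_at(digitized_at)


submitted = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(expires_at(submitted).isoformat())
```

When `is_still_usable` returns False, the document needs to be digitized again to obtain a fresh documentId.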

Charging

Charging is based on AI Units, as described here, and reflects the consumption of the respective extraction and classification models (e.g. extracting information from a 3-page document consumes 3 AI Units), subject to the restrictions of the respective license.
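
For budgeting, the extraction example above can be turned into a back-of-the-envelope estimate. The one-unit-per-page rate is inferred from that single example (3 pages, 3 AI Units); the function name and the `units_per_page` parameter are assumptions, and rates for other models or licenses should be taken from the official pricing description.

```python
from typing import Iterable


def estimate_extraction_ai_units(page_counts: Iterable[int], units_per_page: int = 1) -> int:
    """Estimate AI Units for extracting a batch of documents.

    Assumes a flat per-page rate, inferred from the 3-page example only.
    """
    total = 0
    for pages in page_counts:
        if pages < 0:
            raise ValueError("page counts cannot be negative")
        total += pages * units_per_page
    return total


# The example from the post: one 3-page document.
print(estimate_extraction_ai_units([3]))  # prints 3
```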

Do reach out if you give our APIs a try and let us know how it’s going! What are we missing? What would you like to see further? Looking forward to your thoughts! :dancer:t2:

It’s great news.

I think a cool feature would be the ability to allocate DU AI Units per project by generating another DU key. This would help a lot with managing consumption and governance in DU projects.

Very good point @rikulsilva, I will add your feedback to our backlog! :slight_smile:

When we follow all these steps, everything works until we present the extracted results in the validation task in Action Center: the items in the table for a custom invoice are not showing, even when we use the predefined project. Has anyone seen this before?

@mmuleya can you maybe share with us some docs to reproduce the issue you’re facing?

So what I did was this: at first I used AI Center to do the training, and when I use the normal framework in UiPath Studio it works and I can see the table in the Validation Station. But when I trigger the Validation Station using the API and the user assigns the task to themselves, they are not able to see the table. They have to re-label the table all over again, and the keystrokes for marking a line (for example, making a selection and pressing Enter to indicate a row) are not working. Kindly assist if you have an idea.

@mmuleya I see, this was a bug on our side - we’re fixing it and plan to release the fix next week. Thanks for reporting!