Extraction Automation Builder with predefined Document Types

Overview

Setting up a Document Understanding workflow can feel overwhelming and complex. Studio Web makes this easier, not only because of its user-friendly design, but also because its activities are simpler and handle more of the work in the background for you.

Still, some setup is required to process documents: Which extractor should be used? Is validation required? How do I see the extracted results? To facilitate this and provide a smooth onboarding experience that gets you up and running quickly, we provide the Extraction Automation Builder in Document Understanding.


The Extraction Automation Builder allows you to create a Studio Web workflow based on a single document, which is classified, uploaded to Orchestrator, and then has its data extracted with a pre-configured extractor.

How to Get Started

  1. Go to “Document Understanding” and choose Get started within the “Extraction Automation Builder”.
  2. Upload the document you want to work with (make sure it is one of the supported file types described in the UI). If the Document Type has not been identified correctly, you can change it in the dropdown below; otherwise, leave it as is. Similarly, you can use the optional configuration to provide more details about the workflow.
  3. Click on Create workflow. This creates a Studio Web workflow using Document Understanding activities configured for processing the uploaded sample document.
  4. You will be redirected to Studio Web, where you can either run or publish the workflow as is, or configure it further based on your use case. The possibilities are endless! :slightly_smiling_face:

Limitations :warning:

  1. Only predefined Document Types are supported - support for custom Document Types & Extractors is to be added.
  2. No digitization settings are available - this is something we plan on adding.
  3. Single Document Type per file - if you need multiple Document Types & splitting capabilities, let us know :slightly_smiling_face:

Hi @Monica_Secelean, how can I use this to iterate through multiple invoices? When I use a For Each, it seems that validate document data does not work.

Hey @Vishal_Kalra

We are looking at removing this restriction for the For Each loop in the upcoming version of the System activity package 23.10.

But even then, I would invite you to check out some content made by our community around processing files with Document Understanding. Looping over the items is simple, but it is not the most optimal way to process a lot of files :slight_smile:

Please see here for more context: