Document Understanding: Document Splitting and Other Wonderful Stories :)

Ioana_Gligan · June 22, 2020, 12:51pm

UiPath’s Document Understanding now has support for file splitting, custom ML models, better digitization and more!

The Intelligent OCR package (4.7.0-preview version) is out, and is ready to help you in even more complex use cases.

The heroes of this new version are a few new activities that let you work with files that contain multiple documents.

Intelligent Keyword Classifier

The Intelligent Keyword Classifier and its companion, the Intelligent Keyword Classifier Trainer, are here to help you classify and split documents: if you need to process a file that contains multiple documents, give the Intelligent Keyword Classifier a try!

How it works:

All you need to do is add it in a Classify Document Scope activity. Like this:

Don’t forget to Configure your Classifier for the doc types you are targeting for classification, like this:

It has a companion, the Intelligent Keyword Classifier Trainer (that goes into the Train Classifiers Scope) - this activity is used to help the Intelligent Keyword Classifier get better with each file you process!

But Don’t Panic, as the Hitchhiker’s Guide to the Galaxy recommends. You don’t actually need to run “the real deal” to get it trained. You can do this at design time as well, using the Manage Learning wizard. Like this:

That is, click on Start Training (or the Edit icon for a doc type that already has training), select a few files that each contain a single sample of that document type (e.g., 3 files each containing one document of type X, not one file containing 3 samples of type X), and let it do its job. You will notice that the word vectors start appearing.
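To make the idea of per-type keyword vectors a bit more concrete, here is a toy sketch in Python. It is not the algorithm the Intelligent Keyword Classifier uses (the post does not describe it); the functions learn_keywords and classify and the weighting scheme are assumptions made purely for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def learn_keywords(samples, top_n=5):
    """Build a weighted keyword vector from single-document samples of one type.

    The weight of a word is the fraction of samples it appears in (assumed scheme).
    """
    counts = Counter()
    for text in samples:
        counts.update(set(tokenize(text)))  # count each word once per sample
    return {word: n / len(samples) for word, n in counts.most_common(top_n)}


def classify(text, vectors):
    """Score the text against every document type and return the best match."""
    words = set(tokenize(text))
    scores = {
        doc_type: sum(weight for keyword, weight in vector.items() if keyword in words)
        for doc_type, vector in vectors.items()
    }
    return max(scores, key=scores.get), scores


vectors = {
    "Invoice": learn_keywords(["invoice number total due", "invoice total amount"]),
    "Receipt": learn_keywords(["receipt cash paid", "receipt paid change"]),
}
best_type, scores = classify("Invoice total: 40 EUR", vectors)
```

This also hints at why single-document samples matter: a training file that mixes several document types would blur the keywords of each type together.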

This would not be too useful by itself, so we’re also publishing the …

Present Classification Station

With it, the Document Understanding Framework gets another (that’s cool, yeah) feature. It is an attended activity that allows humans to review and correct automatic classification, split files into multiple document types, all in an awesome and very simple user interface. Like this:

How it Works:

you can view the document and scroll through it on the right side
you can view and edit page range splits and associated classes on the left side
you can move pages to adjacent classes by dragging and dropping them
you can split a range of pages by clicking on the split option between any two pages
you can merge two ranges by using the “Move all Pages Up/Down” options (under the three dots), like this:
you can scroll to any page by clicking on it
you can scroll to any document type by clicking on it.
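The split and merge operations in the list above can be pictured as simple edits on a list of page ranges. The sketch below is only a mental model in Python: PageRange, split_after and merge_up are hypothetical names, not part of any UiPath package, and the merged range is assumed to keep the class of the upper range.

```python
from dataclasses import dataclass


@dataclass
class PageRange:
    doc_type: str
    first_page: int  # 0-based, inclusive
    last_page: int   # inclusive


def split_after(ranges, page):
    """Split the range containing `page` so that a new range starts at page + 1."""
    result = []
    for r in ranges:
        if r.first_page <= page < r.last_page:
            result.append(PageRange(r.doc_type, r.first_page, page))
            result.append(PageRange(r.doc_type, page + 1, r.last_page))
        else:
            result.append(r)
    return result


def merge_up(ranges, index):
    """Move all pages of ranges[index] into the range directly above it."""
    if index <= 0 or index >= len(ranges):
        raise IndexError("no range above to merge into")
    upper, lower = ranges[index - 1], ranges[index]
    merged = PageRange(upper.doc_type, upper.first_page, lower.last_page)
    return ranges[:index - 1] + [merged] + ranges[index + 1:]


original = [PageRange("Invoice", 0, 4)]
after_split = split_after(original, 1)   # pages 0-1 and pages 2-4
after_merge = merge_up(after_split, 1)   # back to a single range 0-4
```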

Mind you, this is an OPTIONAL step: the output of the Classify Document Scope and the output of the Classification Station are of the same type. If you do want 100% accuracy though, we recommend you use it.

One associated change that you might want to implement in your processes is that related to handling ALL the doc types found within a file. For this, after performing a classification with splitting (using the Intelligent Keyword Classifier), after a human confirms the classifications, you can move forward with much more confidence into the Data Extraction phase… in a FOR EACH loop (now you don’t only have one class, you potentially have multiple classes, right?)

You don’t have to worry about anything, as all Extractors will only get the page range they should be performing extraction on.
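As a rough picture of that FOR EACH loop, here is a minimal Python sketch. In Studio the loop runs over the classification results and calls a Data Extraction Scope; the tuple layout, the extract_all function and the extractor callables below are stand-ins invented for this illustration.

```python
def extract_all(pages, classification_results, extractors):
    """Run the matching extractor on each classified range, passing only its pages.

    classification_results: list of (doc_type, first_page, last_page), 0-based inclusive.
    extractors: dict mapping a document type to a callable that takes a list of pages.
    """
    extracted = []
    for doc_type, first_page, last_page in classification_results:
        extractor = extractors.get(doc_type)
        if extractor is None:
            continue  # no extractor configured for this document type
        page_slice = pages[first_page:last_page + 1]
        extracted.append((doc_type, extractor(page_slice)))
    return extracted


pages = ["invoice page 1", "invoice page 2", "receipt page 1"]
results = [("Invoice", 0, 1), ("Receipt", 2, 2)]
extractors = {
    "Invoice": lambda p: {"page_count": len(p)},
    "Receipt": lambda p: {"first_page_text": p[0]},
}
output = extract_all(pages, results, extractors)
```

Each extractor only ever sees the slice of pages that belongs to its own document, which mirrors the behavior described above.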

Machine Learning Extractor and AI Fabric - a story

The Machine Learning Extractor (from the UiPath.DocumentUnderstanding.ML.Activities pack) got a new configuration option, if you want to use it with an AI Fabric ML Skill. Like this:

The ML Skill dropdown will be populated with your Document Understanding Skills, if your robot is connected to a Cloud Orchestrator that has AI Fabric enabled (and, of course, has Document Understanding ML Skills).

AI Fabric is now in GA, and you can use it as infrastructure for managing your document understanding models. Available in our Cloud platform for enterprise accounts, AI Fabric can configure, train, host and serve Machine Learning models for Document Understanding. You can choose to start from our pre-trained Invoices or Receipts model, or with a blank-slate DU model that you can train (on data tagged with DataManager) on any fields of interest to your use case.

OCR Enhancements

Starting with this release, you can use the Microsoft Azure Computer Vision OCR and the Google Cloud Vision OCR as engines for design-time training and template setup in Document Understanding. Like this:

Google Cloud Vision OCR now has another Input Argument, called DetectionMode. This is by default set to “TextDetection” (the current implementation), but you might want to try it on the “DocumentTextDetection” mode as well. In some use cases, and for specific languages, one or the other of these two options might perform better. The setting is… Like this:
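As background outside of Studio: the two mode names line up with the TEXT_DETECTION and DOCUMENT_TEXT_DETECTION feature types of the Google Cloud Vision REST API. How the activity maps DetectionMode onto them internally is not stated in this post, so the mapping below is an assumption, and build_annotate_request is a hypothetical helper. The sketch only builds the JSON body for the images:annotate endpoint and does not call the service.

```python
import base64

# Assumed mapping from the activity's DetectionMode values to Vision API feature types.
FEATURE_TYPES = {
    "TextDetection": "TEXT_DETECTION",
    "DocumentTextDetection": "DOCUMENT_TEXT_DETECTION",
}


def build_annotate_request(image_bytes, detection_mode="TextDetection"):
    """Build the JSON body for POST https://vision.googleapis.com/v1/images:annotate."""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": FEATURE_TYPES[detection_mode]}],
            }
        ]
    }


body = build_annotate_request(b"abc", "DocumentTextDetection")
```

Sending the same image with both feature types and comparing the recognized text is one way to check which mode suits a given language or layout.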
In case you missed it, we are working on our own Document OCR engine, which comes with a companion activity in the UiPath.OCR.Activities package, in community preview.

Like this?

(see what I did there?)
Please don’t forget to send us your feedback so we can improve these preview features and make them shine in your workflows!

The Document Understanding Team

atoi · June 22, 2020, 7:57pm

So far, your work helped us so much. Even more fascinating features… thank you!

alexologica · June 23, 2020, 3:26am

Hi @Ioana_Gligan,

I couldn’t find Google Cloud Vision OCR Detection Mode in UiPath.OCR.Activities v2.1.0 & UiPath.IntelligentOCR 4.7.0

Can you check this? Thanks

Ioana_Gligan · June 23, 2020, 7:22am

Hello @alexologica,

Please upgrade the UiAutomation package to the latest preview version.

ab83665 · June 23, 2020, 5:34pm

Hi @Ioana_Gligan,

as always great features and great update.

I wanted to ask whether there is a list of languages handled by the pre-trained Invoices models that are available for us to use. I am working on my master's thesis on invoice classification and extraction in Polish, and I wondered whether the models now also work with the Polish language.

Thank you for your help,
Andrzej

oscar · June 24, 2020, 2:29am

Wow! @Ioana_Gligan and team! Great update. I am looking forward to trying this out. I love the addition of being able to use custom OCR activities.

Do we need to have a Document Understanding API key in order to use the Intelligent Keyword Classifier? Or can we keep that on our local PC without an API key?

Ioana_Gligan · June 24, 2020, 5:37am

Hello @oscar,

You need to use the cloud DU key, for tracking purposes only. No documents leave your premises, and no data about your processes.

Lahiru.Fernando · June 25, 2020, 1:35am

Wow… awesome features!!!

I already tried out some of these latest features and they look amazing… Exploring more and for sure will share the feedback… I always wanted to see more features in the DU package… and this is exactly what I wanted to see… Awesome work guys…

You guys are like magicians

Ioana_Gligan · June 25, 2020, 12:30pm

Hello @ab83665 (Andrzej),

and Welcome to the Forum!

Polish is not on the list of supported languages AFAIK. It would be great if you would actually try it out and see if it gives any results…

A custom trained model would probably work best in your case…

Ioana

ab83665 · June 25, 2020, 5:32pm

Thank you for answering, @Ioana_Gligan

I do agree that a custom trained model might be my best bet, as unfortunately invoices are not structured enough for Regex extraction. However, is it possible to use our own models in UiPath Studio?

Still, I would like to try testing the pre-trained models first. Is there any documentation available for them, to see which fields can be extracted or perhaps even how the model was built?

Thank you for your answer,
Andrzej

Mariele_Oosting · June 26, 2020, 8:07pm

Hello, I’m a total beginner and want to test whether I’m able to use OCR for extracting data from orders. Now I wonder which OCR “program” I should use for this purpose. I’ve seen a video where apparently they used Google OCR, but I cannot find Google OCR in Studio or in the “packages” for free download. I therefore installed UiPath OCR (Document and Screen) and was then told I need an ApiKey. I copied it but got the message “compiler fault”, and in German: “Ausdrucksende erwartet” (“End of expression expected”). No idea what that means. So I conclude that I cannot use this program. What should I do now to find an easily accessible, easy-to-use, free OCR program to test my abilities?

sahilwadhwa100 · June 28, 2020, 6:15pm

You can use Microsoft OCR or Tesseract OCR.

Use the UI Automation package, version 20.4.2.

Mariele_Oosting · June 28, 2020, 8:31pm

Thank you! I found it.

Ioana_Gligan · June 29, 2020, 2:51pm

Hello Friends,

If you want to see this in action, let’s meet online tomorrow!

(apologies for the last-minute announcement)

Ioana

Ioana_Gligan · June 29, 2020, 2:54pm

@ab83665,

All you need to do is just use the public pre-trained model with the endpoint https://invoices.uipath.com - with a Document Understanding ApiKey from the Cloud platform. And you can use the ML Extractor - it is open for community as long as you process at most 2 pages per document and at most 50 documents per hour.

Hope this helps,

Ioana
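The community limits mentioned in this reply (at most 2 pages per document and at most 50 documents per hour) could be checked on the client side before a document is sent to the endpoint. Below is a minimal Python sketch with a hypothetical CommunityQuota class. The thread does not say how the service counts the hour, so the sliding one-hour window used here is an assumption.

```python
from collections import deque


class CommunityQuota:
    """Client-side guard for the limits quoted above (sliding window is assumed)."""

    MAX_PAGES = 2
    MAX_DOCS_PER_HOUR = 50

    def __init__(self):
        self._sent = deque()  # timestamps, in seconds, of accepted documents

    def allow(self, page_count, now):
        """Return True and record the document if it fits within both limits."""
        # Forget documents that were sent an hour or more ago.
        while self._sent and now - self._sent[0] >= 3600:
            self._sent.popleft()
        if page_count > self.MAX_PAGES or len(self._sent) >= self.MAX_DOCS_PER_HOUR:
            return False
        self._sent.append(now)
        return True


quota = CommunityQuota()
too_long = quota.allow(page_count=3, now=0)            # rejected: more than 2 pages
first_fifty = [quota.allow(2, now=i) for i in range(50)]
fifty_first = quota.allow(1, now=100)                  # rejected: hourly limit reached
after_an_hour = quota.allow(1, now=3600)               # accepted: oldest entry expired
```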

shetanshudhar · July 7, 2020, 7:06am

Hi @Ioana_Gligan! Why is there no option of UiPath Screen OCR in form extractors? UiPath screen OCR works really well on images, native or even scanned…
Do we have only receipts and invoices in the ML Extractor for Document Understanding? How do we handle other documents using the ML Extractor?

davendra · July 8, 2020, 12:45pm

Hi Ioana,

Is there a cost associated with using the Intelligent Keyword Classifier?

Thanks
Davendra

RISHI4897 · July 10, 2020, 12:28am

Hello Everyone,

I am new to document understanding and trying to understand the framework. I developed one XAML file to extract invoice# from amazon invoice using regex-based extractor but for some reason, it is not extracting even after trying multiple times.

Could anyone of you look at the attached XAML file and suggest any solution. The file should already have a sample invoice.

Thanks,
Rishi

Document_Understanding.7z (82.4 KB)

tudor.serban · July 10, 2020, 11:29am

Hi @shetanshudhar: You can use UiPath Document OCR with Form Extractor. UiPath Screen OCR is meant to be used for Screen Scraping tasks.

Ioana_Gligan · July 10, 2020, 11:50am

Hello @davendra,

No cost associated with it as of now. ApiKey checks are being performed though, and some limitations might be enforced for community keys only. The definitive structure in which it will be officially published (now it is in preview) will be finalized in a couple of months.

ioana
