Document Understanding 2021.4 - or Why We've Been so Quiet

Dear Community,

Here we are, another few months have passed… and we haven’t written you any news in such a long time… But I believe you’ll excuse us for not making small talk (:corn:, :camping:, :bike:), while waiting to be able to make some pretty big announcements (:popcorn:, :cityscape:, :rocket:)! (however nice small talk may be :slight_smile: )

So hoping this letter finds you well, let’s pick up from where we left off in February, when we announced some really nice little features:

  • :earth_asia: New Geo-Localization of the Document Understanding Services
  • :1234: Data Manager – Relaxed Limits for Field Definitions
  • :white_check_mark: UiPath Document OCR – Checkbox Detection Capabilities
  • :ok_hand: :page_facing_up: Document Validation – Usability Enhancements, with
    • :abcd: Text Version Improvements, and
    • :eye: Extracted Value Display Improvements.

We strongly recommend you try them out - they will definitely bring a breath of fresh air :wind_face: to your DU processes.

And then, keep going for the new stuff :smiley:

:anchor: :compass: Field Level Anchors in Form Extraction

Until now, the Form Extractor and Intelligent Form Extractor activities could be used for processing fixed format documents with no variation: the RPA developer would define templates based on page layouts, :bookmark_tabs:, and data would always be identified :mag_right: based on its relative position against the page template definition.

But we never stop building :building_construction: stuff, so we’re now ready to show you some pretty cool new features related to form extraction:

:fireworks: The latest version of the Document Understanding package UiPath.IntelligentOCR.Activities now allows for more flexibility when defining your form extraction rules, by supporting Field Level Anchor Settings.

What this means is that you can now open your Template Editor (for an existing template or by creating a new one), and you’ll have some new options available:

To create a new Anchor Setting:

  1. :anchor: Make sure you are on the Anchor Selection mode (above the document view)
  2. :blue_square: Draw a box around the value area.
    • A Blue box will appear as a Target for your anchor setting.
    • Drawing a box means Click → Drag → Release.
      • in the example, it contains the area under the Employer Identification Number label.
  3. :green_square: Select a Label (main anchor) for your value area.
    • A Dark Green box will appear around the Words (Tokens) in your selected Label.
    • Selecting a Label means either Click on first word → Ctrl+Click on last word of selection, OR Click → Drag → Release to capture a word range.
    • :exclamation: A Label can only contain words from the same visual line.
      • in the example, it contains the “Employer identification number” selection
  4. :green_circle: Select any additional anchors that would uniquely identify your Label.
    • A Lighter Green box will appear around the Words of your additional anchors.
    • Selection is done just like for the Label
      • in the example, there are two additional anchors: the “Taxpayer Identification Number (TIN)” selection, and the “Certification” selection.
  5. :arrow_forward: Assign your anchor construct to the appropriate field (just like you’d assign a regular value).
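For a rough mental model of what such an anchor setting captures, here is a minimal Python sketch. It is not the package's internal representation: the `Box` type, the coordinate units, and both helper functions are hypothetical names made up for illustration. It only shows the idea, stated in the tips further down, that the value area is located relative to the Label.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (hypothetical units)."""
    x: float
    y: float
    width: float
    height: float


def value_offset(label: Box, value: Box) -> Tuple[float, float]:
    """Offset of the value area's top-left corner from the Label's top-left corner."""
    return (value.x - label.x, value.y - label.y)


def locate_value(found_label: Box, offset: Tuple[float, float],
                 size: Tuple[float, float]) -> Box:
    """Re-create the value area next to wherever the Label was found at run-time."""
    dx, dy = offset
    width, height = size
    return Box(found_label.x + dx, found_label.y + dy, width, height)


# Design time: the Label and the value area as drawn in the template (made-up numbers).
template_label = Box(40, 300, 180, 12)
template_value = Box(40, 315, 200, 20)
offset = value_offset(template_label, template_value)

# Run-time: the same Label is found elsewhere on the page, so the value area moves with it.
runtime_label = Box(60, 420, 180, 12)
runtime_value = locate_value(
    runtime_label, offset, (template_value.width, template_value.height)
)
print(runtime_value)
```

In this sketch the value area simply follows the Label, which is why a Label close to the value tends to be a safer choice.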

To edit an Existing Anchor Setting:

  1. Highlight your anchor
  2. Make changes to it (delete any anchors, the label, even the value area if you wish, and then make any changes like above)
  3. Use the Replace Value option to update your field association.

You can mix and match behavior for the same document type :cocktail: :wine_glass: :tropical_drink:

You can define as many templates as you want for the same document type. You can have multiple page level templates, multiple anchors for the same field, even a template containing both page level as well as field level anchors!

The :mage: magic part is that at run-time, the Form Extractor and Intelligent Form Extractor know how to:

  • identify if a page level template matches, and extract information according to the best page level template match it identifies;
  • identify if any anchor based settings match, and extract information according to their application in the document to be processed;
  • compute appropriate confidence scores for all possible matches, so as to be able to report the best result (highest probability match) of all options available!

By the way, if for a single-value field you do get more than one potential result - you’ll get the highest confidence one reported and shown as the value, but in Validation Station you’ll also see the other matches as suggestions, to help you in choosing a different one if the best match is not your desired output.
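The "best match plus suggestions" behavior can be pictured with a small Python sketch. This is an assumption-laden illustration, not the extractors' actual logic: the `Candidate` type, the `source` labels, and the confidence numbers are all invented here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    value: str
    confidence: float  # assumed to be in the range 0.0 .. 1.0
    source: str        # hypothetical tag, e.g. "page-template" or "field-anchor"


def pick_result(candidates: List[Candidate]) -> Tuple[Optional[Candidate], List[Candidate]]:
    """Return (reported value, remaining suggestions), ordered by confidence."""
    if not candidates:
        return None, []
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    return ranked[0], ranked[1:]


# Made-up matches for one single-value field.
matches = [
    Candidate("12-3456789", 0.62, "page-template"),
    Candidate("12-3456780", 0.91, "field-anchor"),
    Candidate("98-7654321", 0.40, "field-anchor"),
]
best, suggestions = pick_result(matches)
print(best.value)                      # the value that would be reported
print([s.value for s in suggestions])  # what a validator could pick instead
```

The runner-up matches are kept rather than discarded, which mirrors how the other matches show up as suggestions in Validation Station.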

Basic Tips and Tricks :drum:

  • when defining field level anchors
    • make sure your Label is close to your value area,
    • make sure your Label is supported by additional anchors if the same text construct can be found in multiple places within the same document
  • remember that there is a fuzzy matching algorithm in place to adjust to potential OCR errors. This means “here” and “hare” are pretty close (Levenshtein-wise, right?). So
    • the longer your labels and anchors, the more precision you’ll get
  • the value area is always computed based on its relative position against your Label (main anchor)
    • this might influence the way you choose your main anchors.
  • using field level anchors will give you more flexibility in document layout changes
    • having field level anchors will allow fields to move within the template (but not with respect to the anchor selections), and still be captured…
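To see why longer labels and anchors help under fuzzy matching, here is a small Python sketch of plain Levenshtein distance with a length-normalized similarity. The exact algorithm and thresholds used by the extractors are not described in this post, so treat `levenshtein` and `similarity` as illustrative helpers only.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions and substitutions cost 1 each."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """1.0 means identical; the edit distance is normalized by the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# Two different short words are only one edit apart.
print(levenshtein("here", "hare"), similarity("here", "hare"))

# One OCR slip in a long label barely changes the similarity.
clean = "Employer identification number"
noisy = "Employer identificatiom number"
print(levenshtein(clean, noisy), round(similarity(clean, noisy), 3))
```

With a short label, a single edit is a large share of the text, so unrelated words can look alike. With a long label, the same single edit is a small share, so an OCR error still matches while genuinely different text does not.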

Feedback is a Product’s Best Friend, just like :ring:'s are a :girl:'s …

We are so thankful for all the feedback we got so far! Do keep it coming and we’ll do our best to build the product you need!

:brain: :white_check_mark: Machine Learning Extractor Now Knows Checkboxes

Yes, we gave our Document Understanding ML Models a special pair of glasses :dark_sunglasses: - the cool :ice_cube: kind - that allow them to learn from / and interpret / checkboxes, as found by our UiPath Document OCR engine in the digitization phase. Yes, you guessed right, we are building on top of the UiPath Document OCR – Checkbox Detection Capabilities we have released a bit earlier - and not only!

However, there is an important difference (:guitar: vs :violin:) between the way checkboxes are handled in the Form/Intelligent Form Extractor, and the recommended way to handle them in ML Extractor. In Data Manager, when labelling documents, you would not train the model to recognize and extract the checkboxes themselves, but rather to recognize and extract the words/option names which are next to the checked checkboxes.

Here’s how labeling such a field would look like:

As you can see in the W-9 form example, you won’t be capturing the checkbox next to the “Individual/Sole Proprietor” option itself, but you would capture that THAT particular option is checked, and not some other (like “C Corporation” or “Partnership”).

:bangbang: The point of the checkbox is to draw attention to a word. Consequently, to train an ML model, you would label the word, not the checkbox itself. In many cases checkboxes may just be an X :x: character or a Check :heavy_check_mark: character rather than an actual Checkbox :ballot_box_with_check: character; sometimes a :writing_hand: handwritten mark which is not even OCR-able could be used to mark a human selection on a form. The ML model can learn to associate all of these situations with the word next to the mark and thus report the correct value.
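To make the labeling target concrete, here is a tiny rule-based Python sketch. It is not how the ML model works (the model learns this association from labeled documents); the `Option` type, the `CHECKED_MARKS` set, and the sample marks are assumptions made up for illustration. The point it shows is that the extracted value is the option text, never the mark.

```python
from dataclasses import dataclass
from typing import List

# Marks treated as "checked" in this sketch (an assumption, not an official list).
CHECKED_MARKS = {"x", "X", "✓", "✔", "☑", "☒"}


@dataclass
class Option:
    mark: str  # what was read in or near the box; "" or "☐" when empty
    text: str  # the option name printed next to the box


def selected_options(options: List[Option]) -> List[str]:
    """Return the option names whose mark counts as checked."""
    return [o.text for o in options if o.mark.strip() in CHECKED_MARKS]


# Made-up reading of the tax classification row of a W-9 form.
row = [
    Option("☑", "Individual/Sole Proprietor"),
    Option("☐", "C Corporation"),
    Option("", "Partnership"),
]
print(selected_options(row))
```

The value to label in Data Manager corresponds to the text returned here, whatever kind of mark was used to select it.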

Check :white_check_mark: it out and let us know how this feature works for you. We :heart: Feedback.

:books: :bookmark: New Docs Space for Document Understanding

Yaaay! All information in a single docs space - a dedicated place for all Document Understanding related knowledge! :tada:

Make sure you check out the new UiPath Document Understanding Guide, as DA place to go for DU-related information.

Feedback time!

Once you check out the new features, please take some time to leave us a note on how we did… Feedback is driving us forward and helping us make sure we are building the right thing for you!

Yours Truly, :heart:

The Document Understanding Team

Detecting checkboxes is a great initiative. :heart: this :slight_smile:

@Ioana_Gligan it’s great we are improving it.
Unfortunately, I think we do not have a clear understanding of the existing UiPath documentation, due to the lack of tutorials and examples. You can check these links.

3- What type of dataset should be added with document understanding machine learning model to train it?

Please have a look at this site as well.
P.S. If you have a solution for this, do let me know.

Happy Automation :slight_smile:

@nashrahkhan ,

Thank you for the feedback!

I responded to the first two questions on their respective threads, hope it will help!

We have a new docs space, only for Document Understanding, that you can find here: UiPath Document Understanding - we are now working on putting more and more documentation in there, so it will be easier to start with, configure, and use the Document Understanding components.

Please send us more feedback if some of the information you find in there is incomplete - or some parts are missing…

Thank you again!

Ioana

Hi @Ioana_Gligan

I have noticed that the “proxy.config” file has been removed from v2021.4. I am not sure whether this is an installation error or intentional. If it is intentional, can you share how we should configure proxy settings moving forward?

Hi @Lim_Leng,
This is not related to Document Understanding, but to the Robot proxy configuration. It has indeed changed. You can find up-to-date information here:

Hi @Pablito
Ah. Thanks for pointing that out. I should have posted this question in https://forum.uipath.com/t/uipath-community-2021-4-stable-release/302093
I will go over there now.

Thank you @Ioana_Gligan for this post, it is always great to have updates and new features :gem: :gem: :gem:

I can’t install UiPath.IntelligentOCR.Activities.
Can you help to solve the problem?

Hi @pitaty

Could you share your error message?

Hi @pitaty

Could you please check your firewall software? :slight_smile:
It might be the cause of the issue:

How to check? I can download other packages, but not this one

Does it happen for all versions of this package, or are you maybe able to download an older version?

Is it your own machine or is it maybe your company one? If the second, I would check with your local IT department.

Hi, this is a fantastic update. My previous DU processes always used the framework from the Academy training. I basically copied the solution from the final practice as my starting point and updated the workflows to my needs (e.g. passing the consumption of documents to an Orchestrator queue). How do you suggest I quick-start my new DU project? Can I still use the ‘template’ from the Academy practice exercise and work through each block to incorporate the updates, or would it be better to start a new project? Thanks, and keep it up!

I can see where these updates will make the tool easier to work with and more flexible. I was on calls with your team about a year ago when we were implementing. You saw what we were struggling with, understood how to make the product better and followed through. Kudos!

Hi UiPath,

Is it possible to define only a minimal setup in the Taxonomy, for example just the document type (receipt), and then use unsupervised machine learning on all the sample receipts to automatically pick up the document type details [fields and field types]? The user would then go back to Taxonomy Manager, select from a list of document types [fields and field types], and correct/update/add the fields or field types.