Here we are, another few months have passed… and we haven’t written you any news in such a long time… But I believe you’ll excuse us for not making small talk while waiting to be able to make some pretty big announcements (however nice small talk may be)!
So hoping this letter finds you well, let’s pick up from where we left off in February, when we announced some really nice little features:
- New Geo-Localization of the Document Understanding Services
- Data Manager – Relaxed Limits for Field Definitions
- UiPath Document OCR – Checkbox Detection Capabilities
- Document Validation – Usability Enhancements, with
- Text Version Improvements, and
- Extracted Value Display Improvements.
We strongly recommend you try them out - they will definitely bring a breath of fresh air to your DU processes.
And then, keep going for the new stuff!
Until now, the Form Extractor and Intelligent Form Extractor activities could be used for processing fixed format documents with no variation: the RPA developer would define templates based on page layouts, and data would always be identified based on its relative position against the page template definition.
But we never stop building stuff, so we’re now ready to show you some pretty cool new features related to form extraction:
The latest version of the Document Understanding package UiPath.IntelligentOCR.Activities now allows for more flexibility when defining your form extraction rules, by supporting Field Level Anchor Settings.
What this means is that you can now open your Template Editor (for an existing template or by creating a new one), and you’ll have some new options available:
- Make sure you are on the Anchor Selection mode (above the document view)
Draw a box around the value area.
- A Blue box will appear as a Target for your anchor setting.
- Drawing a box means Click → Drag → Release.
- in the example, it contains the area under the Employer Identification Number label.
Select a Label (main anchor) for your value area.
- A Dark Green box will appear around the Words (Tokens) in your selected Label.
- Selecting a Label means either Click on first word → Ctrl+Click on last word of selection, OR Click → Drag → Release to capture a word range.
A Label can only contain words from the same visual line.
- in the example, it contains the “Employer identification number” selection
Select any additional anchors that would uniquely identify your Label.
- A Lighter Green box will appear around the Words of your additional anchors.
- Selection is done just like for the Label
- in the example, there are two additional anchors: the “Taxpayer Identification Number (TIN)” selection, and the “Certification” selection.
- Assign your anchor construct to the appropriate field (just like you’d assign a regular value).
- Highlight your anchor
- Make changes to it (delete any anchors, the label, even the value area if you wish, and then make any changes like above)
- Use the Replace Value option to update your field association.
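To picture how a Label and its value area relate (the value area is located relative to the Label, its main anchor), here is a minimal Python sketch. It is an illustration only, not the Form Extractor's actual implementation: the `Box` type, the function names, and the coordinate units are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (hypothetical units)."""
    x: float
    y: float
    width: float
    height: float


def offset_from_label(label: Box, value_area: Box) -> tuple[float, float]:
    """Design time: remember where the value area sits relative to the Label."""
    return (value_area.x - label.x, value_area.y - label.y)


def locate_value_area(found_label: Box, offset: tuple[float, float],
                      size: tuple[float, float]) -> Box:
    """Run time: rebuild the value area from wherever the Label was found."""
    dx, dy = offset
    width, height = size
    return Box(found_label.x + dx, found_label.y + dy, width, height)


# Template: the Label sits at (100, 200) and the value area is just below it.
template_label = Box(100, 200, 180, 12)
template_value = Box(100, 215, 180, 20)
offset = offset_from_label(template_label, template_value)

# New document: the same Label was found 40 units further down the page.
shifted_label = Box(100, 240, 180, 12)
value_box = locate_value_area(
    shifted_label, offset, (template_value.width, template_value.height)
)
print(value_box)
```

In this sketch the value area follows the Label down the page, which is the intuition behind fields being able to move within a document while keeping their position relative to the anchor.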
You can define as many templates as you want for the same document type. You can have multiple page level templates, multiple anchors for the same field, even a template containing both page level as well as field level anchors!
The magic part is that at run-time, the Form Extractor and Intelligent Form Extractor know how to:
- identify if a page level template matches, and extract information according to the best page level template match it identifies;
- identify if any anchor based settings match, and extract information according to their application in the document to be processed;
- compute appropriate confidence scores for all possible matches, so as to be able to report the best result (highest probability match) of all options available!
By the way, if for a single-value field you do get more than one potential result - you’ll get the highest confidence one reported and shown as the value, but in Validation Station you’ll also see the other matches as suggestions, to help you in choosing a different one if the best match is not your desired output.
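The "best result plus suggestions" behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` type, the `source` labels, and the confidence numbers are made up, and the way the extractors actually compute confidence is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    value: str
    confidence: float
    source: str  # hypothetical tag: "page-template" or "field-anchor"


def pick_result(candidates):
    """Return (best, suggestions), ranked by descending confidence."""
    if not candidates:
        return None, []
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    return ranked[0], ranked[1:]


# Made-up matches for a single-value field.
candidates = [
    Candidate("12-3456789", 0.81, "page-template"),
    Candidate("12-3456780", 0.93, "field-anchor"),
    Candidate("98-7654321", 0.42, "field-anchor"),
]

best, suggestions = pick_result(candidates)
print("reported value:", best.value)
print("suggestions:", [s.value for s in suggestions])
```

The highest-confidence candidate becomes the reported value, while the remaining ones stay available in order, much like the suggestions shown in Validation Station.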
- when defining field level anchors
- make sure your Label is close to your value area,
- make sure your Label is supported by additional anchors if the same text construct can be found in multiple places within the same document
- remember that there is a fuzzy matching algorithm in place to adjust to potential OCR errors. This means “here” and “hare” are pretty close (Levenshtein-wise, right?). So
- the longer your labels and anchors, the more precision you’ll get
- the value area is always computed based on its relative position against your Label (main anchor)
- this might influence the way you choose your main anchors.
- using field level anchors will give you more flexibility in document layout changes
- having field level anchors will allow fields to move within the template (but not with respect to the anchor selections), and still be captured…
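To make the "Levenshtein-wise" remark in the tips above concrete, here is a small Python sketch of the classic edit distance. It is not the matching code used by the extractors (their exact algorithm and thresholds are not described here), and the `similarity` helper is a hypothetical normalization added purely for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalized score: 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


# Two different short words are a single edit apart.
print(levenshtein("here", "hare"))

# Two different long labels from the example form are many edits apart.
print(levenshtein("Employer identification number",
                  "Taxpayer Identification Number (TIN)"))

# A long label with one OCR slip still scores high against the original.
print(similarity("Employer identification number",
                 "Employer identificatian number"))
```

Short labels leave little room between "same text with an OCR error" and "different text", whereas a long label with one misread character is still far closer to the original than to any unrelated phrase.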
We are so thankful for all the feedback we got so far! Do keep it coming and we’ll do our best to build the product you need!
Yes, we gave our Document Understanding ML Models a special pair of glasses - the cool kind - that allow them to learn from and interpret checkboxes, as found by our UiPath Document OCR engine in the digitization phase. Yes, you guessed right, we are building on top of the UiPath Document OCR – Checkbox Detection Capabilities we released a bit earlier - and not only!
However, there is an important difference between the way checkboxes are handled in the Form/Intelligent Form Extractor, and the recommended way to handle them in ML Extractor. In Data Manager, when labelling documents, you would not train the model to recognize and extract the checkboxes themselves, but rather to recognize and extract the words/option names which are next to the checked checkboxes.
Here’s what labeling such a field looks like:
In the W-9 form example, you won’t be capturing the checkbox next to the “Individual/Sole Proprietor” option, but you would capture that THAT particular option is checked, and not some other (like “C Corporation” or “Partnership”).
The point of the checkbox is to draw attention to a word. Consequently, to train an ML model, you would label the word, not the checkbox itself. In many cases checkboxes may just be an X character or a Check character rather than an actual Checkbox character; sometimes a handwritten mark which is not even OCR-able could be used to mark a human selection on a form. The ML model can learn to associate all of these situations with the word next to the mark and thus report the correct value.
Check it out and let us know how this feature works for you. We love feedback.
Yaaay! All information in a single docs space - a dedicated place for all Document Understanding related knowledge!
Make sure you check out the new UiPath Document Understanding Guide, as THE place to go for DU related information.
Once you check out the new features, please take some time to leave us a note on how we did… Feedback is driving us forward and helping us make sure we are building the right thing for you!
The Document Understanding Team