RPA Framework for Document Understanding

Alexandru-Luca · December 22, 2020, 9:23am

Hello everyone!

Our new RPA Framework for Document Understanding processes is now available for preview and review.

Key features:

Easy to get new Document Understanding projects started; usable in all cases - from small processes to complex solutions
Easy to integrate into larger automation flows
Production-ready; built-in logging, exception handling and retry mechanisms
Common architecture for both Attended and Unattended (+ Action Center) implementations
Meant to make development, deployment, debugging and scaling much easier.

I’ve created this 15-min video to give an overview of the solution and help anyone interested get started.

We encourage you to try it out and let us know what you think. We love getting your feedback - it is key for improving our solutions.

The solution is available for download as a ZIP archive or as a NuGet package usable as a Studio template.

mmcruzRPA · December 22, 2020, 2:16pm

Without a doubt, a very good base for processes with Document Understanding. Good job! :))

zell12 · December 23, 2020, 3:42am

Awesome one @Alexandru-Luca.

Question: Even when you have a multipage document containing several document types, explicitly ‘splitting’ the document into separate files using extract pdf range is still optional, right? You can still extract each classification result separately without splitting into multiple files, so you won’t have to perform redigitization.
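The idea in the question can be sketched in a few lines of Python. This is only an illustration, not the framework's code: `ClassifiedRange`, `pages_for` and the sample page texts are hypothetical names, and the 0-based page indexing is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ClassifiedRange:
    doc_type: str
    first_page: int  # 0-based index into the digitized pages (assumption)
    page_count: int


def pages_for(result, digitized_pages):
    """Slice the already-digitized pages that belong to one classification result."""
    return digitized_pages[result.first_page:result.first_page + result.page_count]


# Hypothetical output of a single digitization pass over a 3-page file.
digitized_pages = ["invoice p1", "invoice p2", "receipt p1"]
results = [ClassifiedRange("invoice", 0, 2), ClassifiedRange("receipt", 2, 1)]

# Each document type is read from the same digitized pages:
# no new files are written and nothing is digitized a second time.
per_document = {r.doc_type: pages_for(r, digitized_pages) for r in results}
```

Because every classification result carries its own page range, extraction can work on a slice of the original digitization instead of on a separate file.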

ashwin.ashok · December 23, 2020, 6:00am

This is really great news!
Keep up the good work!

T0Bi · December 23, 2020, 9:00am

@zell12 exactly.

In fact, for this workflow it might even be useful to combine several PDFs into one.

Imagine a scenario where you have to extract data from email attachments. Often there’s not only the attachment you need, but also others that are unnecessary. To know which ones are useful and which aren’t, you need to digitize and analyse them.

As this framework uses one job per input file, the easiest solution is to combine your attachments into a single PDF in the dispatcher (dispatcher reads mail, combines attachments, starts DU job).

The framework then does the whole job of finding out which documents (pages) are useful (classification) and extracting the needed data.
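One practical detail of this approach is keeping track of which pages of the merged PDF came from which attachment. A minimal Python sketch of that bookkeeping follows; it does not merge any PDFs itself, and `Attachment`, `merged_page_ranges`, `source_of_page` and the file names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Attachment:
    name: str        # hypothetical attachment file name
    page_count: int  # number of pages in that attachment


def merged_page_ranges(attachments):
    """Return {name: (first_page, last_page)}, 1-based and inclusive,
    for attachments concatenated in the given order."""
    ranges = {}
    next_page = 1
    for att in attachments:
        ranges[att.name] = (next_page, next_page + att.page_count - 1)
        next_page += att.page_count
    return ranges


def source_of_page(page, ranges):
    """Find which attachment a page of the merged PDF came from."""
    for name, (first, last) in ranges.items():
        if first <= page <= last:
            return name
    raise ValueError(f"page {page} is outside the merged document")


mail = [
    Attachment("invoice.pdf", 2),
    Attachment("terms.pdf", 5),
    Attachment("receipt.pdf", 1),
]
ranges = merged_page_ranges(mail)
```

With a map like this, a classification result that covers pages 1 to 2 of the merged file can be traced back to `invoice.pdf` if the original attachment is needed later.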

@Alexandru-Luca

very good job! I really like it!

seanrockvz13 · December 23, 2020, 4:05pm

Excellent Work @Alexandru-Luca!! Thanks to you and the whole UiPath team!

prasath17 · December 23, 2020, 6:16pm

Hi @Alexandru-Luca - I am not able to add the nupkg … am I missing something?

mmcruzRPA · December 24, 2020, 11:24am

Hi @prasath17
You should add the nupkg to the folder that your setup searches for templates.

Where the nupkg should be saved
Click on more templates

Here it is!

copy_writes · December 26, 2020, 7:11am

wow

wagner · December 26, 2020, 12:06pm

Does it work with the current REFramework?

There are teams that work only with the REFramework.

jamesjacobsydney · December 27, 2020, 8:23am

Thank you @Alexandru-Luca! This is really great.

Alexandru-Luca · December 28, 2020, 2:30pm

Thank you, everyone, for your feedback! It is very much appreciated, please keep it coming!

@zell12 - indeed, document splitting is optional. In fact, most processing steps are optional and should only be used if needed. As for the splitting - it’s more of a UX improvement; a bit of a workaround to allow the person doing the data validation to only view the page-range they should be checking. It should make for a better user-experience and make validation a bit less prone to user-error. This optional step will go away once page-range support is implemented in the Validation Station/Validation Action.

@mmcruzRPA Thank you for explaining how to set up the template

@wagner - the new framework is meant to be used only for implementing Document Understanding processes (and only such processes). It should be used in conjunction with the RE-Framework, for any automation process coming before or after the Document Understanding part.

ctutu · January 21, 2021, 4:15pm

Hey Alex,

Can you please offer a practical example of how the DU framework would be used in conjunction with queues?

As a quick example: the Dispatcher puts files in Q1; Q1 has a trigger that starts DU for each item uploaded by the Dispatcher; when DU finishes, it creates a queue item in Q2, which waits for the performer to pick it up.

The upsides of using queues are too many to mention. The downside is that the Q1 items would be abandoned if not validated by HiL within 24 hrs (if that’s the case). Is there any workaround for this, considering I want queues end to end?
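The flow described above can be sketched as a small in-memory simulation in Python. The two `deque` objects only stand in for the Orchestrator queues Q1 and Q2; no Orchestrator API is used, and `dispatcher`, `du_stage`, `performer` and `fake_du` are hypothetical names.

```python
from collections import deque


def dispatcher(files, q1):
    """Put one item per input file into Q1."""
    for path in files:
        q1.append({"file": path})


def du_stage(q1, q2, process_document):
    """Take each Q1 item, run the DU step on it, and put the result into Q2."""
    while q1:
        item = q1.popleft()
        q2.append({"file": item["file"], "data": process_document(item["file"])})


def performer(q2):
    """Pick up every finished item from Q2."""
    handled = []
    while q2:
        handled.append(q2.popleft())
    return handled


def fake_du(path):
    # Stand-in for digitize / classify / extract / validate.
    return {"doc_type": "invoice", "source": path}


q1, q2 = deque(), deque()
dispatcher(["a.pdf", "b.pdf"], q1)
du_stage(q1, q2, fake_du)
results = performer(q2)
```

The simulation finishes instantly, which hides the real problem raised here: while a person is still validating a document, its Q1 item stays in progress for as long as the validation takes.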

Alexandru-Luca · January 22, 2021, 10:34am

Hi @ctutu!

Agreed, using queues definitely has huge advantages and we all prefer to use that approach.

The bad news: the only workaround at the moment would be to change an Orchestrator config setting that dictates how much time passes before a queue item becomes abandoned. However, we recommend against this approach: the setting affects all queues, and the potential for running into undesirable problems is quite high. Alternatively, you can allow queue items to become abandoned for a while, but this is also undesirable.
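To make the timing issue concrete, here is a small Python sketch of the check. It assumes the 24-hour figure mentioned in the question and a simple "elapsed time reaches the limit" rule; how Orchestrator actually evaluates this is not shown here, and `ABANDON_AFTER` and `would_be_abandoned` are hypothetical names.

```python
from datetime import datetime, timedelta

# Assumption: the 24-hour limit quoted earlier in the thread.
ABANDON_AFTER = timedelta(hours=24)


def would_be_abandoned(started_at, now, limit=ABANDON_AFTER):
    """True if an item that went in progress at `started_at` has been open
    for at least `limit` by `now` (assumed rule, for illustration only)."""
    return now - started_at >= limit


started = datetime(2021, 1, 21, 9, 0)
validated_quickly = would_be_abandoned(started, started + timedelta(hours=2))
validated_after_weekend = would_be_abandoned(started, started + timedelta(hours=70))
```

A document validated within a couple of hours is fine, while one that waits over a weekend for human validation crosses the limit, which is why long-running support in queues matters for this scenario.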

The good news: the Orchestrator product team is working on bringing long-running support to the queues. When this feature is introduced, the framework will be updated to support queues, as well.

Cheers,
Alex.

Chen_Kenny · January 25, 2021, 3:27am

Could you help with English subtitles?
Please ~~~

Alexandru-Luca · January 26, 2021, 1:54pm

Hi @Chen_Kenny!

Unfortunately, I don’t have any subtitles prepared. But please feel free to download the video itself and use an app that auto-generates captions.

_maan · January 29, 2021, 6:07am

Hi @Alexandru-Luca,

What do you mean by licensing the robot as an Unattended robot?
Does it have access to both in a Modern Folder, etc.? I’ve tried creating the Dispatcher as shown in your tutorial, but I’m getting errors on the Action Center part: the job always faults. Can you show us the setup in your Orchestrator?

Alexandru-Luca · January 29, 2021, 11:08am

Hi @_maan!

The most common issue is caused by the robot not having proper access rights to the storage bucket or to the stored files.

Can you please share what error(s) you are facing? A screenshot and a stack trace would be most helpful.

An FYI for everyone: an update has been published for the template - v1.0.2. The links in the initial announcement message point to that version now.
