Document Understanding Best Practices / REFramework

T0Bi · October 8, 2020, 9:16am

Dear Community,

I’m currently building a template for the whole document understanding process / life cycle and have run into a few questions about best practices for combining Document Understanding with the REFramework.

This is the basic process of document understanding according to UiPath:

In my case, I’ve added a step 3.5 as well: another validation station, like the one in step 5, for documents where the classifier’s confidence is below a certain threshold.
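
The routing decision behind step 3.5 can be sketched in a few lines. This is only an illustration in plain Python, not UiPath code: the threshold value, the step names and the `ClassificationResult` type are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical threshold; the thread does not state a concrete value.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class ClassificationResult:
    document_id: str
    document_type: str
    confidence: float  # expected range: 0.0 to 1.0


def next_step(result: ClassificationResult,
              threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return the name of the step a classified document goes to next."""
    if result.confidence < threshold:
        # Step 3.5: a person confirms or corrects the classification.
        return "classification_validation"
    # Confident enough: continue straight to data extraction.
    return "data_extraction"


low = ClassificationResult("doc-1", "invoice", 0.55)
high = ClassificationResult("doc-2", "invoice", 0.97)
print(next_step(low), next_step(high))
```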

So my process basically looks like this:

My big question is how to split up the process to maximize parallelism. I can’t block the process until a user has finished their classification or extraction tasks, because I might have to extract data from hundreds or thousands of documents.

So I was thinking about using queues between a lot of the steps and having several different bots (or rather processes).

Example: Once a document is classified, the classification results are put into a queue for further processing, regardless of whether the bot or a person did the classification. The same goes for data extraction: the results are put into a queue for further processing. I’ll probably also need queues for retraining the models.

This would mean I’d have to split my process into 3 different parts connected by queues: Digitizing + Classification, Data Extraction and further processing of the extracted data.

Each of those parts would then use the REFramework.
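
To make the idea concrete, here is a minimal sketch of three stages that only communicate through queues. It uses in-memory Python deques as stand-ins for Orchestrator queues; the function names and the item layout are hypothetical, and the classify / extract / handle callables are placeholders for the real work.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


def classify_stage(files: List[str], classify: Callable[[str], str],
                   classified: Deque[Dict]) -> None:
    """Part 1: digitize + classify, then hand over via the first queue."""
    for path in files:
        classified.append({"file": path, "doc_type": classify(path)})


def extract_stage(classified: Deque[Dict], extract: Callable[[Dict], Dict],
                  extracted: Deque[Dict]) -> None:
    """Part 2: take classified items, extract data, hand over via the second queue."""
    while classified:
        item = classified.popleft()
        extracted.append({**item, "fields": extract(item)})


def post_process_stage(extracted: Deque[Dict],
                       handle: Callable[[Dict], str]) -> List[str]:
    """Part 3: further processing of the extracted data."""
    results = []
    while extracted:
        results.append(handle(extracted.popleft()))
    return results


classified, extracted = deque(), deque()
classify_stage(["a.pdf", "b.pdf"], lambda path: "invoice", classified)
extract_stage(classified, lambda item: {"total": "unknown"}, extracted)
summaries = post_process_stage(
    extracted, lambda item: f"{item['file']}:{item['doc_type']}")
print(summaries)
```

Because each stage only reads from one queue and writes to the next, the stages can run as separate processes on separate robots, and a human validation step can write into the same queue as the bot.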

Another idea would be to create an Orchestration Process and run it for every single file that needs to be processed. This is definitely easier to implement, but at times there might be hundreds of processes or more waiting for user input, and I’m not sure how that is handled.

The first approach is definitely scalable by simply adding more robots; I’m not sure about the second.

It’d be great if we could discuss these approaches, or if you could even share your own way of tackling the whole document understanding process.

T0Bi

Alexandru-Luca · October 12, 2020, 4:35pm

Hi @T0Bi!

Your observations are absolutely spot on!

We are currently exploring both approaches you mentioned:

1. Splitting the processing flow into smaller sub-processes that pass data to each other through Queues. Orchestration Processes will do just fine; you don’t necessarily need the REF as long as you take care of exceptions & retry mechanisms.
2. Using an Orchestration Process to process an input file end-to-end.

In the first scenario, make sure you don’t leave any queue items as In Progress when a suspension point is reached. Otherwise, if the action is not completed within 24h, the item would get Abandoned.
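
A toy model of that rule, purely for illustration: it is plain Python, not the Orchestrator API, and the `QueueItem` type, the helper names and the sweep function are assumptions. Only the 24h limit and the status names come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

ABANDON_AFTER = timedelta(hours=24)  # the 24h limit mentioned above


@dataclass
class QueueItem:
    reference: str
    status: str = "New"
    started_at: Optional[datetime] = None


def start_item(item: QueueItem, now: datetime) -> None:
    item.status = "In Progress"
    item.started_at = now


def complete_item(item: QueueItem) -> None:
    """Close the transaction before the process suspends for a human action."""
    item.status = "Successful"


def sweep_abandoned(items: List[QueueItem], now: datetime) -> None:
    """Mark items that stayed In Progress past the limit as Abandoned."""
    for item in items:
        if (item.status == "In Progress" and item.started_at is not None
                and now - item.started_at >= ABANDON_AFTER):
            item.status = "Abandoned"


t0 = datetime(2020, 10, 12, 9, 0)
closed, left_open = QueueItem("doc-1"), QueueItem("doc-2")
start_item(closed, t0)
start_item(left_open, t0)
complete_item(closed)  # finished before the suspension point
sweep_abandoned([closed, left_open], t0 + timedelta(hours=25))
print(closed.status, left_open.status)
```

In this model only the item whose transaction was closed before the wait survives a validation task that takes longer than a day.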

In the second scenario, a dispatcher process simply starts a job for every input file. Scaling it is as simple as adding more robots to the “processing pool” (environment or modern folder, depending on the case).
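
The dispatcher itself can stay very small. The sketch below is an assumption-laden illustration in plain Python: `start_job` is a hypothetical callable standing in for whatever actually starts the Orchestration Process for one file, and the returned job identifiers are made up.

```python
from typing import Callable, Dict, Iterable


def dispatch(files: Iterable[str],
             start_job: Callable[[str], str]) -> Dict[str, str]:
    """Start one job per input file and return a file -> job id mapping."""
    jobs = {}
    for path in files:
        # Each file gets its own end-to-end job; which robot picks it up
        # is left to the processing pool.
        jobs[path] = start_job(path)
    return jobs


started = []


def fake_start_job(path: str) -> str:
    # Stand-in for the real job start; just records the call.
    started.append(path)
    return f"job-{len(started)}"


jobs = dispatch(["a.pdf", "b.pdf"], fake_start_job)
print(jobs)
```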

Just for awareness, without any promises on availability: we are working on an out-of-the-box Studio RPA template that would implement the logging, error handling & retry mechanisms specific to Document Understanding processes.

Cheers,
Alex.

T0Bi · October 13, 2020, 9:51am

Hi @Alexandru-Luca

thanks for your answer!

I hadn’t considered that leaving queue items In Progress would lead to them getting “Abandoned” after a while; that’s a really good point.

For now I’ll stick with a dispatcher and an Orchestration Process, since it seems like the easiest way.

I’m looking forward to seeing such a template; I think it would make the Document Understanding process faster and easier to implement.

Cheers,
T0Bi

David_Hampton · May 2, 2021, 11:19am

Hi @T0Bi,
Are there any updates on your progress or UiPath’s progress on combining the REFramework with Document Understanding best practices?

Thanks!
David

wasea · May 2, 2021, 11:44am

Hi @David_Hampton,

Maybe this post can help you:

Vasile.

T0Bi · May 3, 2021, 8:09am

As @wasea already stated, we’re using the Document Understanding Framework as well.

It’s actually really, really good.

Kumar_Nelesh · February 24, 2023, 6:59am

Hi, can I get a workflow demo for Document Understanding using an Orchestration Process?

Konrad_Mierzwa · February 24, 2023, 9:11am

Hi, go to the UiPath Studio start page, click “More Templates” and search for “Document Understanding Process”.
