UiPath Document Understanding - Train ML Models for Document Understanding - Additional exercise

Melisa_Miranda · September 5, 2023, 12:00pm

Submit your additional exercise workflow

Upload the following files:

Zip file of workflow solution (dispatcher and Document Understanding)
Extraction results in Excel format including OCR and Extractor Confidence levels.
Automatically generated fine-tune data in zip format.
The dataset exported from the Document Manager (before and after adding fine-tune data) separately as zip files.

Openly discuss any issues or doubts, including screenshots for clarity, and receive guidance from a UiPath MVP if you encounter roadblocks in the advanced exercise.
Expect swift feedback: the MVP reviews workflows and offers suggestions every Wednesday.

Jai_Pande · December 4, 2023, 8:18am

Purchase_Performer_DU.zip (446.7 KB)

Purchase_Dispatcher_DU.zip (360.1 KB)

ABC Tech PO.xlsx (8.9 KB)
IT Supplies.xlsx (9.1 KB)
Office Pro Suppliers PO.xlsx (8.9 KB)

@Lahiru.Fernando Sir

Lahiru.Fernando · December 11, 2023, 6:04am

Hello @Jai_Pande

Thanks for submitting your solution. Great to see your continuous engagement with the learning plan.

I have reviewed the solution you submitted. The Dispatcher and the Performer code look OK.

Dispatcher

It’s fine to keep the code simple, as it is just taking files from the drive. However, as a best practice, you can also implement a mechanism to check for duplicate file uploads to the Queue.
It’s not needed for this assignment. Just sharing some points for you to think about.
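
One possible way to think about the duplicate check, sketched in Python rather than as UiPath activities: fingerprint each file by its content and skip any file whose fingerprint has already been queued. The function names and the idea of using the digest as the queue item reference are assumptions made for illustration, not part of the submitted workflow.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's bytes."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large scans are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def select_new_files(paths, seen_references):
    """Return (path, reference) pairs for files whose content is not queued yet.

    seen_references is a set of digests already uploaded (hypothetical store);
    it is updated in place so repeated calls keep skipping known files.
    """
    new_items = []
    for path in paths:
        reference = file_fingerprint(Path(path))
        if reference in seen_references:
            continue  # same content was queued before, even under another name
        seen_references.add(reference)
        new_items.append((Path(path), reference))
    return new_items
```

Hashing the content rather than comparing file names means a renamed copy of the same purchase order is still caught, while two different files that happen to share a name are both queued.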

Performer - DU Flow

The flow looks great. I would, however, recommend using a For Each / Parallel For Each loop to iterate over the classification results array (the output of the Classification Station). Irrespective of the classifier we use, it is a best practice to always wrap the activities after classification inside a loop so the flow works for all types of classification outputs. Additionally, can you explain the reason for using the Intelligent Keyword Classifier? Was there a specific reason, or did you just use it to get more practice?
[Just asking to understand the idea behind it… Nothing wrong with using it.]
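
To illustrate the loop recommendation above, here is a rough Python sketch of the pattern, not the actual UiPath activities. The `ClassificationResult` class, its fields, and the handler mapping are hypothetical stand-ins; the point is only that the downstream steps run once per entry in the results array, so the same flow copes with zero, one, or several classified sections in a file.

```python
from dataclasses import dataclass


@dataclass
class ClassificationResult:
    """Hypothetical stand-in for one entry of the classification results array."""
    document_type: str
    start_page: int
    page_count: int


def process_classified_documents(results, handlers):
    """Run the matching handler for every classified section and collect outputs.

    handlers maps a document type to a callable that performs the downstream
    steps (extraction, validation, export) for that section.
    """
    outputs = []
    for result in results:
        handler = handlers.get(result.document_type)
        if handler is None:
            # Unknown document type: record it instead of failing the whole file.
            outputs.append((result.document_type, "skipped"))
            continue
        outputs.append((result.document_type, handler(result)))
    return outputs
```

Because nothing downstream assumes exactly one result, swapping the classifier for one that splits a file into several documents would not require restructuring the flow.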

Can you also share the output from the Document Manager so I can review how you have done the labeling and training in AI Center?
You can download this from the Document Manager or from the Datasets.

Thanks
Lahiru

Melisa_Miranda · December 11, 2023, 10:30am

Hi @Jai_Pande,
Thanks for submitting the workflows for the advanced challenges! Your dedication and effort are truly appreciated. Great going!

ACJS · March 16, 2024, 1:03pm

Hi @Melisa_Miranda @Lahiru.Fernando

Please find attached the workflows and dataset you mentioned. Please also note the following:

Both the dispatcher and performer processes are available in the zip folder
Extracted results are available inside the folder ‘Data\Output’ in the performer process
Output of the train extractor activity is available inside the folder ‘Data\Train’ in the performer process
Labelled_Dataset_24-03-16T081712.zip (6.6 MB)
DU_TrainMLModelChallenge.zip (1.5 MB)

Waiting for your feedback.
Thanks
Adharsh Chandran

Lahiru.Fernando · March 20, 2024, 1:41pm

Hello @ACJS

Thanks for your efforts in doing the challenge. I will review your solutions and write to you here…

Stay tuned!

khaled.ismail1 · April 3, 2024, 6:49pm

@Lahiru.Fernando, please find my submitted project below. I would appreciate your feedback, as I found it very strange that the predictions folder is empty. Also, can you explain how to use the exported fine-tuned data directly in the dataset to re-train the model? Thank you very much in advance; I really appreciate your answers and feedback.
Exercise_AICenter_DU_Purchase_Orders_Dispatcher.zip (3.8 MB)
Exercise_AICenter_DU_Purchase_Orders_Performer.zip (9.4 MB)
POTraining-Extractor-Fine-Tuning.zip (371.0 KB)
ABC Tech PO_1-1.xlsx (8.8 KB)
IT Supplies_1-1.xlsx (9.0 KB)
Office Pro Suppliers PO_1-1.xlsx (8.9 KB)

khaled.ismail1 · April 3, 2024, 7:33pm

I figured out the re-training part for the fine-tuned data. It is just quite strange for me to see the predictions sub-folder empty. Is this normal?

Lahiru.Fernando · May 1, 2024, 2:44am

Hello…
Sorry for the late reply. It can be empty for various reasons, and that is normal. I have seen this happen too.

You can try uploading these folders into AI Center and running a training pipeline to see if it improves the accuracy.

Let me know what you find… Happy to help further if needed.

Thanks
Lahiru
