UiPath Document Understanding - Train ML Models for Document Understanding - Additional exercise

Submit your additional exercise workflow

  1. Upload the following files:
  • Zip file of the workflow solution (dispatcher and Document Understanding)
  • Extraction results in Excel format, including OCR and extractor confidence levels
  • Automatically generated fine-tune data in zip format
  • The dataset exported from the Document Manager (before and after adding fine-tune data), as separate zip files

  2. Openly discuss any issues or doubts, including screenshots for clarity, and receive guidance from a UiPath MVP if you encounter roadblocks in the advanced exercise.

  3. Expect swift feedback: the MVP reviews workflows and offers suggestions every Wednesday.

Purchase_Performer_DU.zip (446.7 KB)

Purchase_Dispatcher_DU.zip (360.1 KB)

ABC Tech PO.xlsx (8.9 KB)
IT Supplies.xlsx (9.1 KB)
Office Pro Suppliers PO.xlsx (8.9 KB)

@Lahiru.Fernando Sir

Hello @Jai_Pande

Thanks for submitting your solution. Great to see your continuous engagement with the learning plan. :slight_smile:

I have reviewed the solution you submitted. The Dispatcher and Performer code looks OK.

Dispatcher

It’s fine to keep the code simple, as it just takes files from the drive. However, as a best practice, you can also implement a mechanism that checks for duplicate file uploads before adding items to the queue :slight_smile:
It’s not needed for this assignment. Just sharing some points for you to think about :slight_smile:
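To make the duplicate-check idea concrete, here is a minimal Python sketch of the logic only. It is not UiPath code, and the function names (`file_fingerprint`, `select_new_files`) are hypothetical. It assumes a "duplicate" means a file with identical content, and it keeps the fingerprints already seen in a simple in-memory set. In a real dispatcher you would need to persist that set somewhere, or rely on a unique reference per queue item instead.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file's content, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def select_new_files(paths, seen_fingerprints):
    """Return (path, fingerprint) pairs for files not seen before.

    `seen_fingerprints` is a set that is updated in place, so files that
    repeat within the same batch are also skipped.
    """
    new_files = []
    for path in paths:
        path = Path(path)
        fingerprint = file_fingerprint(path)
        if fingerprint in seen_fingerprints:
            continue  # same content was already queued, skip it
        seen_fingerprints.add(fingerprint)
        new_files.append((path, fingerprint))
    return new_files
```

The fingerprint could then be used as the reference of the queue item, so the same document is not queued twice even if it is uploaded under a different file name.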

Performer - DU Flow

The flow looks great. However, I would recommend using a For Each / Parallel For Each loop to iterate through the classification results array (the output of the Classification Station). Irrespective of the classifier we use, it is a best practice to always wrap the activities that follow classification inside a loop, so the flow works for all types of classification outputs. Additionally, can you explain why you used the Intelligent Keyword Classifier? Was there a specific reason, or did you just use it to get more practice?
[Just asking to understand the idea behind it… Nothing wrong with using it :slight_smile: ]
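As a rough illustration of why the loop matters, here is a small Python sketch of the pattern. It is not UiPath code: `ClassifiedDocument` and `extract_all` are hypothetical stand-ins for the classification results array and the activities that follow it. The assumption is that a classifier may return zero, one, or several results for a single file, and that each result should be routed to the extractor for its own document type.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ClassifiedDocument:
    """Hypothetical stand-in for one entry of the classification results."""
    document_type: str
    first_page: int
    last_page: int


Extractor = Callable[[ClassifiedDocument], dict]


def extract_all(
    results: List[ClassifiedDocument],
    extractors: Dict[str, Extractor],
) -> List[Tuple[ClassifiedDocument, Optional[dict]]]:
    """Run the matching extractor for every classification result.

    Results with no registered extractor are kept with None, so they can
    be sent for manual review instead of being silently dropped.
    """
    outputs = []
    for result in results:
        extractor = extractors.get(result.document_type)
        fields = extractor(result) if extractor is not None else None
        outputs.append((result, fields))
    return outputs
```

Because everything after classification sits inside the loop, the same flow handles an empty result, a single document, and a file that was split into several documents, without any changes.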

Can you also share the output from the Document Manager so I can review how you have done the labeling and training in AI Center?
You can download this from the Document Manager or from Datasets.

Thanks
Lahiru

Hi @Jai_Pande,
Thanks for submitting the workflows for the advanced challenges! Your dedication and effort are truly appreciated. Great going!

Hi @Melisa_Miranda @Lahiru.Fernando

Please find attached the workflows and the dataset you mentioned. Please also note the following:

  1. Both the dispatcher and performer processes are available in the zip file
  2. Extracted results are available in the folder ‘Data\Output’ of the performer process
  3. The output of the train extractor activity is available in the folder ‘Data\Train’ of the performer process
    Labelled_Dataset_24-03-16T081712.zip (6.6 MB)
    DU_TrainMLModelChallenge.zip (1.5 MB)

Waiting for your feedback.
Thanks
Adharsh Chandran

Hello @ACJS

Thanks for your efforts in doing the challenge. I will review your solutions and write to you here… :slight_smile:

Stay tuned!

@Lahiru.Fernando, please find my submitted project below. I would appreciate your feedback. I found it very strange that the predictions folder is empty. Also, can you explain how to use the exported fine-tuned data directly in the dataset to re-train the model? Thank you very much in advance; I really appreciate your answers and feedback.
Exercise_AICenter_DU_Purchase_Orders_Dispatcher.zip (3.8 MB)
Exercise_AICenter_DU_Purchase_Orders_Performer.zip (9.4 MB)
POTraining-Extractor-Fine-Tuning.zip (371.0 KB)
ABC Tech PO_1-1.xlsx (8.8 KB)
IT Supplies_1-1.xlsx (9.0 KB)
Office Pro Suppliers PO_1-1.xlsx (8.9 KB)

I figured out the re-training part with the fine-tuned data. It is just quite strange for me to see the prediction sub-folder empty. Is this normal?

Hello…
Sorry for the late reply. The folder can be empty for various reasons, and that is normal. I have seen this happen too.

You can try uploading these folders into AI Center and running a training pipeline to see if it improves the accuracy.

Let me know what you find… Happy to help further if needed.

Thanks
Lahiru