Orchestration of Document Understanding Model

azeem_rosli · January 24, 2023, 12:54pm

Hi,

I’m using the Intelligent Classifier and Form Extractor for document extraction. How can I store my training data in the cloud so that I can test classification and extraction in a development environment, without having to publish my project every time I train the classifier and form extractor? Also, where is the training dataset for the Form Extractor stored?

supermanPunch · January 27, 2023, 4:07pm

Hi @azeem_rosli ,

I believe that for the classification part, a Keyword.Json/IntelligentKeyword.Json file is created when the classifiers are implemented and configured.

To use this file in a different environment once deployed, you might need to add a step in the process that uploads it to a Storage Bucket in Orchestrator. That way, every time the project/classifier is used, an updated version of the file also exists in Orchestrator, and you should be able to use it in your other environment by downloading it.
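As a rough illustration of that upload step, here is a minimal Python sketch of the logic only, not of any UiPath API. It checks that the training file is valid JSON before handing it to an uploader, and returns a checksum that the other environment could compare against after downloading. The function names `publish_training_file` and `is_current`, and the `upload` callable, are hypothetical; in a real workflow the upload itself would be done by whatever writes to the Storage Bucket.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def publish_training_file(local_path: str, upload: Callable[[str, bytes], None]) -> str:
    """Validate a classifier training file, pass it to `upload`, return its SHA-256.

    `upload` is a hypothetical callable (remote_name, content) standing in for
    the step that actually writes the file to the Storage Bucket.
    """
    data = Path(local_path).read_bytes()
    json.loads(data)  # raises ValueError if the file is not valid JSON
    upload(Path(local_path).name, data)  # keep the same file name in the bucket
    return hashlib.sha256(data).hexdigest()


def is_current(local_path: str, expected_digest: str) -> bool:
    """Return True if the downloaded copy matches the published checksum."""
    data = Path(local_path).read_bytes()
    return hashlib.sha256(data).hexdigest() == expected_digest
```

Validating first means a half-written or corrupted training file never replaces a good copy in the bucket, and the checksum gives the Dev side a cheap way to confirm it is testing against the latest version.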

As for the Form Extractor, no training actually takes place when different documents are fed to it. It simply performs the extraction based on the rules defined earlier, so there are no training files defined or used for it.

azeem_rosli · January 27, 2023, 8:49pm

Hi @supermanPunch, thanks for the reply

Have you ever used Action Center for training DU? I’m wondering whether Document Validation can train the classifier too. If so, I would like to know if there are any limitations to that method.

However, the Storage Bucket solution is great. I would like to try it if I cannot train my classifier using Action Center.

Yeah, you’re right about the Form Extractor. I should replace it with the ML Extractor.

Regards,
Azeem

supermanPunch · January 28, 2023, 6:47pm

@azeem_rosli, yes. In a couple of cases we used the Validation/Classification Station for manual classification or extraction by the user. Classifiers can also be trained through Action Center, but the trained information is stored only in the Training.json file (if the Classifier Trainer Scope is used).

I believe that, for now, you would need to use the suggested method to get the latest training files for testing in the Dev environment. This could also be passed on as feedback to the UiPath DU team, who might consider an easier way to handle this situation.
