Orchestration of Document Understanding Model


I’m using the Intelligent Keyword Classifier and the Form Extractor for document extraction. How can I store my training data in the cloud so that I can test classification and extraction in a development environment without having to publish my project every time I train the classifier and form extractor? Also, where is the training dataset for the Form Extractor stored?

Hi @azeem_rosli ,

For the classification part, I believe a Keyword.Json/IntelligentKeyword.Json file is created when the classifiers are implemented/configured.

To use this file in a different environment once the project is deployed, you could add a step to the process that uploads it to a Storage Bucket in Orchestrator. That way, every time the project/classifier runs, an updated version of the file is also stored in Orchestrator, and you can download it in your other environment.
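If you prefer to push the file from outside the workflow, the upload step could be sketched roughly like this in Python. This is only a sketch under assumptions: the `GetWriteUri` endpoint shape and the `X-UIPATH-OrganizationUnitId` header are written from memory of the Orchestrator OData API and should be checked against the Swagger page of your own Orchestrator, and the base URL, bucket id, folder id, and blob path are hypothetical placeholders. It only builds the request that asks Orchestrator for an upload link. Sending the file bytes to the returned link is left out.

```python
from urllib.parse import urlencode

# Assumption: action name recalled from the Orchestrator OData API.
# Verify it in your Orchestrator's Swagger UI before relying on it.
WRITE_URI_ACTION = "UiPath.Server.Configuration.OData.GetWriteUri"


def build_write_uri_url(base_url: str, bucket_id: int, blob_path: str,
                        content_type: str = "application/json") -> str:
    """Build the URL that asks Orchestrator for an upload link for one file."""
    # urlencode percent-encodes the values, so "/" in the path becomes %2F.
    query = urlencode({"path": blob_path, "contentType": content_type})
    return (f"{base_url.rstrip('/')}/odata/Buckets({bucket_id})/"
            f"{WRITE_URI_ACTION}?{query}")


def build_headers(token: str, folder_id: int) -> dict:
    """Headers for the request: bearer token plus the folder owning the bucket."""
    return {
        "Authorization": f"Bearer {token}",
        # Assumption: this header selects the Orchestrator folder.
        "X-UIPATH-OrganizationUnitId": str(folder_id),
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(build_write_uri_url("https://example.invalid/orchestrator_",
                              7, "du/IntelligentKeyword.json"))
```

Inside a workflow, the same idea stays a single upload step at the end of the process; the sketch is only meant to show which pieces of information (bucket, folder, file path) that step needs.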

As for Form Extractor training, no training actually happens when different documents are fed to it. It simply performs the extraction based on the rules defined earlier, so no training files are defined or used for it.

Hi @supermanPunch, thanks for the reply :slight_smile:

Have you ever used Action Center for training DU? I'm wondering whether Document Validation is able to train the classifier too. If so, I would like to know whether that method has any limitations.

That said, this solution is great. I would like to try it if I cannot train my classifier using Action Center.

Yeah, you’re right about the Form Extractor. I should replace it with the ML Extractor.


@azeem_rosli, yes. In a couple of cases we used the Validation/Classification Station for manual classification or extraction by the user. Classifiers can also be trained through Action Center. However, the trained information is stored only in the Training.json file (if the Classifier Trainer Scope is used).

For now, I believe you would need to use the suggested method to get the latest training files for testing in the dev environment. This could also be raised as feedback to the UiPath DU team, so they can consider an easier way to handle this situation.