How can I train my ML Extractor?

It creates 3 folders, but only two of them (document and metadata) contain data.

The prediction folder is empty.

How can I set up my ML model for continuous training?

Hi @H_khot ,

Could you check whether you provided the Alias name when configuring the Extractors?
If not, could you try adding it? The same needs to be done in the Train Extractors Scope.

Yes, you were right. Now it is producing predictions as well.

I passed the same Framework Alias to both the ML Extractor and the Trainer.

@supermanPunch Can you please help me with how to make a continuous-learning ML model?

I have exported the folder that contains the 3 folders (document, metadata, predictions).

I created a training pipeline to train the ML Skill on the new files.

But it throws an error saying there is an ML Package issue.

@H_khot ,

Could you provide a screenshot of the configuration done for the training pipeline?

Also, could you provide more details on the ML logs generated?

@supermanPunch Thank you in advance for helping.

On my machine it creates a folder named MachineLearningExtractorTrainer.

It uploads the documents directly to the cloud path, inside a folder with the same name.

The error it throws:

Train only of InvoiceExtractionMLPackage 23.4.1.1 launched - Run 87be00d6-dca9-4bd7-9d6b-f4149d99eadc
Train only of InvoiceExtractionMLPackage 23.4.1.1 started - Run 87be00d6-dca9-4bd7-9d6b-f4149d99eadc
Train only of InvoiceExtractionMLPackage 23.4.1.1 scheduled - Run 87be00d6-dca9-4bd7-9d6b-f4149d99eadc
Train only of InvoiceExtractionMLPackage 23.4.1.1 failed - Run 87be00d6-dca9-4bd7-9d6b-f4149d99eadc

Error Details : Pipeline failed due to ML Package Issue

2023-07-29 06:06:29,995 - UiPath_core.trainer_run:main:74 - INFO: Starting training job…
2023-07-29 06:06:32,455 - matplotlib:_get_config_or_cache_dir:526 - WARNING: Matplotlib created a temporary config/cache directory at /tmp/matplotlib-f96glnr9 because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2023-07-29 06:06:32,645 - matplotlib.font_manager:_load_fontmanager:1544 - INFO: generated new fontManager
2023-07-29 06:06:33,897 - UiPath_core.storage.azure_storage_client:download:118 - INFO: Dataset from bucket folder training-cfef20ef-6d20-457d-91aa-70327e3e1236/69aa9f81-a0cb-433a-9c3b-4786c091e196/c56970c1-e14f-458e-91f5-d39ae7ba2748/MachineLearningExtractorTrainer with size 4 downloaded successfully
2023-07-29 06:06:33,897 - UiPath_core.training_plugin:train_model:129 - INFO: Start model training…
2023-07-29 06:06:33,897 - UiPath_core.training_plugin:initialize_model:123 - INFO: Start model initialization…
2023-07-29 06:06:33,898 - root:initialize_package:195 - INFO: Using package type provided by runtime argument with value: invoices
2023-07-29 06:06:33,898 - root:initialize_package:204 - INFO: Initializing invoices package options …
2023-07-29 06:06:33,899 - root:_valid_doctype_folder_structure:98 - ERROR: images/ does not exist / is empty for invoices dataset
2023-07-29 06:06:33,899 - UiPath_core.training_plugin:model_run:189 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Document type invoices not valid, check that document type data is in dataset folder and follows folder structure.
2023-07-29 06:06:33,900 - UiPath_core.trainer_run:main:91 - ERROR: Training Job failed, error: Document type invoices not valid, check that document type data is in dataset folder and follows folder structure.
Traceback (most recent call last):
  File "/model/bin/UiPath_core/trainer_run.py", line 86, in main
    wrapper.run()
  File "/workspace/model/microservice/training_wrapper.py", line 64, in run
    return self.training_plugin.model_run()
  File "/model/bin/UiPath_core/training_plugin.py", line 205, in model_run
    raise ex
  File "/model/bin/UiPath_core/training_plugin.py", line 181, in model_run
    self.run_train_only()
  File "/model/bin/UiPath_core/training_plugin.py", line 268, in run_train_only
    score = self.train_model(self.local_dataset_directory)
  File "/model/bin/UiPath_core/training_plugin.py", line 131, in train_model
    response = self.model.train(directory)
  File "/model/bin/UiPath_core/training_plugin.py", line 119, in model
    self.initialize_model()
  File "/model/bin/UiPath_core/training_plugin.py", line 125, in initialize_model
    self._model = train.Main()
  File "/workspace/model/microservice/train.py", line 21, in __init__
    self.opt = package_util.initialize_package(args)
  File "", line 206, in initialize_package
  File "", line 144, in get_package_opt
  File "", line 78, in configure_pipeline_options
  File "", line 139, in configure_options
Exception: Document type invoices not valid, check that document type data is in dataset folder and follows folder structure.
2023-07-29 06:06:33,901 - UiPath_core.trainer_run:main:98 - INFO: Job run stopped.
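
In a log like the one above, the entries that matter are the ERROR lines, and the first of them (the missing or empty images/ folder) is the root cause that the later ones repeat. A minimal sketch for filtering them out of a saved log, assuming the "timestamp - logger - LEVEL: message" layout shown here; the `error_lines` helper is a hypothetical name for illustration, not part of any UiPath tooling:

```python
import re

# Assumed layout, taken from the log above:
# "<timestamp> - <logger> - <LEVEL>: <message>"
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<logger>\S+) - (?P<level>[A-Z]+): (?P<message>.*)$"
)


def error_lines(log_text):
    """Return (logger, message) pairs for every ERROR entry in log_text."""
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "ERROR":
            found.append((match.group("logger"), match.group("message")))
    return found


# Two lines copied from the log above: one INFO, one ERROR.
sample = """\
2023-07-29 06:06:33,898 - root:initialize_package:204 - INFO: Initializing invoices package options …
2023-07-29 06:06:33,899 - root:_valid_doctype_folder_structure:98 - ERROR: images/ does not exist / is empty for invoices dataset
"""

for logger, message in error_lines(sample):
    print(logger, "->", message)
```

Traceback lines do not match the pattern and are skipped, so only the logged ERROR entries are reported.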

Pipeline Photo

@H_khot ,

Since the data is uploaded directly, we need to keep two things in mind:

  1. Exporting the fine-tune folder into the export folder
  2. Selecting the right folder in the dataset

For the first part, I would recommend checking the post below:

The second paragraph of the Data Manager - Scheduled Exports section explains how the export handles the data that needs to be retrained.

Next, we need to select the export folder itself for the retraining pipeline, because after the export all the data, including the fine-tune data, will be present in the export folder.
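
As a rough pre-check before starting the pipeline, assuming a local copy of the folder you plan to select is available, something like the sketch below can confirm it has the layout the trainer complained about. The only requirement taken from the log is a non-empty images/ subfolder; the function name `check_dataset_folder` and the `required_subfolders` parameter are illustrative and not part of any UiPath tooling, and the real export may require more than this.

```python
import tempfile
from pathlib import Path


def check_dataset_folder(dataset_dir, required_subfolders=("images",)):
    """Return a list of problems found; an empty list means the check passed.

    The default requirement ("images") is an assumption based solely on the
    error message in the training log.
    """
    root = Path(dataset_dir)
    if not root.is_dir():
        return [f"{root} is not a folder"]
    problems = []
    for name in required_subfolders:
        sub = root / name
        if not sub.is_dir():
            problems.append(f"{name}/ does not exist")
        elif not any(p.is_file() for p in sub.rglob("*")):
            # The folder exists but holds no files at any depth.
            problems.append(f"{name}/ is empty")
    return problems


# Demo on a throwaway folder that has no images/ subfolder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "metadata").mkdir()
    print(check_dataset_folder(tmp))
```

Pointing the same function at the exported folder should return an empty list if the images/ subfolder is present and populated.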

Hope the above explanation is understandable.

The video below was very helpful for learning about the fine-tune folder and training the ML model.
