Pipeline for Data Manager failed

DragosPadurariu · June 16, 2021, 11:43am

Hi!

I created a pipeline to train on the newly created Data Manager labels, and I received the following error:

Train only of DocumentUnderstandingPackage 10.0 launched - Run c9d678c9-3b7f-42c6-af43-c734a7b88c28
Train only of DocumentUnderstandingPackage 10.0 scheduled - Run c9d678c9-3b7f-42c6-af43-c734a7b88c28
Train only of DocumentUnderstandingPackage 10.0 started - Run c9d678c9-3b7f-42c6-af43-c734a7b88c28
Train only of DocumentUnderstandingPackage 10.0 failed - Run c9d678c9-3b7f-42c6-af43-c734a7b88c28

Error Details : Pipeline failed due to ML Package Issue



2021-06-16 11:33:56,826 - uipath_core.trainer_run:main:66 - INFO: Starting training job...
2021-06-16 11:34:01,260 - uipath_core.storage.azure_storage_client:download:95 - INFO: Dataset from bucket folder training-7529d73a-c168-49b8-b630-cd58f97bb25a/a9d84d39-3bb8-49b7-ac04-969186501871/a90fb08e-6e6f-4a8f-9b44-5db98857b25b with size 43 downloaded successfully
2021-06-16 11:34:01,261 - uipath_core.training_plugin:train_model:109 - INFO: Start model training...
2021-06-16 11:34:01,261 - uipath_core.training_plugin:initialize_model:103 - INFO: Start model initialization...
2021-06-16 11:34:01,262 - root:_valid_doctype_folder_structure:63 - ERROR: images/ directory does not exist / is empty for {'name': 'default', 'folder': '', 'dataset': {'account_name': None, 'folder': '', 'path': '/microservice/dataset', 'dataloader_workers': 0, 'vocabulary_padding_id': 0, 'vocabulary_unknown_id': 1, 'text_pp_remove_symbols': False, 'text_pp_lemmatization': False, 'text_pp_remove_stop_words': False, 'word_embedding': 'unknown_id', 'max_words': 10000, 'max_image_size': [300, 300], 'date_format_classifier_data': ['receipts', 'invoices', 'invoices_au', 'invoices_india', 'utility_bills', 'purchase_orders', 'invoices_japan', 'unknown'], 'replace_patterns': ['date', 'number', 'checkbox'], 'doctype2id': {}, 'clftask2id': {}, 'id2clftask': {}, 'clf_tasks_by_doctype': defaultdict(<class 'list'>, {})}, 'path': '/microservice/dataset/', 'split': '/microservice/dataset/split.csv', 'schema': '/microservice/dataset/schema.json'} dataset
2021-06-16 11:34:01,263 - uipath_core.training_plugin:model_run:145 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Document type default not valid, check that document type data is in dataset folder and follows folder structure.
2021-06-16 11:34:01,267 - uipath_core.trainer_run:main:81 - ERROR: Training Job failed, error: Document type default not valid, check that document type data is in dataset folder and follows folder structure.
Traceback (most recent call last):
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/trainer_run.py", line 76, in main
wrapper.run()
File "/microservice/training_wrapper.py", line 57, in run
return self.training_plugin.model_run()
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 146, in model_run
raise e
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 138, in model_run
self.run_train_only()
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 207, in run_train_only
self.train_model(self.local_dataset_directory)
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 111, in train_model
self.model.train(directory)
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 99, in model
self.initialize_model()
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 105, in initialize_model
self._model = train.Main()
File "/microservice/train.py", line 24, in __init__
self.opt = self.get_options()
File "/microservice/train.py", line 105, in get_options
opt = preprocess.configure_options(opt)
File "<frozen extraction.model_tag.preprocess>", line 99, in configure_options
Exception: Document type default not valid, check that document type data is in dataset folder and follows folder structure.
2021-06-16 11:34:10,093 - uipath_core.trainer_run:main:66 - INFO: Starting training job...
2021-06-16 11:34:13,486 - uipath_core.logs.upload_log_service:upload_logs_file:56 - INFO: Retry Training Triggered:
2021-06-16 11:34:14,395 - uipath_core.storage.azure_storage_client:download:95 - INFO: Dataset from bucket folder training-7529d73a-c168-49b8-b630-cd58f97bb25a/a9d84d39-3bb8-49b7-ac04-969186501871/a90fb08e-6e6f-4a8f-9b44-5db98857b25b with size 43 downloaded successfully
2021-06-16 11:34:14,395 - uipath_core.training_plugin:train_model:109 - INFO: Start model training...
2021-06-16 11:34:14,395 - uipath_core.training_plugin:initialize_model:103 - INFO: Start model initialization...
2021-06-16 11:34:14,397 - root:_valid_doctype_folder_structure:63 - ERROR: images/ directory does not exist / is empty for {'name': 'default', 'folder': '', 'dataset': {'account_name': None, 'folder': '', 'path': '/microservice/dataset', 'dataloader_workers': 0, 'vocabulary_padding_id': 0, 'vocabulary_unknown_id': 1, 'text_pp_remove_symbols': False, 'text_pp_lemmatization': False, 'text_pp_remove_stop_words': False, 'word_embedding': 'unknown_id', 'max_words': 10000, 'max_image_size': [300, 300], 'date_format_classifier_data': ['receipts', 'invoices', 'invoices_au', 'invoices_india', 'utility_bills', 'purchase_orders', 'invoices_japan', 'unknown'], 'replace_patterns': ['date', 'number', 'checkbox'], 'doctype2id': {}, 'clftask2id': {}, 'id2clftask': {}, 'clf_tasks_by_doctype': defaultdict(<class 'list'>, {})}, 'path': '/microservice/dataset/', 'split': '/microservice/dataset/split.csv', 'schema': '/microservice/dataset/schema.json'} dataset
2021-06-16 11:34:14,397 - uipath_core.training_plugin:model_run:145 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Document type default not valid, check that document type data is in dataset folder and follows folder structure.
2021-06-16 11:34:14,402 - uipath_core.trainer_run:main:81 - ERROR: Training Job failed, error: Document type default not valid, check that document type data is in dataset folder and follows folder structure.
Traceback (most recent call last):
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/trainer_run.py", line 76, in main
wrapper.run()
File "/microservice/training_wrapper.py", line 57, in run
return self.training_plugin.model_run()
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 146, in model_run
raise e
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 138, in model_run
self.run_train_only()
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 207, in run_train_only
self.train_model(self.local_dataset_directory)
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 111, in train_model
self.model.train(directory)
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 99, in model
self.initialize_model()
File "/home/aifabric/.local/lib/python3.8/site-packages/uipath_core/training_plugin.py", line 105, in initialize_model
self._model = train.Main()
File "/microservice/train.py", line 24, in __init__
self.opt = self.get_options()
File "/microservice/train.py", line 105, in get_options
opt = preprocess.configure_options(opt)
File "<frozen extraction.model_tag.preprocess>", line 99, in configure_options
Exception: Document type default not valid, check that document type data is in dataset folder and follows folder structure.

I wanted to mention that the files from Data Manager were exported by clicking the Export button.

Many thanks!

karthi_pri · July 3, 2021, 8:14am

Select the subfolder directly when creating the pipeline, instead of selecting InvoiceDataSet.

raf667 · August 3, 2021, 10:33am

Hi Dragos,

I encountered the same issue. Did you find the reason for it?

Thanks,
Iulian

abhilash.bhanwal · October 21, 2021, 6:17pm

I'm facing the same issue. Has anybody found a resolution?

sven.boettcher · October 22, 2021, 10:23am

You need to add at least 20 files to the training set in Data Manager. Then it should work.

eric.marciano · November 5, 2021, 3:11pm

Hi,

I have the exact same error, and I have only 11 sample documents for training. Is there something I can do? Is there any workaround?

Thanks,
Eric

Gaurav_Malhotra · November 9, 2021, 10:25am

Do not select the complete dataset when creating the pipeline. The structure of the dataset should look like this:
Dataset > Export > packageName
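
To see which subfolder is the one worth selecting, it may help to download the dataset and scan it locally. The sketch below is only an illustration: it assumes a folder is usable when it contains a non-empty images/ directory together with split.csv and schema.json, which are the three paths named in the error log in the first post. The function name `find_trainable_folders` is hypothetical, and the actual validator inside the ML package may check more than this.

```python
from pathlib import Path


def find_trainable_folders(root):
    """Return folders under `root` that look like a valid training folder.

    Assumption: "valid" means a non-empty images/ directory plus
    split.csv and schema.json, as named in the error log. The real
    check performed by the ML package may be stricter.
    """
    root = Path(root)
    # Check the root itself first, then every nested directory.
    candidates = [root, *sorted(p for p in root.rglob("*") if p.is_dir())]
    matches = []
    for folder in candidates:
        images = folder / "images"
        has_images = images.is_dir() and any(images.iterdir())
        has_split = (folder / "split.csv").is_file()
        has_schema = (folder / "schema.json").is_file()
        if has_images and has_split and has_schema:
            matches.append(folder)
    return matches
```

If the top-level dataset folder is not in the returned list but a nested folder is, that would be consistent with the advice above to select the nested folder rather than the complete dataset.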

Singh.Sourav343 · December 24, 2021, 7:01am

Hi,
Bro, you actually need to upload a folder directly into the dataset, instead of the zip file from Data Manager. Then select that same subfolder when creating the package and the pipeline.
It will work. I was getting the same issue, and this resolved it.

Hope this clears up your doubt, bro.

Thanks
Sourav Singh
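
If all you have is a zip archive, one way to get a plain folder is to unpack it locally before uploading. This is a minimal sketch using Python's standard `zipfile` module. The function name `extract_export` and any paths passed to it are hypothetical, and the upload itself still happens in the AI Center UI, which is not shown here.

```python
import zipfile
from pathlib import Path


def extract_export(zip_path, dest_dir):
    """Unpack a zip archive into dest_dir and return its top-level entry names.

    Both arguments are placeholders chosen by the caller; nothing here
    is specific to Data Manager's export format.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Listing the top level makes it easy to confirm what was unpacked.
    return sorted(p.name for p in dest.iterdir())
```

The returned list shows at a glance whether the unpacked folder has an images/ directory at its top level, which is what the error in the first post complains about.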

Galina_Moore · December 30, 2021, 10:22pm

Hello, I’m also having this issue, even after following the suggestions here. I exported the data from Data Manager, and it created a new folder in the Data Set → Export path.

However, when I create a new pipeline, I can only select “InvoiceDataSet2” and not the specific folder I just exported.

What else am I missing?

Thanks.

system · March 18, 2022, 12:40pm

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
