Hi,
I am getting the error below when creating a training pipeline. I am using the out-of-the-box purchase orders model. Earlier I got an error saying the schema.json file was not found. I found out that this can happen if the project is zipped and then unzipped on Windows; that issue was resolved after I created another project from scratch. Then I got an error saying "latest" was not found, so I created a latest folder and that was fine. Now I am getting the error below.
2020-09-12 09:26:14,542 - main:main:69 - INFO: Starting training job…
2020-09-12 09:26:14,543 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: list_blobs : 0.00010085105895996094
2020-09-12 09:26:14,548 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: list_blobs : 4.76837158203125e-05
2020-09-12 09:26:14,756 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: upload : 0.11476397514343262
2020-09-12 09:26:14,943 - wrapper.gcp_storage_client:download:86 - INFO: Dataset from bucket folder training-6be660f5-d47e-4657-93ba-089e51dd374d/0aed828e-48c5-467b-a1c8-9507efa4b146/27a1f771-d311-4218-9cb6-52ebf5efb554 with size 5 downloaded successfully
2020-09-12 09:26:14,943 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: download : 0.39600515365600586
2020-09-12 09:26:14,944 - wrapper.training_wrapper:train_model:94 - INFO: Start model training…
2020-09-12 09:26:14,944 - wrapper.training_wrapper:initialize_model:88 - INFO: Start model initialization…
2020-09-12 09:26:14,945 - wrapper.training_wrapper:initialize_model:91 - INFO: Model initialized successfully
2020-09-12 09:26:14,946 - root:preprocess_data:318 - INFO: Create Dataset
2020-09-12 09:26:14,946 - root:_train:115 - ERROR: Dataset Creation Failed
Traceback (most recent call last):
  File "/training/extraction/model_tag/train.py", line 113, in _train
    df_train, df_test = preprocess.preprocess_data(opt)
  File "/training/extraction/model_tag/preprocess.py", line 319, in preprocess_data
    errors, report = download_data.generate_tag_report(opt["dataset"]["path"])
  File "/training/extraction/webapp_tagger/download_data.py", line 365, in generate_tag_report
    for line in doc["words"]:
KeyError: 'words'
2020-09-12 09:26:14,947 - wrapper.training_wrapper:run:143 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Dataset Creation Failed
2020-09-12 09:26:14,951 - main:main:78 - ERROR: Training Job failed, error: Dataset Creation Failed
Traceback (most recent call last):
  File "/training/extraction/model_tag/train.py", line 113, in _train
    df_train, df_test = preprocess.preprocess_data(opt)
  File "/training/extraction/model_tag/preprocess.py", line 319, in preprocess_data
    errors, report = download_data.generate_tag_report(opt["dataset"]["path"])
  File "/training/extraction/webapp_tagger/download_data.py", line 365, in generate_tag_report
    for line in doc["words"]:
KeyError: 'words'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "trainer_run.py", line 73, in main
    wrapper.run()
  File "/training/wrapper/training_wrapper.py", line 144, in run
    raise e
  File "/training/wrapper/training_wrapper.py", line 136, in run
    self.run_train_only()
  File "/training/wrapper/training_wrapper.py", line 205, in run_train_only
    self.train_model(self.local_dataset_directory)
  File "/training/wrapper/training_wrapper.py", line 96, in train_model
    self.model.train(directory)
  File "/training/train.py", line 24, in train
    train_local._train(self.opt, self.df_train, self.df_test)
  File "/training/extraction/model_tag/train.py", line 117, in _train
    raise Exception("Dataset Creation Failed")
Exception: Dataset Creation Failed
2020-09-12 09:26:24,692 - main:main:69 - INFO: Starting training job…
2020-09-12 09:26:24,692 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: list_blobs : 8.630752563476562e-05
2020-09-12 09:26:24,694 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: list_blobs : 4.00543212890625e-05
2020-09-12 09:26:24,798 - wrapper.upload_log_service:upload_logs_file:52 - INFO: Retry Training Triggered:
2020-09-12 09:26:24,948 - wrapper.gcp_storage_client:download:86 - INFO: Dataset from bucket folder training-6be660f5-d47e-4657-93ba-089e51dd374d/0aed828e-48c5-467b-a1c8-9507efa4b146/27a1f771-d311-4218-9cb6-52ebf5efb554 with size 5 downloaded successfully
2020-09-12 09:26:24,948 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: download : 0.2542102336883545
2020-09-12 09:26:24,948 - wrapper.training_wrapper:train_model:94 - INFO: Start model training…
2020-09-12 09:26:24,949 - wrapper.training_wrapper:initialize_model:88 - INFO: Start model initialization…
2020-09-12 09:26:24,950 - wrapper.training_wrapper:initialize_model:91 - INFO: Model initialized successfully
2020-09-12 09:26:24,951 - root:preprocess_data:318 - INFO: Create Dataset
2020-09-12 09:26:24,951 - root:_train:115 - ERROR: Dataset Creation Failed
Traceback (most recent call last):
  File "/training/extraction/model_tag/train.py", line 113, in _train
    df_train, df_test = preprocess.preprocess_data(opt)
  File "/training/extraction/model_tag/preprocess.py", line 319, in preprocess_data
    errors, report = download_data.generate_tag_report(opt["dataset"]["path"])
  File "/training/extraction/webapp_tagger/download_data.py", line 365, in generate_tag_report
    for line in doc["words"]:
KeyError: 'words'
2020-09-12 09:26:24,952 - wrapper.training_wrapper:run:143 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Dataset Creation Failed
2020-09-12 09:26:24,953 - main:main:78 - ERROR: Training Job failed, error: Dataset Creation Failed
Traceback (most recent call last):
  File "/training/extraction/model_tag/train.py", line 113, in _train
    df_train, df_test = preprocess.preprocess_data(opt)
  File "/training/extraction/model_tag/preprocess.py", line 319, in preprocess_data
    errors, report = download_data.generate_tag_report(opt["dataset"]["path"])
  File "/training/extraction/webapp_tagger/download_data.py", line 365, in generate_tag_report
    for line in doc["words"]:
KeyError: 'words'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "trainer_run.py", line 73, in main
    wrapper.run()
  File "/training/wrapper/training_wrapper.py", line 144, in run
    raise e
  File "/training/wrapper/training_wrapper.py", line 136, in run
    self.run_train_only()
  File "/training/wrapper/training_wrapper.py", line 205, in run_train_only
    self.train_model(self.local_dataset_directory)
  File "/training/wrapper/training_wrapper.py", line 96, in train_model
    self.model.train(directory)
  File "/training/train.py", line 24, in train
    train_local._train(self.opt, self.df_train, self.df_test)
  File "/training/extraction/model_tag/train.py", line 117, in _train
    raise Exception("Dataset Creation Failed")
Exception: Dataset Creation Failed
2020-09-12 09:26:53,100 - main:main:69 - INFO: Starting training job…
2020-09-12 09:26:53,101 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: list_blobs : 9.679794311523438e-05
2020-09-12 09:26:53,104 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: list_blobs : 5.91278076171875e-05
2020-09-12 09:26:53,204 - wrapper.upload_log_service:upload_logs_file:52 - INFO: Retry Training Triggered:
2020-09-12 09:26:53,393 - wrapper.gcp_storage_client:download:86 - INFO: Dataset from bucket folder training-6be660f5-d47e-4657-93ba-089e51dd374d/0aed828e-48c5-467b-a1c8-9507efa4b146/27a1f771-d311-4218-9cb6-52ebf5efb554 with size 5 downloaded successfully
2020-09-12 09:26:53,394 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: download : 0.2896537780761719
2020-09-12 09:26:53,394 - wrapper.training_wrapper:train_model:94 - INFO: Start model training…
2020-09-12 09:26:53,394 - wrapper.training_wrapper:initialize_model:88 - INFO: Start model initialization…
2020-09-12 09:26:53,395 - wrapper.training_wrapper:initialize_model:91 - INFO: Model initialized successfully
2020-09-12 09:26:53,396 - root:preprocess_data:318 - INFO: Create Dataset
2020-09-12 09:26:53,397 - root:_train:115 - ERROR: Dataset Creation Failed
Traceback (most recent call last):
  File "/training/extraction/model_tag/train.py", line 113, in _train
    df_train, df_test = preprocess.preprocess_data(opt)
  File "/training/extraction/model_tag/preprocess.py", line 319, in preprocess_data
    errors, report = download_data.generate_tag_report(opt["dataset"]["path"])
  File "/training/extraction/webapp_tagger/download_data.py", line 365, in generate_tag_report
    for line in doc["words"]:
KeyError: 'words'
2020-09-12 09:26:53,398 - wrapper.training_wrapper:run:143 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Dataset Creation Failed
2020-09-12 09:26:53,398 - main:main:78 - ERROR: Training Job failed, error: Dataset Creation Failed
Traceback (most recent call last):
  File "/training/extraction/model_tag/train.py", line 113, in _train
    df_train, df_test = preprocess.preprocess_data(opt)
  File "/training/extraction/model_tag/preprocess.py", line 319, in preprocess_data
    errors, report = download_data.generate_tag_report(opt["dataset"]["path"])
  File "/training/extraction/webapp_tagger/download_data.py", line 365, in generate_tag_report
    for line in doc["words"]:
KeyError: 'words'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "trainer_run.py", line 73, in main
    wrapper.run()
  File "/training/wrapper/training_wrapper.py", line 144, in run
    raise e
  File "/training/wrapper/training_wrapper.py", line 136, in run
    self.run_train_only()
  File "/training/wrapper/training_wrapper.py", line 205, in run_train_only
    self.train_model(self.local_dataset_directory)
  File "/training/wrapper/training_wrapper.py", line 96, in train_model
    self.model.train(directory)
  File "/training/train.py", line 24, in train
    train_local._train(self.opt, self.df_train, self.df_test)
  File "/training/extraction/model_tag/train.py", line 117, in _train
    raise Exception("Dataset Creation Failed")
Exception: Dataset Creation Failed
2020-09-12 09:27:08,767 - main:main:69 - INFO: Starting training job…
2020-09-12 09:27:08,767 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: list_blobs : 9.489059448242188e-05
2020-09-12 09:27:08,770 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: list_blobs : 4.482269287109375e-05
2020-09-12 09:27:08,872 - wrapper.upload_log_service:upload_logs_file:52 - INFO: Retry Training Triggered:
2020-09-12 09:27:09,000 - wrapper.gcp_storage_client:download:86 - INFO: Dataset from bucket folder training-6be660f5-d47e-4657-93ba-089e51dd374d/0aed828e-48c5-467b-a1c8-9507efa4b146/27a1f771-d311-4218-9cb6-52ebf5efb554 with size 5 downloaded successfully
2020-09-12 09:27:09,000 - wrapper.utils:_retries:20 - INFO: Total time taken to execute func: download : 0.23051238059997559
2020-09-12 09:27:09,001 - wrapper.training_wrapper:train_model:94 - INFO: Start model training…
2020-09-12 09:27:09,001 - wrapper.training_wrapper:initialize_model:88 - INFO: Start model initialization…
2020-09-12 09:27:09,002 - wrapper.training_wrapper:initialize_model:91 - INFO: Model initialized successfully
2020-09-12 09:27:09,003 - root:preprocess_data:318 - INFO: Create Dataset
2020-09-12 09:27:09,003 - root:_train:115 - ERROR: Dataset Creation Failed
Traceback (most recent call last):
  File "/training/extraction/model_tag/train.py", line 113, in _train
    df_train, df_test = preprocess.preprocess_data(opt)
  File "/training/extraction/model_tag/preprocess.py", line 319, in preprocess_data
    errors, report = download_data.generate_tag_report(opt["dataset"]["path"])
  File "/training/extraction/webapp_tagger/download_data.py", line 365, in generate_tag_report
    for line in doc["words"]:
KeyError: 'words'
2020-09-12 09:27:09,004 - wrapper.training_wrapper:run:143 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: Dataset Creation Failed
2020-09-12 09:27:09,005 - main:main:78 - ERROR: Training Job failed, error: Dataset Creation Failed
Traceback (most recent call last):
  File "/training/extraction/model_tag/train.py", line 113, in _train
    df_train, df_test = preprocess.preprocess_data(opt)
  File "/training/extraction/model_tag/preprocess.py", line 319, in preprocess_data
    errors, report = download_data.generate_tag_report(opt["dataset"]["path"])
  File "/training/extraction/webapp_tagger/download_data.py", line 365, in generate_tag_report
    for line in doc["words"]:
KeyError: 'words'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "trainer_run.py", line 73, in main
    wrapper.run()
  File "/training/wrapper/training_wrapper.py", line 144, in run
    raise e
  File "/training/wrapper/training_wrapper.py", line 136, in run
    self.run_train_only()
  File "/training/wrapper/training_wrapper.py", line 205, in run_train_only
    self.train_model(self.local_dataset_directory)
  File "/training/wrapper/training_wrapper.py", line 96, in train_model
    self.model.train(directory)
  File "/training/train.py", line 24, in train
    train_local._train(self.opt, self.df_train, self.df_test)
  File "/training/extraction/model_tag/train.py", line 117, in _train
    raise Exception("Dataset Creation Failed")
Exception: Dataset Creation Failed
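Every attempt fails at the same place: generate_tag_report iterates over doc["words"], and that key is missing, so presumably at least one document in the dataset has no "words" entry. Below is a minimal sketch of how the dataset folder could be checked locally before uploading. It assumes the documents are stored as JSON files with a top-level object; that layout and the helper name find_docs_missing_key are my own assumptions for illustration, not something taken from the product.

```python
import json
from pathlib import Path


def find_docs_missing_key(dataset_dir, key="words"):
    """Return JSON files under dataset_dir whose top-level object lacks `key`.

    Hypothetical helper: the real dataset layout may differ.
    """
    flagged = []
    for path in sorted(Path(dataset_dir).rglob("*.json")):
        try:
            with open(path, encoding="utf-8") as fh:
                doc = json.load(fh)
        except (json.JSONDecodeError, UnicodeDecodeError):
            # Files that cannot be parsed as JSON are reported as well.
            flagged.append(path)
            continue
        # A document is flagged if it is not an object or has no such key.
        if not isinstance(doc, dict) or key not in doc:
            flagged.append(path)
    return flagged
```

Calling find_docs_missing_key on the local copy of the dataset folder returns the paths to inspect. Any JSON file that is not a document (for example schema.json, if it has no "words" key) would be listed too, so the result needs to be read with that in mind.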
The schema.json file was downloaded from the ‘configuring data manager’ page. This is fairly urgent, so a prompt response would be much appreciated.
Thanks and Regards,
Kolitha