English text classification dataset

Varun_Dharni · February 25, 2021, 6:33am

Can anyone please share a dataset that I can use with the English Text Classification model from the out-of-the-box packages in AI Fabric / AI Center?

@ppr @Jeremy_Tederry @nisargkadam23 @NIVED_NAMBIAR @loginerror: Guys, can you please help me with this?

Thanks,
Varun.

nisargkadam23 · February 25, 2021, 7:03am

@Varun_Dharni What use case do you have in mind? It depends on that.

Varun_Dharni · February 25, 2021, 7:05am

@nisargkadam23 just a simple one as of now, where we have two columns, input and target, in CSV format and then try to retrain the model. Can you share a dataset suitable for that?

nisargkadam23 · February 25, 2021, 7:06am

@Varun_Dharni Please find attached data for restaurant feedback. RestaurantFeedback.xlsx (48.7 KB)

Enjoy!

Make sure to split the data into train and test sets with an 80%/20% ratio.
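
For anyone who wants to script the 80/20 split rather than do it by hand, here is a minimal sketch using only the Python standard library. The function names (split_rows, split_csv), the fixed random seed, and the file paths are illustrative assumptions and are not part of AI Center. It assumes the feedback sheet has already been exported to a single CSV file with a header row.

```python
import csv
import random


def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and return (train, test) lists."""
    shuffled = list(rows)
    # A seeded Random instance keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def write_csv(path, header, rows):
    """Write the header followed by the given rows to a UTF-8 CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def split_csv(source_path, train_path, test_path, train_fraction=0.8, seed=42):
    """Split one CSV into train and test files, repeating the header in both."""
    with open(source_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # first row is assumed to be the header
        rows = list(reader)
    train, test = split_rows(rows, train_fraction, seed)
    write_csv(train_path, header, train)
    write_csv(test_path, header, test)
    return len(train), len(test)
```

A call such as split_csv("feedback.csv", "train.csv", "test.csv") (hypothetical file names) would then produce the two files to upload to the training and evaluation datasets.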

Varun_Dharni · February 25, 2021, 7:17am

@nisargkadam23 sure mate

thanks,
varun.

nisargkadam23 · February 25, 2021, 7:35am

@Varun_Dharni Please mark the comment as the solution; it helps.

Varun_Dharni · February 25, 2021, 7:41am

done

Varun_Dharni · February 25, 2021, 8:05am

@nisargkadam23 @Jeremy_Tederry

It has been more than 2 hours and my package still shows as Undeployed.

Can you please help me with this as well?

nisargkadam23 · February 25, 2021, 9:48am

@Varun_Dharni Don't worry about the ML Package's status. Proceed with the Train and Evaluation pipelines; once you deploy the ML Skill, the status will change automatically.

Varun_Dharni · February 25, 2021, 11:09am

When I try to build the pipeline, it fails. This has happened twice now. Any idea?

Varun_Dharni · February 25, 2021, 11:15am

@nisargkadam23 until the pipelines are working, I won't be able to implement the ML Package. Is my understanding correct?

nisargkadam23 · February 25, 2021, 11:17am

@Varun_Dharni Yes, you are right about that.

Varun_Dharni · February 25, 2021, 11:20am

@nisargkadam23 what is the solution when the train pipeline status shows as Failed?

It is highlighted in the image above.

Jeremy_Tederry · February 25, 2021, 11:32am

Do you see any logs explaining why this is failing? You can check two places: ML Logs and the pipeline details (three-dots button, then Details).

Varun_Dharni · February 25, 2021, 11:49am

pipeline_log.txt (10.4 KB)

Yes, attached is the log file for your reference @Jeremy_Tederry

Thanks,

Varun_Dharni · February 25, 2021, 12:03pm

@Jeremy_Tederry

To be more specific, this is the error from the log that I am trying to make sense of:

2021-02-25 11:01:14,079 - aiflib.data_manager:info:15 - INFO: Loading data from /data/dataset…
2021-02-25 11:01:14,133 - aiflib.data_manager:info:15 - INFO: File [/data/dataset/train.csv] does not have name [input] in header'['Input', 'Target']', skipping this file. The csv file must contain a header with at least two columns. The column names are set by the <input_column> and <target_column> variables of this run. The default values are "input" and "target". If the file contains other columns, they will be ignored
2021-02-25 11:01:14,135 - aiflib.data_manager:info:15 - INFO: Unable to read any valid data from *.csv files in [/data/dataset]
2021-02-25 11:01:14,135 - aiflib.data_manager:info:15 - INFO: Unable to read any valid data from *.json files in [/data/dataset]
2021-02-25 11:01:14,135 - uipath_core.training_plugin:model_run:140 - ERROR: Training failed for pipeline type: TRAIN_ONLY, error: No valid data to run this pipeline.
2021-02-25 11:01:14,165 - uipath_core.trainer_run:main:81 - ERROR: Training Job failed, error: No valid data to run this pipeline.

Attached is the file, which I split into train and test CSV files and then added to the respective datasets.

RestaurantFeedback.xls (130.5 KB)

Also, I just found one of your answers in this post: AI Fabric Evaluation Fail - #6 by Jeremy_Tederry

but I am not able to understand exactly what you meant by saying

"Could you try download the file again and upload it without opening it with Excel in meantime?"

@ppr @nisargkadam23 @Pablito @mgope @loginerror: Just thought you guys might also like to help me out with this.

Thanks,
Varun.

Sharing a screenshot of the Excel file containing the data; it was converted to CSV before being used.

Jeremy_Tederry · February 25, 2021, 2:28pm

I think you need to rename the columns to input and target, without capital letters.
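
The log quoted earlier says the file was skipped because the header was ['Input', 'Target'] while the run expected a column named input, so the match appears to be case-sensitive. As a rough sketch (not an official UiPath tool), the header could be checked and lower-cased with the Python standard library before uploading. The function names and file paths are hypothetical, and the handling of a leading byte-order mark or stray spaces is an extra precaution I am assuming might matter, not something the log confirms.

```python
import csv


def normalise_header(header, input_column="input", target_column="target"):
    """Return the header with the two expected columns renamed to their exact spelling."""
    expected = {input_column.lower(): input_column,
                target_column.lower(): target_column}
    cleaned = []
    for name in header:
        # Drop a leading byte-order mark and surrounding spaces (assumed precaution).
        stripped = name.lstrip("\ufeff").strip()
        # Names that match case-insensitively are replaced; others are kept.
        cleaned.append(expected.get(stripped.lower(), stripped))
    return cleaned


def fix_csv_header(source_path, fixed_path):
    """Copy a CSV file, rewriting only its header row."""
    # utf-8-sig also removes a byte-order mark at the start of the file.
    with open(source_path, newline="", encoding="utf-8-sig") as src:
        rows = list(csv.reader(src))
    if not rows:
        raise ValueError("CSV file is empty")
    rows[0] = normalise_header(rows[0])
    missing = {"input", "target"} - set(rows[0])
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    with open(fixed_path, "w", newline="", encoding="utf-8") as dst:
        csv.writer(dst).writerows(rows)
```

This only addresses the header mismatch shown in the log; other formatting differences in the file could still cause the pipeline to reject it.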

Varun_Dharni · February 25, 2021, 3:31pm

@Jeremy_Tederry no luck

Varun_Dharni · February 25, 2021, 3:38pm

@Jeremy_Tederry also, the ML Package status is still Undeployed…

Varun_Dharni · February 25, 2021, 5:54pm

@Jeremy_Tederry @nisargkadam23 the issue was resolved when I used two CSV files in a slightly different format from the one I was trying earlier.

Thanks for your help!

Appreciate that.
