Format to create training set and test set for ML package in AI Fabric

Hi

I was watching a video where a training set and a test set are built in an Excel sheet for classifying emails based on their text content.

Can anyone share the format in which this Excel data should be created for training and testing?

Also, suppose we want to create a training set of images organized in folders, for example several subfolders each containing images of one category (animals, houses, cars, fruits, etc.). Can that be done? If yes, please suggest how.

Thanks
Ankit

Hello Ankit,
The schema of the Excel file or any other input file (JSON, CSV, …) depends on how the data scientist built the ML Package. It varies from one package to another, so you need to ask the data scientist what kind of input the model expects.
For example, you could refer to this video, which shows the training and test datasets for the English text classification model.

@botBotGo, can you please let me know where I can download the training and testing datasets you talked about in the following video: UiPath AI Fabric - 3 | Deploy and Train ML Packages in AI Fabric | Add Datasets & Create Pipelines - YouTube

Thanks,
Varun.

@botBotGo, is there any update on this? I’m using a UiPath open-source ML model. I have to classify emails based on the tickets they raise, as in this video: UiPath AI Center: Automating Complaint Classification Process - YouTube

It’s in CSV format. Can you let me know the structure for creating a training and testing dataset?

I believe the CSV file should have 2 columns, i.e. Input and Target:
Input → Email body
Target → Expected result

example:
input : I have derogatory information on my three credit bureaus from XXXX, XXXX and Experian that are not correct I’ve tried numerous of times contacting and speaking to the credit bureaus about the negative information there reporting on my credit report and I have also complained to them that someone who I know and trusted which was a family member that used my personal information to obtain credit using my personal information also there are lots of hard inquiries being reported on my credit reports which I didn’t authorize could someone please help me out with these problems I’m having thank you.

Target : Credit Card
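
A minimal sketch of how such files could be built, assuming the two-column layout guessed above. The column names "input" and "target" are not confirmed for any particular UiPath ML package, and the sample rows, labels, and helper names (split_rows, write_csv, read_csv) are hypothetical, made up for illustration. It uses only the Python standard library: labeled rows are shuffled, split into train and test lists, and written as quoted CSV so that commas and quotes inside an email body do not break the columns.

```python
import csv
import random
from pathlib import Path

# Column names follow the guess in this thread ("input" and "target").
# They are an assumption, not a confirmed schema for any UiPath ML package.
FIELDNAMES = ["input", "target"]

# Hypothetical labeled examples: email body plus the expected class.
SAMPLE_ROWS = [
    {"input": "There are hard inquiries on my report that I didn't authorize.", "target": "Credit Card"},
    {"input": "My card was charged twice, once on Monday and once on Tuesday.", "target": "Credit Card"},
    {"input": "The bank rejected my mortgage payment and added a late fee.", "target": "Mortgage"},
    {"input": 'The collector said "pay today", but I never had this debt.', "target": "Debt Collection"},
    {"input": "My loan servicer reports the wrong balance every month.", "target": "Mortgage"},
]


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle labeled rows and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def write_csv(path, rows):
    """Write rows as a two-column CSV with a header line, quoting every field."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES, quoting=csv.QUOTE_ALL)
        writer.writeheader()
        writer.writerows(rows)


def read_csv(path):
    """Read a CSV written by write_csv back into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Usage would be something like calling split_rows(SAMPLE_ROWS) and then write_csv("dataset/train.csv", train) and write_csv("dataset/test.csv", test), where the file and folder names are again placeholders. Whether the package wants these exact column names, or separate folders for the training and evaluation files, should be checked against the documentation of the specific ML package.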

I’ve spent a couple of days searching articles and videos. All of them skipped the most important step: “How to build/create our own test data and train data” that can be used to train an out-of-the-box UiPath ML package before we can successfully deploy an ML Skill. I would really appreciate it if someone could help. Thanks.