LightTextClassification ML Skill

Jon_Smith · August 25, 2021, 7:27am

Hi,

I am trying to use the UiPath supported ML packages, specifically the LightTextClassification package as per the documentation here:

I used an example CSV for sentiment analysis, with every entry labelled as positive or negative, and built a training pipeline from it.
The problem is that when I deploy it as a skill I don’t get the responses indicated in the documentation. Instead of a result like
{
"prediction": "Positive",
"confidence": 0.9422031841278076
}

I am getting

{
"class": "0",
"confidence": 0.84942038749394477
}

Every single entry returns a ‘class’ of "0", although the confidence does change.
If I take the EnglishTextClassification ML package and train it on the same file, everything works correctly and I get a JSON response with the prediction and the confidence.

Can someone tell me what I’m doing wrong or what is wrong in the documentation? Or is the Package incorrect?

Senne_Symons · August 31, 2021, 1:15pm

I can’t help you with a solution, but I can confirm I’m having the exact same issue with the LightTextClassification skill.

Jon_Smith · August 31, 2021, 1:32pm

I should have followed this up.

The issue I encountered came down to inconsistent behaviour between similar ML skills. EnglishTextClassification seems to be much better: it handles any CSV file (as well as other formats) you provide in your dataset and lets you set the input parameters for the ‘input’ and ‘target’ columns. When it does fail, it fails with good details.

LightTextClassification, on the other hand, needs a file in a very specific format. You have to use the exact column names AND the exact file name given in the instructions, “dataset.csv”.
If you put in a different file it still seems to ‘train’, but the deployed skill then returns the results I described above.
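For anyone starting from a CSV with other column names, here is a rough Python sketch of how the file could be reshaped before upload. The function name write_light_text_dataset and the example column names are made up for illustration; only the file name dataset.csv and the columns input and target come from this thread, and none of this is part of the UiPath package itself.

```python
import csv
from pathlib import Path


def write_light_text_dataset(src_path, text_col, label_col, out_dir):
    """Copy two columns of src_path into out_dir/dataset.csv as input,target.

    Hypothetical helper: the target file name and column names follow what
    is reported in this thread, not a verified specification.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "dataset.csv"  # exact file name the package expects

    # utf-8-sig strips a byte-order mark so the first column name matches.
    with open(src_path, newline="", encoding="utf-8-sig") as src, \
            open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.writer(dst)
        writer.writerow(["input", "target"])  # exact column names
        rows = 0
        for row in reader:
            writer.writerow([row[text_col], row[label_col]])
            rows += 1
    return out_path, rows
```

A call such as write_light_text_dataset("reviews.csv", "review", "sentiment", "upload") would leave upload/dataset.csv ready to add to the dataset folder; a wrong column name raises a KeyError rather than silently writing bad rows.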

Please try again with the input file as I describe and I think you’ll have the same success as I did.

Partly my fault for not reading the instructions in such fine detail, partly the model’s fault for having poor error handling, I think.

Senne_Symons · August 31, 2021, 1:49pm

Hi Jon,

Thanks for the follow-up!

I can’t use the EnglishTextClassification model as my input can be in multiple languages. I’d love to use the MultiLingualTextClassification, but I get errors when I try to deploy that one.

So what you’re saying is that my csv file to train should be called “dataset.csv” and not “train.csv”?
And it should have 2 columns, input & target?

Jon_Smith · August 31, 2021, 2:05pm

Same reason I was looking at the other skills: non-English data as well.

Yes, try changing your filename to that and make sure the columns are input and target.
It’s in the documentation page but, as I mentioned, the other skills don’t have these hard requirements, so it’s confusing when you try out different ones and get these inconsistent results.
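Since the training run reportedly does not complain about a wrongly named file, a small local check before uploading may save a pipeline run. This is only a sketch: find_dataset_problems is a hypothetical helper, and the rules it checks (file name dataset.csv, header containing input and target) are the ones described in this thread rather than anything confirmed beyond it.

```python
import csv
from pathlib import Path

# Requirements as described in this thread (assumed, not an official spec).
REQUIRED_NAME = "dataset.csv"
REQUIRED_COLUMNS = ("input", "target")


def find_dataset_problems(path):
    """Return a list of human-readable problems; an empty list means OK."""
    path = Path(path)
    problems = []

    if path.name != REQUIRED_NAME:
        problems.append(
            f"file is named {path.name!r}, expected {REQUIRED_NAME!r}"
        )

    # Read only the header row; utf-8-sig tolerates a byte-order mark.
    with open(path, newline="", encoding="utf-8-sig") as f:
        header = next(csv.reader(f), [])

    for col in REQUIRED_COLUMNS:
        if col not in header:
            problems.append(f"missing column {col!r}")

    return problems
```

Running find_dataset_problems("train.csv") on a file with other column names would list the wrong file name and both missing columns, which is the kind of feedback the pipeline itself does not seem to give.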

Senne_Symons · August 31, 2021, 2:15pm

Does the LightTextClassification model support multiple languages? Because the documentation is also quite vague on this…

“It supports all languages based on Latin characters, such as English, French, Spanish, and others.”

Do you happen to have any experience with the MultiLingual model?

Jon_Smith · August 31, 2021, 2:29pm

I’ve only done a basic proof of concept on it. I need a lot more data from my customer to stress test the model, so I’d be interested in hearing your results with non-English languages as well.

If I have enough licences I might try doing a comparison between the English one, LightText and the Text one (I think there are 3) to see if I get wildly different results with this language.

Perhaps if one of us can get our hands on a ‘sentiment analysis’ dataset in a non-English language we could see. I have one in English that maybe could be translated… it’s 1000 entries, however, so that would have to be done in batch rather than by hand, which relies on Google Translate being accurate.

Senne_Symons · August 31, 2021, 2:32pm

Could you share some information about how you trained & deployed the MultiLingual model?

I’m having issues once deployed, as shown in this forum post.

I’ve been waiting for UiPath Support to help but they are slow in responding.

Jeremy_Tederry · August 31, 2021, 4:30pm

Hi Jon
That’s weird, would you mind sharing your dataset?
Jeremy

Jon_Smith · September 1, 2021, 7:36am

@Senne_Symons Ah, I realize I wasn’t clear, Senne. I meant I only did a POC with the LightTextClassification skill. I have also tested other unrelated skills, but I haven’t tried the multi-language ones.

@Jeremy_Tederry
Are you referring to the original issue I posted with the LightTextClassification? As I explained, it behaves this way if you don’t name the file dataset.csv.
Here is a file I was testing with, a basic sentiment analysis of some text. If you name the file whatever.csv you should get the same results as I reported in my initial post (with no errors); if you name it dataset.csv then it works.
dataset2.zip (20.1 KB)
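On the calling side, the two response shapes quoted in the first post can be told apart before the result is used. The sketch below is an assumption-laden illustration: read_prediction is a hypothetical helper, and it treats a "class" key as a sign that the skill was trained on a wrongly named file, which is what this thread observed rather than documented behaviour.

```python
import json


def read_prediction(response_text):
    """Return (label, confidence) from the skill's JSON response.

    Raises ValueError for the {"class": ..., "confidence": ...} shape that,
    in this thread, appeared when the training file was not dataset.csv.
    """
    data = json.loads(response_text)

    if "prediction" in data:
        return data["prediction"], float(data["confidence"])

    if "class" in data:
        raise ValueError(
            "skill returned 'class' instead of 'prediction'; "
            "check that the training file was named dataset.csv "
            "with columns input and target"
        )

    raise ValueError(f"unexpected response keys: {sorted(data)}")
```

Failing loudly on the "class" shape keeps a mis-trained skill from quietly feeding "0" into the rest of a workflow.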
