Remove special characters and emojis from string

Hi everyone,

I am doing a data scrape of social media posts. Because a lot of the posts are posted on phones, the text may include emojis.

System.Text.RegularExpressions.Regex.Replace(variable, "[^a-z A-Z 0-9]", "")
Unfortunately, the above regex leaves me with only letters, numbers, and spaces.
I want a way to keep those plus the basic characters that can be typed on a keyboard, such as hashtag signs.
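
To illustrate the problem, here is a minimal sketch of what that pattern does to a post. The module name, function name, and sample text are made up for the example and are not part of the actual workflow.

```vb
Imports System
Imports System.Text.RegularExpressions

Module OriginalPatternDemo
    ' Strips everything except ASCII letters, digits and spaces,
    ' which is what the pattern from the question does.
    Function KeepLettersDigitsSpaces(text As String) As String
        Return Regex.Replace(text, "[^a-z A-Z 0-9]", "")
    End Function

    Sub Main()
        ' Hypothetical sample post. The emoji is built from its code point
        ' so the source file stays plain ASCII.
        Dim post As String = "Great day #sunny " & Char.ConvertFromUtf32(&H1F600) & " 100%!"
        ' The emoji is removed, but so are the hash sign, the percent sign
        ' and the exclamation mark.
        Console.WriteLine(KeepLettersDigitsSpaces(post))
    End Sub
End Module
```

The emoji goes away as wanted, but the `#` is lost as well, which is the part I need to avoid.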

Hi @dvn,

Please provide sample input and expected output.

The above regex keeps only letters, numbers, and spaces.

Try the regex below:
[^\w .]
It removes every special character apart from the dot. If you want to keep more characters, add them after the dot.
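
A minimal sketch of this approach, assuming the intent is a negated class that keeps word characters, spaces and dots (written here as `[^\w .]`, with `#` added after the dot). The names `WordCharFilterDemo` and `StripSpecialChars` and the sample text are hypothetical. Note that in .NET `\w` also matches the underscore and non-English letters, so accented characters are kept while emojis and punctuation are removed.

```vb
Imports System
Imports System.Text.RegularExpressions

Module WordCharFilterDemo
    ' Removes every character that is not a word character (\w),
    ' a space, a dot or a hash sign. More characters to keep can be
    ' appended inside the brackets, after the dot.
    Function StripSpecialChars(text As String) As String
        Return Regex.Replace(text, "[^\w .#]", "")
    End Function

    Sub Main()
        ' Hypothetical sample: an accented letter (U+00E9), a hashtag,
        ' an emoji (U+1F355) and a slash.
        Dim post As String = "Caf" & ChrW(&HE9) & " visit #yum " &
                             Char.ConvertFromUtf32(&H1F355) & " 5/5."
        ' The accented letter, the hashtag and the dot are kept;
        ' the emoji and the slash are removed.
        Console.WriteLine(StripSpecialChars(post))
    End Sub
End Module
```

If only plain English letters should be kept, `\w` could be swapped for an explicit `A-Za-z0-9_` range inside the same brackets.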

Regards,
Arivu

@dvn - Check this out… The regex below covers all the characters that can be typed on the keyboard…

Regex: [\s0-9A-Za-z#$%=@!{},`~&()'<>?.:;_|^/+\t\r\n\[\]"-]

Note: In UiPath the pattern is written as a string, so the double quote inside it has to be doubled, like this:

"[\s0-9A-Za-z#$%=@!{},`~&()'<>?.:;_|^/+\t\r\n\[\]""-]"

On the third row there is a degree symbol, which is not selected by the pattern because it is not a keyboard character.
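
To actually strip the unwanted characters, the same class can be negated and passed to Regex.Replace. The following is only a sketch: the module and function names and the sample text are made up, and the square brackets are escaped as `\[\]` on the assumption that this is what the class is meant to contain.

```vb
Imports System
Imports System.Text.RegularExpressions

Module KeyboardFilterDemo
    ' Keyboard characters from this reply, as a negated class (leading ^).
    ' VB.NET string rule: the double quote inside the class is doubled ("").
    ' The posted class has no asterisk or backslash; add * and \\ inside
    ' the brackets if those should be kept as well.
    Private Const NotKeyboard As String = "[^\s0-9A-Za-z#$%=@!{},`~&()'<>?.:;_|^/+\t\r\n\[\]""-]"

    Function KeepKeyboardChars(text As String) As String
        Return Regex.Replace(text, NotKeyboard, "")
    End Function

    Sub Main()
        ' Hypothetical sample with a degree symbol (U+00B0) and an emoji (U+1F31E).
        Dim post As String = "25" & ChrW(&HB0) & "C today " &
                             Char.ConvertFromUtf32(&H1F31E) & " #hot ""wow"""
        ' The degree symbol and the emoji are removed; the hashtag,
        ' the quotes and the spaces stay.
        Console.WriteLine(KeepKeyboardChars(post))
    End Sub
End Module
```

In a UiPath Assign activity only the Regex.Replace expression with the quoted pattern would be needed; the module and Sub Main are there just to make the sketch runnable on its own.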
