Replace some chinese or japanese characters which is in description column

Hello All,

I downloading data from SAP and uploading data into Oracle database, though I was successful however when I am trying to read the data in Power Bi I am getting an error

ORA-29275: partial multibyte character

Upon checking I saw in one of the columns the description contains non English description looks like chinese or Japanese characters, how can I replace these with null values.

二极管
TYRES
集成电路
多元件集成电路

Though the description column CAN contain the following characters.

**a-z A-Z 0-9 !@#$%^&()-_=+[]{};:',.<>/?`~*

I tried the following in a for each row in a data table and then an Assign activity, however I was not successful.

System.text.RegularExpressions.Regex.Replace(CurrentRow(“Description”).ToString,“^[a-z A-Z 0-9 !@#$%^&*()-_=+{};:',.<>/?`~]”,“”)

Appreciate your feedback for the same.

Regards,
Manjesh

Hi,
You can try below query

System.Text.RegularExpressions.Regex.Replace(CurrentRow(“Description”).ToString, “[^a-zA-Z0-9!@#$%^&*()-_=+{};:',.<>/?`~\p{IsBasicLatin}\p{IsLatin-1Supplement}]”, “”)

1 Like

Hi,

Do you have any error in the above expression? Or still have ORA-29275?
At least, as there are some special regex characters such as “-”, “[” in the above pattern, it’s necessary to escape as the following.

System.text.RegularExpressions.Regex.Replace(CurrentRow(“Description”).ToString,"^[ a-zA-Z0-9!@#$%^&*()\-_=+\[\]{};:',.<>/?`~]","")

Regards,

1 Like

Hi,

I hope this may help you.
Main.xaml (6.8 KB)

Regards,
Mounika

1 Like

It looks like you’re trying to replace non-English characters in the “Description” column with null values. Your attempt with regular expressions seems to be on the right track, but there are a couple of adjustments needed. The regular expression you provided is designed to match only the characters specified in the square brackets, so it would remove any character that is not in that list.

Here’s an updated version of your regular expression to remove non-English characters:

System.Text.RegularExpressions.Regex.Replace(CurrentRow(“Description”).ToString, “[^a-zA-Z0-9 !@#$%^&*()-_=+{};:',.<>/?`~]”, “”)

Changes made:

  • Removed the space between “a-z” and “A-Z” to include all letters in one character class.
  • Added the caret (^) inside the square brackets to negate the character class, meaning it will match any character not in the specified list.

This regular expression should replace any character in the “Description” column that is not a letter, number, or the specified special characters with an empty string.

Please note that this will remove all non-English characters, so make sure it aligns with your requirements. If you want to replace non-English characters with null values specifically, you can modify the expression accordingly.


  1. a-z A-Z 0-9 !@#$%^&*()-_=+{};:',.<>/?`~ ↩︎

dear @Mariemily_Silva ,

Thank you very much, it worked as expected.

Regards,
Manjesh

@Yoichi

I was able to detect the actual root cause, there was another column where there was extra spaces after & before the text which was causing the issue.

Thank you for the support as always.

Regards,
Manjesh

1 Like

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.