Remove character in string that is between curly braces

I have a string from a .csv file like this:

ABCDE;12345;null;{myname:Sam;};somecontent
XYZ;69;null;{other   value };someothercontent

The delimiter of the CSV is a semicolon, but sometimes there is an unwanted semicolon in the text between the curly braces.
The unwanted semicolon can show up in different contexts: a typo in the input, a piece of inline CSS, a fragment of HTML, …

The good news is that the unwanted semicolons are only present in the content between the curly braces.
So the solution I’m searching for should be something like a regex that does “remove all semicolons that are between curly braces”.

Can anyone help me with setting up this pattern?

I think you can’t do this in one step.

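// Needs: using System.Text.RegularExpressions;
var myText = @"ABCDE;12345;null;{myname:Sam;};somecontent
XYZ; 69; null;
{ other; value }; someothercontent";

// One brace group containing at least one semicolon. [^{}]* (instead of .*)
// keeps a match from running across two brace groups on the same line.
var semiInCurly = @"\{[^{}]*;[^{}]*\}";

var semiCurlyMatches = Regex.Matches(myText, semiInCurly);

foreach (Match semVal in semiCurlyMatches)
{
    // Replace the matched group with a copy that has its semicolons removed.
    myText = myText.Replace(semVal.Value, semVal.Value.Replace(";", ""));
}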

Result

ABCDE;12345;null;{myname:Sam};somecontent
XYZ; 69; null;
{ other value }; someothercontent

Match Collection
|Index|Length|Value|
|---|---|---|
|17|13|{myname:Sam;}|
|60|16|{ other; value }|

@s.verdyck
Welcome to the forum.

Give regex a try and use a regex replace:
image

Also have a look here:

Oh, I thought quantifiers in lookahead and lookbehind worked only with JavaScript. Learned something new. Thanks!

This pattern only works when a semicolon inside curly braces is immediately followed by the closing curly brace. That may just be a typo in the pattern.

For example, if the text is

ABCDE;12345;null;{myname:Sam;};somecontent
XYZ; 69; null;
{ other; value }; someothercontent

the semicolon inside { other; value } won’t be identified.

This can be solved by moving the .* to just before the closing curly brace, i.e. (?<={.*);(?=.*})

Taking this a bit further for academic purposes.

However, that pattern is not correct either, because it matches every semicolon that has an opening curly brace somewhere before it and a closing curly brace somewhere after it on the same line, even if the semicolon is not inside a pair of braces. See the screenshot below.

image
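
To make the over-matching concrete, here is a small C# sketch. The class and method names (OverMatchDemo, RemoveWithGreedyPattern) are made up for illustration, and the braces are escaped only for readability; the pattern is otherwise the one quoted above.

```csharp
using System.Text.RegularExpressions;

public static class OverMatchDemo
{
    // Same idea as (?<={.*);(?=.*}): any "{" earlier on the line and any "}" later on the line.
    private static readonly Regex GreedyPattern =
        new Regex(@"(?<=\{.*);(?=.*\})");

    // Hypothetical helper: removes every semicolon the greedy pattern matches.
    public static string RemoveWithGreedyPattern(string line)
    {
        return GreedyPattern.Replace(line, "");
    }
}
```

With the input {a};b;{c}, both semicolons sit between the two brace groups, yet each one has a { before it and a } after it, so RemoveWithGreedyPattern returns {a}b{c} and the CSV delimiters are lost.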

Maybe something like this would do:

(?<={[^{}]*);(?=[^{}]*})
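
A minimal sketch of how this pattern might be applied with Regex.Replace in C#. The names BraceCleaner and RemoveSemicolonsInBraces are hypothetical, the braces are escaped for readability, and the sketch relies on .NET allowing quantifiers inside a lookbehind.

```csharp
using System.Text.RegularExpressions;

public static class BraceCleaner
{
    // A semicolon with "{" before it and "}" after it,
    // and no other brace between the semicolon and either of them.
    private static readonly Regex SemicolonInBraces =
        new Regex(@"(?<=\{[^{}]*);(?=[^{}]*\})");

    // Hypothetical helper: deletes the matched semicolons, leaves everything else alone.
    public static string RemoveSemicolonsInBraces(string text)
    {
        return SemicolonInBraces.Replace(text, "");
    }
}
```

For ABCDE;12345;null;{myname:Sam;};somecontent this returns ABCDE;12345;null;{myname:Sam};somecontent, and for {a;};x;{b;} it returns {a};x;{b}, so the delimiters outside the braces are kept.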

Thanks, this is very useful.
After executing on my production data, I discovered that there are sometimes pairs of curly braces inside curly braces, like this:
ABCDE;12345;null;{myname:Sam;{myothername:Sammy}};somecontent

With a small edit in your regex, this is matched as well:

(?<={[^{}]*);(?=[^{}]*[{}])
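
One caveat with nested braces: the lookbehind still requires that no brace sits between the opening { and the semicolon, so a semicolon that comes after an inner closing brace, as in {a{b};c}, is not matched. If that case can occur, a regex-free alternative is to scan the text once and track the brace depth. The following is only a sketch of that idea, not the approach used in this thread; the names NestedBraceCleaner and RemoveSemicolonsInsideBraces are hypothetical, and it assumes the braces in the data are balanced.

```csharp
using System.Text;

public static class NestedBraceCleaner
{
    // Drops every semicolon that appears while at least one "{" is still open.
    public static string RemoveSemicolonsInsideBraces(string text)
    {
        var result = new StringBuilder(text.Length);
        int depth = 0; // number of currently open braces

        foreach (char c in text)
        {
            if (c == '{')
            {
                depth++;
            }
            else if (c == '}' && depth > 0)
            {
                // A stray "}" with nothing open is ignored for depth purposes.
                depth--;
            }

            // Keep the character unless it is a semicolon inside braces.
            if (c != ';' || depth == 0)
            {
                result.Append(c);
            }
        }

        return result.ToString();
    }
}
```

For ABCDE;12345;null;{myname:Sam;{myothername:Sammy}};somecontent this returns ABCDE;12345;null;{myname:Sam{myothername:Sammy}};somecontent, and for {a{b};c;} it returns {a{b}c}.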
