ABCDE;12345;null;{myname:Sam;};somecontent
XYZ;69;null;{other value };someothercontent
The delimiter of the CSV is a semicolon, but sometimes there is an unwanted semicolon in the text between the curly braces.
The unwanted semicolon can come from different sources: a typo made during input, a piece of inline CSS, a fragment of HTML, and so on.
The good news is that the unwanted semicolons are only ever present in the content between the curly braces.
So the solution I’m searching for should be something like a Regex that executes “remove all semicolons that are between curly braces”.
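As a rough sketch of "remove all semicolons that are between curly braces": the Python below does not use a lookaround. It matches each brace group as a whole and strips the semicolons inside it with a replacement function. The function name and the sample rows are made up for illustration (the second row has a semicolon added inside the braces), and it assumes the braces are not nested.

```python
import re

# Matches one flat {...} group: an opening brace, any run of non-brace
# characters, and the closing brace. Nested braces are NOT handled here.
BRACE_GROUP = re.compile(r"\{[^{}]*\}")


def strip_semicolons_in_braces(line: str) -> str:
    """Remove every semicolon that sits inside a flat {...} group."""
    return BRACE_GROUP.sub(lambda m: m.group(0).replace(";", ""), line)


# Hypothetical sample rows, modelled on the data in the question.
rows = [
    "ABCDE;12345;null;{myname:Sam;};somecontent",
    "XYZ;69;null;{other; value };someothercontent",
]
cleaned = [strip_semicolons_in_braces(r) for r in rows]
for row in cleaned:
    print(row)
```

After this step each row should split into the expected five fields on the remaining semicolons.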
Oh, I thought quantifiers in lookahead and lookbehind worked only with JavaScript. Learned something new. Thanks!
This pattern will only work when a semicolon is inside curly braces and is immediately followed by the closing curly brace. That may be a typo on your part.
For example, if the text is
ABCDE;12345;null;{myname:Sam;};somecontent
XYZ; 69; null; { other; value }; someothercontent
the semicolon inside { other; value } won’t be identified.
This can be solved by moving the .* so that it sits before the closing curly brace, i.e. (?<={.*);(?=.*})
Taking this a bit further, for academic purposes:
However, that pattern is not correct either. It matches every semicolon that has an opening curly brace somewhere before it and a closing curly brace somewhere after it on the same line, even if the semicolon itself is not inside a pair of braces. See the screenshot below.
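To reproduce that over-matching without a regex tester, here is a small Python sketch. Python's built-in re module rejects variable-width lookbehinds such as (?<={.*), so the condition is emulated with plain string slicing: a semicolon counts as a hit when a { appears anywhere before it and a } anywhere after it on the line. The function name and the two-group sample line are made up for illustration.

```python
def lookaround_hits(line: str) -> list:
    """Indexes of semicolons with a '{' anywhere before and a '}' anywhere after.

    This emulates the condition expressed by (?<={.*);(?=.*}) on a single line.
    """
    return [
        i
        for i, ch in enumerate(line)
        if ch == ";" and "{" in line[:i] and "}" in line[i + 1:]
    ]


# Hypothetical line with two brace groups; both semicolons are OUTSIDE braces.
line = "{a:1};b;{c:2}"
hits = lookaround_hits(line)
print(hits)  # both outside semicolons are reported as hits
```

With only one brace group per line, as in the sample rows, the condition happens to give the right answer. The false positives show up as soon as a line contains more than one group.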
Thanks, this is very useful.
After executing on my production data, I discovered that there are sometimes pairs of curly braces inside curly braces, like this:
ABCDE;12345;null;{myname:Sam;{myothername:Sammy}};somecontent
With a small edit in your regex, this is matched as well:
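Separately from the edited regex, which is not reproduced here, nested pairs can also be handled without any lookaround by tracking brace depth. The Python below is only a sketch under that approach: the function name is made up, and it assumes the braces in each line are balanced (an unclosed { would cause every later semicolon on that line to be dropped).

```python
def strip_semicolons_nested(line: str) -> str:
    """Drop semicolons that occur while at least one '{' is still open."""
    depth = 0  # number of currently open curly braces
    out = []
    for ch in line:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth = max(depth - 1, 0)  # tolerate a stray closing brace
        elif ch == ";" and depth > 0:
            continue  # semicolon inside braces: skip it
        out.append(ch)
    return "".join(out)


sample = "ABCDE;12345;null;{myname:Sam;{myothername:Sammy}};somecontent"
print(strip_semicolons_nested(sample))
```

Because it only counts depth, the same function covers both the flat rows and the nested one, and it leaves the delimiter semicolons outside the braces untouched.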