How to split a string based on a regex but keep the "splitter"

As the title states, I need to split a string based on a regex, but the Split function removes the "splitter". I need to keep it.

Example:
String_Input: "0001 123456 text 0002 987654 text 0003 1236589 text"

Assign_Value: System.Text.RegularExpressions.Regex.Split(String_Input, "\n\d{4}\s\d{6}", System.Text.RegularExpressions.RegexOptions.Compiled).ToArray()

Current output:
Array_Output: "text", "text", "text"

Needed output:
Array_Output: "0001 123456 text", "0002 987654 text", "0003 1236589 text"
or
Array_Output: "123456 text", "987654 text", "1236589 text"

I feel like this has to be possible, but I don't know how. Can someone help me?

Hi @jreintjes

Try this

System.Text.RegularExpressions.Regex.Split(String_input, "(?<=text) ")

or

System.Text.RegularExpressions.Regex.Split(String_input, "\b\d{4}\b ").Skip(1)

or

System.Text.RegularExpressions.Regex.Split(String_input, "(?<=text) ").Select(Function (s) s.Split({" "}, 2,StringSplitOptions.None).Last).ToArray
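For anyone who wants to try the idea outside UiPath, here is a minimal Python sketch of the same lookbehind split (Python's `re` module accepts the same `(?<=text) ` syntax). The function and variable names are my own. The lookahead variant at the end is an assumption of mine, not something posted above: it splits on the whitespace in front of each "4 digits, whitespace, 6 or more digits" header, so it does not rely on every record ending in the literal word text.

```python
import re

string_input = "0001 123456 text 0002 987654 text 0003 1236589 text"

def split_after_text(value):
    # Split on a space only when it is preceded by "text".
    # The lookbehind is zero-width, so "text" stays in the record.
    return re.split(r"(?<=text) ", value)

def drop_counter(records):
    # Remove the leading 4-digit counter by splitting once on the first space.
    return [record.split(" ", 1)[1] for record in records]

def split_before_header(value):
    # Assumed variant: split on whitespace that is followed by a record header.
    # "{6,}" is an assumption, chosen because the third sample has 7 digits.
    return re.split(r"\s+(?=\d{4}\s\d{6,})", value)

records = split_after_text(string_input)
without_counter = drop_counter(records)
records_lookahead = split_before_header(string_input)
```

With the sample input, both split functions should give the three full records, and `drop_counter` should give the second "needed output" from the question.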

You can also do it with Matches and build the array as follows:
Regex.Matches(strText, strPattern).Cast(Of Match).Select(Function (m) m.Value).ToArray

Make sure that the pattern also includes the text part.
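As a rough illustration of the Matches idea, here is a Python sketch in which the pattern covers the header and the text part, so every match is one complete record. The pattern itself is my assumption about the data (4 digits, whitespace, 6 or more digits, then free text up to the next header or the end of the string). The names are hypothetical.

```python
import re

string_input = "0001 123456 text 0002 987654 text 0003 1236589 text"

# Header, one whitespace, then as little text as possible until
# the next header (lookahead) or the end of the string.
RECORD_PATTERN = r"\d{4}\s\d{6,}\s.*?(?=\s\d{4}\s\d{6,}|$)"

def match_records(value):
    # findall returns the full match text because the pattern has no capture groups.
    return re.findall(RECORD_PATTERN, value)

matched = match_records(string_input)
```

Because the text part is matched lazily up to a lookahead, it can be any text and does not have to be the literal word text. It must not contain line breaks in this sketch, since `.` does not match a newline by default.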


An alternative strategy is to use the full pattern in a Replace and replace each match with the match plus a dedicated marker. The split can then be done on that marker afterwards. This can help when unknown or varying parts have to be handled.
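A small Python sketch of the replace-then-split strategy, under the assumption that the marker string never occurs in the real data. The marker `|#|`, the header pattern, and the function name are all hypothetical choices of mine.

```python
import re

string_input = "0001 123456 text 0002 987654 text 0003 1236589 text"

HEADER_PATTERN = r"\d{4}\s\d{6,}"  # assumed header: 4 digits, whitespace, 6+ digits
MARKER = "|#|"                     # hypothetical marker, must not appear in the input

def split_on_marker(value):
    # Step 1: put the marker in front of every header; the header itself is kept.
    marked = re.sub(HEADER_PATTERN, lambda m: MARKER + m.group(0), value)
    # Step 2: split on the plain marker, then drop empty pieces and stray spaces.
    return [piece.strip() for piece in marked.split(MARKER) if piece.strip()]

marked_records = split_on_marker(string_input)
```

The second step is an ordinary string split, so the regex only has to describe the header and not whatever follows it.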


Just add the splitter back on with an Assign statement.


Yes, a match is basically a list, so this would work… unless the text is unstructured, so I don't see this working right now.

Yeah, I guess I'll just have to do this. It's not even a bad idea, since I can easily add it back by using a Matches activity with the same regex pattern.
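For reference, a minimal Python sketch of adding the splitter back: split with the pattern, collect the matches of the same pattern, and glue each match onto the piece that follows it. The pattern and names are assumptions of mine. In UiPath the two inputs would come from Regex.Split and the Matches activity.

```python
import re

string_input = "0001 123456 text 0002 987654 text 0003 1236589 text"

HEADER_PATTERN = r"\d{4}\s\d{6,}"  # assumed header: 4 digits, whitespace, 6+ digits

def split_keep_delimiter(value, pattern=HEADER_PATTERN):
    delimiters = re.findall(pattern, value)  # the "splitters", in order
    pieces = re.split(pattern, value)        # the text between the splitters
    # pieces[0] is whatever comes before the first splitter, so skip it
    # and pair each splitter with the piece that follows it.
    return [(d + p).strip() for d, p in zip(delimiters, pieces[1:])]

rejoined = split_keep_delimiter(string_input)
```

This works because a split always yields exactly one more piece than there are matches, so the matches and the pieces after the first one line up one to one.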
