How to extract a string between two strings?

I have a regex below that extracts the text between two strings, but I also want to exclude some text in between. For example, if the text “helloworld” appears between the two strings, I want to exclude it. Any ideas? Thanks.

System.Text.RegularExpressions.Regex.Match(strInput,"(?<=In the Last 120 Days)([\S\s]*)(?= History)").Value.Trim

Refer this link:

Hi @Jelrey,

After extracting the data, use the Replace function to remove those strings:

data.Replace("helloworld",string.Empty)
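
As a minimal sketch of how that could combine with the original expression: the module name, function name, and sample input below are made up for illustration, and only the pattern and the Replace call come from the posts above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ExtractAndClean
    ' Returns the text between the two markers with every "helloworld" removed.
    ' (Hypothetical helper name, not part of any library.)
    Function ExtractWithoutNoise(strInput As String) As String
        Dim data As String = Regex.Match(strInput, "(?<=In the Last 120 Days)([\S\s]*)(?= History)").Value
        ' Strings are immutable: Replace returns a new string, so keep the result.
        Return data.Replace("helloworld", String.Empty).Trim()
    End Function

    Sub Main()
        ' Hypothetical input, made up for illustration.
        Dim sample As String = "In the Last 120 Days alpha helloworld beta History"
        Console.WriteLine(ExtractWithoutNoise(sample))
    End Sub
End Module
```

Note that the spaces on either side of the removed text stay in the result, so a second clean-up step may be needed if exact spacing matters.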

How do we make the regex consider only the first text found? For example, there are multiple “History” texts, and I want to consider only the first “History”.

That is, the text between “In the Last 120 Days” and the first “History” found, ignoring any other “History” text.

@Jelrey try the code below:

' Removes only the first occurrence of "History" from data
rgxPattern = New Regex(Regex.Escape("History"))
data = rgxPattern.Replace(data, "", 1)

How would it be integrated with my code?

System.Text.RegularExpressions.Regex.Match(strInput,"(?<=In the Last 120 Days)([\S\s]*)(?= History)").Value.Trim

@jelrey - rather than making your regex more complicated, I would just do additional processing on the matches you’ve found. If I understand correctly, you want to get the first match that doesn’t contain the text “helloworld”.

To do that, use the Matches activity, or use an Assign activity to change your existing expression from Regex.Match() to Regex.Matches(). The Matches activity gives you a variable of type IEnumerable<Match> (Regex.Matches() itself returns a MatchCollection), which I’ll call MyMatches.

Check that you got at least one match with a quick If statement: If MyMatches.Count = 0 Then (insert code here to handle this error)

Use a For Each loop to iterate through the matches. Make sure to change the TypeArgument to System.Text.RegularExpressions.Match

For Each item As Match In MyMatches
    If item.Value.Contains("helloworld") Then
        Continue For
    Else
        TextYouWantExtracted = item.Value
        Exit For
    End If
Next

Now you have a String variable called TextYouWantExtracted that contains the first value matched by your regex that doesn’t contain the word “helloworld”.
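
Outside of a workflow, the steps above can be sketched in plain VB.NET roughly as follows. The module name, function name, the “start”/“end” markers, and the sample text are all hypothetical and chosen only so that the input produces more than one match.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FirstCleanMatch
    ' Returns the first match that does not contain the unwanted text,
    ' or Nothing when no match qualifies. (Hypothetical helper.)
    Function FirstMatchWithout(strInput As String, pattern As String, unwanted As String) As String
        Dim MyMatches As MatchCollection = Regex.Matches(strInput, pattern)
        For Each item As Match In MyMatches
            If Not item.Value.Contains(unwanted) Then
                Return item.Value.Trim()
            End If
        Next
        ' Covers both "no matches at all" and "every match contained the unwanted text".
        Return Nothing
    End Function

    Sub Main()
        ' Hypothetical input and pattern, for illustration only.
        Dim sample As String = "start helloworld end, start keep me end"
        Dim result As String = FirstMatchWithout(sample, "(?<=start)[\S\s]*?(?=end)", "helloworld")
        If result Is Nothing Then
            Console.WriteLine("No usable match found")
        Else
            Console.WriteLine(result)   ' keep me
        End If
    End Sub
End Module
```

Returning Nothing folds the “zero matches” check and the “all matches rejected” case into one place, so the caller only has one condition to handle.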

If the sample text is this:

"In the Last 120 Days Something happened in History, and other tzhings were cool in history too and In the Last 120 Days and history stuf… and this is History test 2312dsad ytrrytryrt History "

the output should be “Something happened in”,

since “Something happened in” lies between the start marker “In the Last 120 Days” and the end marker “History”, taking the first “History” found. Everything after that should be ignored.

@Jelrey I apologize, I didn’t realize it was the regex itself that you were having issues with. Thank you for providing the sample input and expected output. That helps a ton when trying to give useful advice.

Change your regex pattern to this instead: (?<=In the Last 120 Days)[\S\s]*?(?= History)

I removed the unnecessary parentheses and added the non-greedy operator ?. With it, [\S\s]*? stops at the first “ History” instead of running on to the last one. Use this new regex pattern combined with my answer above to extract the text you want.

EDIT: You will also want to trim your match.Value to remove the leading space that the pattern captures. Alternatively, you can alter the positive lookbehind to include that whitespace, so it becomes (?<=In the Last 120 Days\s+) instead.
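
To see the difference between the greedy and non-greedy forms side by side, here is a small sketch. The module, constant, and function names are made up for illustration, and the input is a shortened version of the sample text posted earlier in the thread.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GreedyVersusLazy
    ' Greedy: [\S\s]* runs on to the LAST " History".
    Public Const GreedyPattern As String = "(?<=In the Last 120 Days)[\S\s]*(?= History)"
    ' Non-greedy: [\S\s]*? stops at the FIRST " History".
    Public Const LazyPattern As String = "(?<=In the Last 120 Days)[\S\s]*?(?= History)"
    ' Non-greedy, with the whitespace absorbed by the lookbehind so no Trim is needed.
    Public Const LazyNoTrimPattern As String = "(?<=In the Last 120 Days\s+)[\S\s]*?(?= History)"

    ' Hypothetical helper: returns the raw value of the first match.
    Function Extract(strInput As String, pattern As String) As String
        Return Regex.Match(strInput, pattern).Value
    End Function

    Sub Main()
        ' Shortened version of the sample text from this thread.
        Dim strInput As String = "In the Last 120 Days Something happened in History, and this is History test History "

        Console.WriteLine(Extract(strInput, GreedyPattern).Trim())  ' Something happened in History, and this is History test
        Console.WriteLine(Extract(strInput, LazyPattern).Trim())    ' Something happened in
        Console.WriteLine(Extract(strInput, LazyNoTrimPattern))     ' Something happened in
    End Sub
End Module
```

The \s+ inside the lookbehind relies on .NET allowing variable-length lookbehinds; many other regex engines reject it, in which case Trim is the portable option.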

Can you please post the full regex?

I did post the full regex - it is: (?<=In the Last 120 Days\s+)[\S\s]*?(?= History)

Here is a sample .xaml showing you how it works: Jelrey.xaml (8.5 KB)

EDIT: Note that my Matches activity is using the IgnoreCase option (RegexOptions.IgnoreCase). Feel free to change that if you want your matches to be case-sensitive.
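
Here is a rough illustration of what that option changes. The module name, function name, and input string are invented for this example; only the pattern and the IgnoreCase option come from the posts above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CaseSensitivityDemo
    Public Const Pattern As String = "(?<=In the Last 120 Days\s+)[\S\s]*?(?= History)"

    ' Hypothetical helper: first match value under the given options.
    Function Extract(strInput As String, options As RegexOptions) As String
        Return Regex.Match(strInput, Pattern, options).Value
    End Function

    Sub Main()
        ' Hypothetical input: the lowercase "history" comes before the capitalised one.
        Dim strInput As String = "In the Last 120 Days old history and new History end"

        ' Case-sensitive (default): only " History" ends the match.
        Console.WriteLine(Extract(strInput, RegexOptions.None))        ' old history and new

        ' Case-insensitive: " history" also ends the match, so it is shorter.
        Console.WriteLine(Extract(strInput, RegexOptions.IgnoreCase))  ' old
    End Sub
End Module
```

With IgnoreCase the start marker is matched case-insensitively as well, so “in the last 120 days” would also count as a start.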

Thank you for the effort and the idea, sir. The help is much appreciated.
