Regex to extract text in new line after a specific phrase

Hi,
I have the following text in a .txt file:

Identified Outlier
AOJan.2002 0.346
AOJun.2002 0.58979
AAFeb.2002 0.58972

  1. I want to extract the first word (always 10 characters long) from every line below “Identified Outlier”.
  2. I won’t know the number of lines beforehand.

How can I do that?
Thank you


An easy way to get the value AOJan.2002 (the first word) is to split the string and take the first element.

First, split the text into lines:
YourString.Split(Environment.NewLine.ToCharArray, StringSplitOptions.RemoveEmptyEntries)

Next, loop over the lines and get the first word of each one with
Line1.Split(" "c)(0)
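For readers who want to try the split-then-take-first-word idea outside a workflow, here is a minimal sketch in Python (not the VB.NET used in this thread). The sample text and the variable names are assumptions made for illustration, and it assumes the header is the first non-empty line.

```python
# Sample text modelled on the question; the \r\n line endings are an assumption.
text = (
    "Identified Outlier\r\n"
    "AOJan.2002 0.346\r\n"
    "AOJun.2002 0.58979\r\n"
    "AAFeb.2002 0.58972\r\n"
)

# splitlines() handles both \n and \r\n; dropping blank lines plays the
# role of StringSplitOptions.RemoveEmptyEntries.
lines = [line for line in text.splitlines() if line.strip()]

# Skip the header line, then take the first space-separated token of each row.
first_words = [line.split(" ")[0] for line in lines[1:]]

print(first_words)
```

Because the result is a plain list, the number of data lines does not need to be known in advance; it is simply the length of the list.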


Is that format fixed, e.g. AOJan.2002?
@Anonymous2

If the format is fixed, you can try the pattern below (the dot is escaped so that it matches only a literal period):
[A-Za-z]{5}\.\d{4}
Click the link to test your string:
@Anonymous2 regex101: build, test, and debug regex
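As a rough illustration of the fixed-format approach, the Python sketch below runs the pattern with the dot escaped as `\.`, so it matches only a literal period. The sample text and variable names are assumptions; the pattern syntax used here behaves the same way in .NET regular expressions.

```python
import re

# Sample text modelled on the question.
sample = (
    "Identified Outlier\n"
    "AOJan.2002 0.346\n"
    "AOJun.2002 0.58979\n"
    "AAFeb.2002 0.58972\n"
)

# Five letters, a literal dot, four digits.
codes = re.findall(r"[A-Za-z]{5}\.\d{4}", sample)
print(codes)

# An unescaped dot matches any character, so it would also accept
# a malformed code such as "AOJanX2002"; the escaped version does not.
loose = re.fullmatch(r"[A-Za-z]{5}.\d{4}", "AOJanX2002") is not None
strict = re.fullmatch(r"[A-Za-z]{5}\.\d{4}", "AOJanX2002") is not None
print(loose, strict)
```

This finds every code in the text regardless of how many lines there are, but it does not restrict matches to the block under the header.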


Hi,
Thank you for the reply, but I think my question was misunderstood.

I cannot tell in advance how many lines there will be below “Identified Outlier”:

Identified Outlier

AOJan.2002 0.346
AOJun.2002 0.58979
AAFeb.2002 0.58972
.
.
.
“====”

The block always ends with a horizontal line. Is there some way to find out the number of lines in between?
I will then extract the text from each line using
System.Text.RegularExpressions.Regex.Match(Text1, "(?<=Identified Outliers.)\n\s.").Value
Thank you

You want to extract all the lines between Identified Outlier and ====, right?

Check this sample; you can copy the pattern:
(?<=Identified Outlier\s)(.|\n)*(?<=====)
Cheers

@Anonymous2
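One thing to watch with the pattern above: the trailing `(?<=====)` is a lookbehind, so the match ends after the `====` and includes it, and the greedy `(.|\n)*` runs to the last `====` in the text. A possible variation, sketched here in Python under the assumption that the closing rule is a line starting with `====`, uses a lazy quantifier and a lookahead instead. The sample text and names are made up for illustration.

```python
import re

# Sample text modelled on the question; the surrounding sections are assumptions.
text = (
    "Summary\n"
    "Identified Outlier\n"
    "\n"
    "AOJan.2002 0.346\n"
    "AOJun.2002 0.58979\n"
    "AAFeb.2002 0.58972\n"
    "====\n"
    "Other section\n"
)

# Lazy match from just after the header up to (not including) the first
# line that starts with ====.
pattern = re.compile(
    r"(?<=Identified Outlier\n)(.*?)(?=^====)",
    re.DOTALL | re.MULTILINE,
)
match = pattern.search(text)
block = match.group(1) if match else ""

# Drop blank lines, then count the rows and take the first word of each.
rows = [line for line in block.splitlines() if line.strip()]
outlier_count = len(rows)
codes = [row.split()[0] for row in rows]

print(outlier_count, codes)
```

Counting the non-empty rows of the captured block gives the number of lines directly, without knowing it beforehand.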


Yes. Then I want to count the lines. How do I do that?
Thank you


Use RegexoutputVar.Count.

Is it working, buddy?
@Anonymous2

@Anonymous2,
You can use the Split method to count the lines.

Method 1: Use testVar.Split(System.Environment.NewLine.ToCharArray).Count.ToString to get the count.

Method 2: Use the following custom activity.

Example:

Hi All

Sorry for the delayed reply, and many thanks for all the help. I used something similar to

(?<=Identified Outlier\s)(.|\n)*(?<=====)
to extract all the lines between “Identified Outlier” and “====”.

But I still can’t get the number of lines. I used

testVar.Split(System.Environment.NewLine.ToCharArray).Count

but I get 63 lines when there are only 4-5. What is wrong?

Thank you


Hi

It works when I do the split and count separately. Thanks

Hi,

I used the expression below to put each line between “Identified Outlier” and “====” into an array:

testVar.Split(System.Environment.NewLine.ToCharArray)

But I get blank entries between the lines of text. How do I get rid of the blank items in the array?

Thanks

Hi

Can someone help with the above, please?
When I split on newlines, oddly I get blank entries between each line of text.
Original Text:
AOJan.2002 0.346
AOJun.2002 0.58979
AAFeb.2002 0.58972

After splitting on newlines:
array(0) = AOJan.2002 0.346
array(1) = blank
array(2) = AOJun.2002 0.58979
array(3) = blank
array(4) = AAFeb.2002 0.58972

So how do I get rid of array(1) & (3)? Thank you

@Anonymous2, Try this:

testVar.Split(System.Environment.NewLine.ToCharArray, StringSplitOptions.RemoveEmptyEntries)

Cheers
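Why the blanks appear: on Windows, Environment.NewLine is the two-character string "\r\n", and Split with a character array splits on each character separately, so every line break yields an empty entry between the "\r" and the "\n". The Python sketch below reproduces that behaviour as an illustration; it assumes the file uses Windows line endings, and the variable names are made up.

```python
import re

# Three data lines joined with Windows-style \r\n (an assumption about the file).
text = "AOJan.2002 0.346\r\nAOJun.2002 0.58979\r\nAAFeb.2002 0.58972"

# Splitting on \r and \n as separate delimiters mimics
# Split(Environment.NewLine.ToCharArray): each \r\n pair leaves an empty string.
naive = re.split(r"[\r\n]", text)
print(naive)

# Dropping the empty strings mirrors StringSplitOptions.RemoveEmptyEntries.
cleaned = [part for part in naive if part != ""]
print(len(naive), len(cleaned))
```

The five-element result matches the array(0) to array(4) listing above, and filtering the empty entries leaves the three data lines.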