Sort highest IEnumerable RegexMatch

Hey everyone,

I only want the highest number (15000) out of all my regex matches. This is what the result currently looks like after the Matches activity:

MatchCollection(3) { [700 200 100 >15000
], [500 >15000 200 100
], [8.600 600
] }

I guess I need to parse the three matches into one line.
To remove the > symbol I will later use a Replace activity.
How do I go from here?

Greetings!

@uiStijn

So you need only the one value, 15000? Or the whole match containing 15000 (either the first or the second match)?

cheers

If the symbol “>” marks the highest number, why not use it in the regex pattern to get the highest number?
Please give us some sample input and output data.

Give this a try:

Assign Activity:
myMax | Double=

(From m in yourMatchCollectionVar.Cast(Of Match)
From n in System.Text.RegularExpressions.Regex.Matches(m.Value,"\b[\d\.\,]+\b").Cast(Of Match)
Select i = CDbl(n.Value)).Max()

Hey Anil!

Yes, only one number needs to remain: the highest. It does not matter whether that number appears once, twice or three times.

Hey Akshay,

With the regex pattern (?<=W57209.)[\d\W]+ I capture the digits after W57209.

PDF text:

  1. Q Legionella species kve/liter W57209 700 200 100 >15000
    analyse conform, bevestiging g.a. NEN-EN-ISO
    11731 mbv MALDI-Tof MS

  2. Q Legionella species kve/liter W57209 500 >15000 200 100
    analyse conform, bevestiging g.a. NEN-EN-ISO
    11731 mbv MALDI-Tof MS

  3. Q Legionella species kve/liter W57209 8.600 600
    analyse conform, bevestiging g.a. NEN-EN-ISO
    11731 mbv MALDI-Tof MS

The numbers vary from report to report and range from 1 to 15000, or appear as >15000.

The code above was updated and should now work for retrieving the max.

Hey Peter!
Thanks for helping. I renamed yourMatchCollectionVar to my regex output variable, but I am getting an error: “Cast is not a member of System.Text.RegularExpressions.Match”

Let's cross-check: are you using the Matches activity, and is the output variable named LegionellaValueMatches?

Yes, sir

OK, got it.

I corrected the typo above.

Hi @uiStijn

Can you try this?

String.Join(",",matchcol.Select(Function(x) String.Join(",",System.Text.RegularExpressions.Regex.Matches(x.Value,"\d+\.{0,1}\d+").Select(Function(y) y.Value)))).Split(",").Max().ToString

This would directly give the max value as a String (if you want a Double, remove the .ToString at the end).

matchcol is the matches collection you already have.

cheers

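Hey Peter, we've got an error!

As mentioned, myMax should be of type Double, but it was set to String. Either change the variable type to Double,

OR

keep the String variable and convert the result:

Select i = CDbl(n.Value)).Max().ToString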

Hello @uiStijn
I would recommend sticking with your pattern (?<=W57209.)[\d\W]+, as the other patterns also pick up the ISO numbers (11731 etc.), which might return unexpected results.


You can do the following:
log message =
(from val in String.Join(" ",(from m in Regex.Matches(str_PDF_Data,"(?<=W57209.)[\d\W]+").Cast(of Match) select x = Regex.Replace(m.Value.Replace(">15000","15001"),"[^0-9. ]","")).ToList).Split({" "c},StringSplitOptions.RemoveEmptyEntries).ToList select x = CDbl(val)).Max

You can replace 15001 with anything that you want.
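
The one-liner above can be unpacked into separate steps, which may make it easier to debug. The following is only a sketch under a few assumptions: pdfText is a shortened stand-in for str_PDF_Data, the module and variable names are made up for illustration, and the numbers are parsed with the Dutch (nl-NL) culture on the guess that 8.600 means 8600 in this report. Inside UiPath you would put the same expressions into Assign activities rather than a console module.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports System.Linq
Imports System.Text.RegularExpressions

Module LegionellaMax
    Sub Main()
        ' Shortened sample of the PDF text from this thread (hypothetical stand-in for str_PDF_Data).
        Dim pdfText As String =
            "1. Q Legionella species kve/liter W57209 700 200 100 >15000" & vbLf &
            "analyse conform, bevestiging g.a. NEN-EN-ISO" & vbLf &
            "11731 mbv MALDI-Tof MS" & vbLf &
            "3. Q Legionella species kve/liter W57209 8.600 600" & vbLf &
            "analyse conform, bevestiging g.a. NEN-EN-ISO"

        ' Assumption: "." is a thousands separator, as in Dutch number formatting.
        Dim dutch As CultureInfo = CultureInfo.GetCultureInfo("nl-NL")
        Dim values As New List(Of Double)

        ' Step 1: the lookbehind keeps only the run of digits and non-word characters after W57209.
        For Each block As Match In Regex.Matches(pdfText, "(?<=W57209.)[\d\W]+")
            ' Step 2: swap the ">15000" marker for a sentinel number (15001, as suggested above).
            Dim cleaned As String = block.Value.Replace(">15000", "15001")

            ' Step 3: pull out each number in the block and parse it.
            For Each number As Match In Regex.Matches(cleaned, "\d+(\.\d+)*")
                values.Add(Double.Parse(number.Value, dutch))
            Next
        Next

        ' Step 4: take the numeric maximum.
        Console.WriteLine(values.Max())
    End Sub
End Module
```

With this sample text the result is 15001, the sentinel standing in for >15000. The ISO number 11731 is never considered, because the lookbehind only admits text that directly follows W57209.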

@uiStijn

did you happen to try this?

cheers

Hey Anil,

Thank you.
Yes, I did. It gave 8600 as output; the correct result should be 15000.

@uiStijn

Sorry, a small correction:


System.Text.RegularExpressions.Regex.Matches(String.join(",",matchcol.Select(function(x) x.Value)),"\d+\.{0,1}\d+").Select(function(y) Cdbl(y.Value)).ToArray.Max()

This will give the max value (the output is of type Double; use .ToString to convert it to a String).

cheers
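
A short sketch for later readers of why the correction matters. Calling Max() on strings compares them as text, while converting to Double first compares them as numbers. The module name and the nl-NL culture below are assumptions for illustration. CDbl itself parses with whatever the current culture of the robot machine is, so 8.600 may be read as 8600 or as 8.6 depending on the machine.

```vb
Imports System
Imports System.Globalization
Imports System.Linq
Imports System.Text.RegularExpressions

Module MaxDemo
    Sub Main()
        ' The three match values from this thread, joined with commas.
        Dim joined As String = "700 200 100 >15000,500 >15000 200 100,8.600 600"

        ' Same token pattern as in the correction above.
        Dim matches = Regex.Matches(joined, "\d+\.{0,1}\d+").Cast(Of Match)()
        Dim tokens As String() = matches.Select(Function(m) m.Value).ToArray()

        ' Max() on strings compares text: "8.600" wins because "8" sorts after "1", "5", "6" and "7".
        Console.WriteLine(tokens.Max())

        ' Assumption: the report uses Dutch number formatting, where "." groups thousands.
        Dim dutch As CultureInfo = CultureInfo.GetCultureInfo("nl-NL")
        Dim numbers = tokens.Select(Function(t) Double.Parse(t, dutch))

        ' Max() on Doubles compares values, so 15000 wins.
        Console.WriteLine(numbers.Max())
    End Sub
End Module
```

The first line printed is 8.600, the text maximum; the second is 15000, the numeric maximum.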
