RegEx syntax

Hello. I am having a problem extracting specific data from a PDF document.

The text in question is : Dataregnskabxxx-105(TFV)(KUNSLS)Seg2111202009/07/2020Sidenr.1

I want to take the 3 digits after Seg2. There is another place in the document where Seg2 appears, so I need the RegEx to match only the occurrence that comes RIGHT after (KUNSLS).

I have tried this expression, but it is not working: (KUNSLS)Seg2(.){3}

This expression works: (?<=Seg2)(.){3}, but as mentioned earlier, it also matches another place in the PDF, so I need to specify the precise location.

Am I using a wrong expression? Thanks in advance

Kind regards

Can you provide a full sample, the output and pattern of the text…

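Maybe try this? The parentheses around KUNSLS are escaped so that they match literally:

(?<=\(KUNSLS\)Seg\d)\d{3}

Or this, which starts matching directly after Seg (for the sample text it returns 211 rather than 111):

(?<=\(KUNSLS\)Seg)\d{3}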
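To see how the two lookbehind variants differ, here is a minimal sketch in Python's re module (an assumption for illustration; the thread itself targets .NET, where these constructs should behave the same way). It uses the sample text from the question with the parentheses escaped; the variable names are made up for the example.

```python
import re

# Sample text copied from the question.
text = "Dataregnskabxxx-105(TFV)(KUNSLS)Seg2111202009/07/2020Sidenr.1"

# Lookbehind consumes "Seg" plus one digit, so the match starts after "Seg2".
after_seg_digit = re.search(r"(?<=\(KUNSLS\)Seg\d)\d{3}", text)

# Lookbehind stops at "Seg", so the "2" becomes part of the match.
after_seg = re.search(r"(?<=\(KUNSLS\)Seg)\d{3}", text)

print(after_seg_digit.group())  # 111
print(after_seg.group())        # 211
```

If the goal is the three digits after Seg2, the first variant is the one that fits.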

Hello @YEB,

You can also try this pattern, where the dot stands in for the closing parenthesis: (?<=KUNSLS.Seg2).{3}
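A small sketch of what the wildcard version accepts, written in Python's re module as an assumption for illustration (the thread targets .NET). The second input string is hypothetical and only there to show that the dot matches any character, not just a closing parenthesis.

```python
import re

# The dot in the lookbehind matches any single character, including ")".
pattern = re.compile(r"(?<=KUNSLS.Seg2).{3}")

# Sample text copied from the question.
text = "Dataregnskabxxx-105(TFV)(KUNSLS)Seg2111202009/07/2020Sidenr.1"
match = pattern.search(text)
print(match.group())  # 111

# Hypothetical input: no parenthesis at all, yet the pattern still matches.
loose = pattern.search("KUNSLS-Seg2abc")
print(loose.group())  # abc

# A stricter variant that insists on ")" and on digits rejects that input.
strict = re.search(r"(?<=KUNSLS\)Seg2)\d{3}", "KUNSLS-Seg2abc")
print(strict)  # None
```

The wildcard is fine when the surrounding text is predictable; the stricter form is safer if other text in the PDF could look similar.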

Cheers
@YEB

Hi,

Parentheses are special characters in regex: unescaped, they create a capture group instead of matching a literal "(" or ")". So you need to escape them, as follows.

System.Text.RegularExpressions.Regex.Match(text,"(?<=\(KUNSLS\)Seg2)\d{3}").Value
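The effect of escaping can be checked with a short sketch. This one uses Python's re module purely for illustration (an assumption; the expression above is .NET, and these constructs should behave the same there). The leading "Seg2999 " is a hypothetical stand-in for the other place in the PDF where Seg2 appears.

```python
import re

# Sample text from the question, prefixed with a hypothetical earlier "Seg2".
text = "Seg2999 Dataregnskabxxx-105(TFV)(KUNSLS)Seg2111202009/07/2020Sidenr.1"

# Plain lookbehind on Seg2: picks up the first occurrence, which is the wrong one.
first_seg2 = re.search(r"(?<=Seg2)(.){3}", text)
print(first_seg2.group())  # 999

# Unescaped parentheses form a group, so this looks for "KUNSLSSeg2...".
# In the text "KUNSLS" is followed by ")", so nothing matches.
unescaped = re.search(r"(KUNSLS)Seg2(.){3}", text)
print(unescaped)  # None

# Escaped parentheses match literally, anchoring the match after "(KUNSLS)".
escaped = re.search(r"(?<=\(KUNSLS\)Seg2)\d{3}", text)
print(escaped.group())  # 111
```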

Regards,

Thanks everyone, I will try the suggestions out and see if they work :+1: