Regex logic to extract the text from backwards

Hi Everyone,

I need a regex expression to extract the delivery code from the email body.

Sample Text : Cousins Incorporated (52154852EQPU). Could you review

Expected Output: 52154852EQPU

Sample Text : Swed AB (Public) (52155286HGPU). Could you review

Expected Output: 52155286HGPU

Earlier I was using a regex to extract the text between ( and ), but in the second scenario it fails.

Kindly suggest any regex that I can use to extract the text backwards.

Thanks in advance

@Binod_Kumar_Biswal

You can use a regular expression to extract the last occurrence of text between parentheses. Here’s a suitable regex pattern for your requirement:

\([^()]*\)(?!.*\([^()]*\))

This pattern matches the last set of parentheses in the string. Note that the match includes the parentheses themselves, so they need to be trimmed from the result.

Here’s a brief explanation of how it works:

\(: Matches the literal opening parenthesis.
[^()]*: Matches any character except for parentheses, zero or more times.
\): Matches the literal closing parenthesis.
(?!.*\([^()]*\)): Negative lookahead to ensure that there are no more sets of parentheses after the current match.
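To see the lookahead in action, here is a minimal sketch in Python's `re` module (the constructs used are the same in .NET). The function name `last_parenthesized` is just an illustrative helper, not something from the original posts.

```python
import re
from typing import Optional

# Pattern from the post above: a "(...)" group that is not followed
# by any other "(...)" group later in the string.
LAST_PARENS = re.compile(r"\([^()]*\)(?!.*\([^()]*\))")

def last_parenthesized(text: str) -> Optional[str]:
    """Return the content of the last (...) group, without the parentheses."""
    match = LAST_PARENS.search(text)
    if match is None:
        return None
    # The match includes "(" and ")", so slice them off.
    return match.group(0)[1:-1]

print(last_parenthesized("Cousins Incorporated (52154852EQPU). Could you review"))
print(last_parenthesized("Swed AB (Public) (52155286HGPU). Could you review"))
```

In the second sample, "(Public)" is rejected because another parenthesized group follows it, so only the delivery code is returned.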

Hi,

Can you try RegexOptions.RightToLeft as follows?

System.Text.RegularExpressions.Regex.Match(" Swed AB (Public) (52155286HGPU). Could you review","(?<=\()\w+(?=\))",System.Text.RegularExpressions.RegexOptions.RightToLeft).Value
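For anyone testing outside .NET: Python's `re` module has no RightToLeft option, so the sketch below only approximates the idea by collecting every match of the same pattern and keeping the rightmost one. The helper name `rightmost_code` is hypothetical.

```python
import re

# Same pattern as in the expression above: word characters
# preceded by "(" and followed by ")".
CODE_IN_PARENS = re.compile(r"(?<=\()\w+(?=\))")

def rightmost_code(text):
    """Return the last match in the string, or None if nothing matches."""
    matches = CODE_IN_PARENS.findall(text)
    # Taking the last element stands in for scanning from the right.
    return matches[-1] if matches else None

print(rightmost_code(" Swed AB (Public) (52155286HGPU). Could you review"))
```

Scanning left to right, the pattern also matches "Public", which is why the rightmost match is the one to keep.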

Regards,

Hello @Binod_Kumar_Biswal

Take a look at this pattern:
(?<=\()[0-9A-Z]+(?=\)\.)

Regex pattern explanation:
The pattern will extract the capital letters and digits that sit between ( and the closing ).

If you want to incorporate ‘). Could’ as per your post then the pattern can be updated to this:
(?<=\()[0-9A-Z]+(?=\)\. Could)
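A quick way to check this approach is the Python sketch below. It assumes the closing parenthesis and the dot inside the lookahead are escaped (`\)\.`), and the helper name `extract_code` is made up for the example.

```python
import re

# Uppercase letters and digits, preceded by "(" and followed by ").".
# Assumption: the ")" and "." in the lookahead are escaped literals.
CODE_BEFORE_DOT = re.compile(r"(?<=\()[0-9A-Z]+(?=\)\.)")

def extract_code(text):
    """Return the first run of capitals/digits enclosed in (...) and followed by a dot."""
    match = CODE_BEFORE_DOT.search(text)
    return match.group(0) if match else None

print(extract_code("Cousins Incorporated (52154852EQPU). Could you review"))
print(extract_code("Swed AB (Public) (52155286HGPU). Could you review"))
```

"(Public)" is skipped for two reasons: it contains lowercase letters, and it is not followed by a dot.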

Hopefully this helps

Cheers

Steve

Hi @ashokkarale ,
Thanks for the response.

But in the second example this will extract “Public) ( 52155286HGPU” instead of “52155286HGPU” ?

Kindly let me know once

@Binod_Kumar_Biswal,

\(([^()]+)\)(?!.*\([^()]*\))
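The difference from the earlier lookahead pattern is the capture group around `[^()]+`, which lets you read the code directly from group 1 instead of trimming the parentheses. A small Python sketch, with `code_from_group` as a hypothetical helper name:

```python
import re

# Last "(...)" group in the string; group 1 holds the text inside it.
LAST_GROUP = re.compile(r"\(([^()]+)\)(?!.*\([^()]*\))")

def code_from_group(text):
    """Return the text inside the last parenthesized group, or None."""
    match = LAST_GROUP.search(text)
    # group(0) would include the parentheses; group(1) does not.
    return match.group(1) if match else None

print(code_from_group("Swed AB (Public) (52155286HGPU). Could you review"))
```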

Hi @Binod_Kumar_Biswal ,

There are multiple regexes you can use to solve the problem.
Try one of the below and let me know how that works out

Option 1: Using a simple group within parentheses
\((\d+[A-Z]+)\)

Explanation:
\(: Matches a literal opening parenthesis.
\d+: Matches one or more digits.
[A-Z]+: Matches one or more uppercase letters.
\): Matches a literal closing parenthesis.
(\d+[A-Z]+): Captures the delivery code within the parentheses (group 1).

Option 2: More specific for delivery codes
\((\d{8}[A-Z]{4})\)

Explanation:
\d{8}: Matches exactly 8 digits.
[A-Z]{4}: Matches exactly 4 uppercase letters.
The rest matches as described in Option 1.
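Here is a rough Python comparison of Options 1 and 2, assuming the outer parentheses in both patterns are escaped literals and the inner pair is the capture group. The names `loose_code` and `strict_code` are only for illustration.

```python
import re

# Option 1: digits followed by uppercase letters, inside literal parentheses.
OPTION_1 = re.compile(r"\((\d+[A-Z]+)\)")
# Option 2: exactly 8 digits followed by exactly 4 uppercase letters.
OPTION_2 = re.compile(r"\((\d{8}[A-Z]{4})\)")

def loose_code(text):
    """Option 1: return group 1 of the first match, or None."""
    match = OPTION_1.search(text)
    return match.group(1) if match else None

def strict_code(text):
    """Option 2: return group 1 of the first match, or None."""
    match = OPTION_2.search(text)
    return match.group(1) if match else None

sample = "Swed AB (Public) (52155286HGPU). Could you review"
print(loose_code(sample))
print(strict_code(sample))
# A shorter, made-up code passes Option 1 but not Option 2.
print(loose_code("Test (123AB)."), strict_code("Test (123AB)."))
```

Both options skip "(Public)" because it does not start with a digit. Option 2 is the safer choice only if every delivery code really has the 8 + 4 layout.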

Option 3: Non-greedy match for content in parentheses
\((\d+[A-Z]+?)\)

Explanation:
+?: Makes the letter quantifier non-greedy (lazy). Because the closing \) must still match right after the letters, this returns the same text as Option 1.

Option 4: Match only codes that follow a pattern
\((\d{8}[A-Z]{4})\)

Explanation:
This approach enforces an exact structure for the code, assuming all codes follow the format of 8 digits followed by 4 letters.

Option 5: Flexible match for nested parentheses
([^()]*\((\d+[A-Z]+)\)[^()]*)

Explanation:
Handles cases where the text contains more than one set of parentheses (e.g., Swed AB (Public) (52155286HGPU)). Note that these are consecutive groups rather than truly nested ones.
[^()]*: Matches any characters except parentheses to avoid confusion with nesting.

@Binod_Kumar_Biswal

Please try this, which is closer to the pattern in your samples:

[^(]+(?=\)\. Could you review)
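This one anchors on the fixed text that follows the code rather than on the parentheses before it. A minimal Python sketch, assuming every email really contains the phrase "). Could you review" right after the code (the helper name `code_before_phrase` is hypothetical):

```python
import re

# Any run of characters other than "(" that is immediately
# followed by the literal text "). Could you review".
BEFORE_PHRASE = re.compile(r"[^(]+(?=\)\. Could you review)")

def code_before_phrase(text):
    """Return the text between the last "(" and "). Could you review", or None."""
    match = BEFORE_PHRASE.search(text)
    return match.group(0) if match else None

print(code_before_phrase("Cousins Incorporated (52154852EQPU). Could you review"))
print(code_before_phrase("Swed AB (Public) (52155286HGPU). Could you review"))
```

The trade-off is that the pattern stops working if the wording after the code ever changes.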

Cheers

Hi @Binod_Kumar_Biswal

Try using this regex pattern: ([^()]*\((\d+[A-Z]+)\)[^()]*)

Or this,

[^(]+(?=\)\. Could you review)

Thanks!!