Need to build a regular expression

I need to extract the text that follows the keyword "user id".
The possible patterns to look for are shown below:

  • user id - XXX XXX
  • user id- XXX XXX
  • user id XXX XXX
  • user id: XXX XXX
  • user id : XXX XXX
  • user id is XXX XXX

XXX - any word(s) that need to be scraped.
P.S. There may or may not be more text after the XXX XXX.

E.g.

  1. Here is the user id: Mani Prajwal
    Use the provided id to perform the required process.
  2. Please find the required user id Mani Prajwal
  3. Find the user id - Prajwal
    thanks n regards,
    xxxxxx

Here I have to extract Mani Prajwal/Prajwal because it comes after the provided keyword.

Can anyone help me build a regex that matches all of the patterns shown above?
I have tried building some regex patterns but have not managed to build one that completely fits my requirement.

Hi,

Can you try the following expression?

mc = System.Text.RegularExpressions.Regex.Matches(text,"(?<=user id\s*(-||:|is)\s)[A-Z].*")

Regards,


Hi @Yoichi,

Thanks for the insight! But it does not match my patterns. In fact, the pattern you suggested throws a pattern error.
Please have a look over here regex101.com

  • I need to get the text after the keyword. The required text may be one or two words.
  • After the required words there may be other text as well, which should not be captured.

eg.

  1. user id: abcd
    Regards,
    xxx

Here we have to take only abcd

  2. user id : abcd efgh
    Thanks n regards,
    xxx

This is another type; here we have to extract only abcd efgh

We shouldn’t try to scrape the remaining text.

Anyone, please help me. I’m stuck here and have been trying to find an exact-match pattern for a couple of days.

Hi @ManiPrajwal_K,

Is there always a line break after the name?

If so, a simple \n (newline) check should be sufficient.

Otherwise, the problem I can see is that the rule is not clearly definable. Any regex that takes abcd efgh in case 2 would also take abcd Regards in the first case.
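
The ambiguity described here can be seen in a small sketch. It is written in Python rather than .NET, the pattern and function names are made up for illustration, and it assumes the name is one or two alphabetic words. When \s separates the words, the match can spill onto the next line; when only spaces and tabs are allowed, it stays on the keyword's line:

```python
import re

# Hypothetical patterns for illustration; they are not taken from the thread.
# Words separated by \s, which also matches a line break.
ANY_WHITESPACE = re.compile(
    r"user id\s*(?:-|:|is\b)?\s*([A-Za-z]+(?:\s+[A-Za-z]+)?)"
)
# Same idea, but only spaces and tabs may separate the tokens.
SAME_LINE_ONLY = re.compile(
    r"user id[ \t]*(?:-|:|is\b)?[ \t]*([A-Za-z]+(?:[ \t]+[A-Za-z]+)?)"
)


def extract(pattern, text):
    """Return the captured name, or None when the keyword is absent."""
    match = pattern.search(text)
    return match.group(1) if match else None


case_1 = "user id: abcd\nRegards,\nxxx"
case_2 = "user id : abcd efgh\nThanks n regards,\nxxx"

print(repr(extract(ANY_WHITESPACE, case_1)))  # 'abcd\nRegards'
print(repr(extract(SAME_LINE_ONLY, case_1)))  # 'abcd'
print(repr(extract(SAME_LINE_ONLY, case_2)))  # 'abcd efgh'
```

This only helps when a line break really does follow the name. A name with more than two words, or with characters other than letters, would be cut short by this sketch.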

Do you have any possibility of changing the input convention (I guess it comes from a user request)? A simple period after the name would be enough to make the regex reliable.

Hey @Gabriele_Camilli… If possible, could you just send me the regex pattern?

Hi,

How about the following expression?

System.Text.RegularExpressions.Regex.Matches(text,"(?<=user id\s*(-||:|is)\s)([A-Za-z]+\s){1,2}")

Sequence3.xaml (6.4 KB)

In fact, the pattern which you suggested is throwing some pattern error.
Please have a look over here regex101.com

regex101.com is not 100% compatible with .NET regex.
Can you try the XAML file above?
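
One likely source of the difference, stated as an assumption since the regex101 session is not shown: .NET accepts a lookbehind of unbounded length such as (?<=user id\s*...), while several other engines do not. Python's re module is one of them, so the sketch below (Python, with hypothetical helper names) shows the rejection and a capture-group form that avoids the lookbehind altogether:

```python
import re

# The lookbehind form from this thread; valid in .NET, rejected by Python's re.
LOOKBEHIND = r"(?<=user id\s*(-||:|is)\s)[A-Z].*"

# Hypothetical capture-group equivalent: match the keyword, capture what follows.
CAPTURE = r"user id\s*(?:-|:|is)?\s([A-Z].*)"


def compiles(pattern):
    """Report whether this engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def after_keyword(text):
    """Return the rest of the line after the keyword, or None."""
    match = re.search(CAPTURE, text)
    return match.group(1) if match else None


print(compiles(LOOKBEHIND))  # False: look-behind requires fixed-width pattern
print(after_keyword("Here is the user id: Mani Prajwal"))  # Mani Prajwal
```

Like the lookbehind version, this keeps everything up to the end of the line and expects the name to start with a capital letter.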

Regards,


Hi @ManiPrajwal_K ,

Try this:

user id\s*(-|:|is\b)?\s*(.*)$

or

user id\s*(-|:|is\b)?\s*(.*)($|\n)

Note: This will work ONLY IF the name is at the end of a line (e.g.

user id: abcd efg
Regards

will give you

abcd efg

While

user id: abcd efg, Regards

will give you

abcd efg, Regards

)
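
The end-of-line behaviour in the note above can be checked against all six separator variants from the original question. This is a Python sketch, not the .NET call used elsewhere in the thread; the pattern and function name are hypothetical, and it assumes multiline mode so that $ matches at each line break:

```python
import re

# Hypothetical pattern: keyword, optional separator, then the rest of the line.
REST_OF_LINE = re.compile(
    r"user id[ \t]*(?:-|:|is\b)?[ \t]*(.*)$",
    re.MULTILINE,  # let $ match before every line break, not only at the end
)


def rest_of_line(text):
    """Return whatever follows the keyword on the same line, or None."""
    match = REST_OF_LINE.search(text)
    return match.group(1).strip() if match else None


variants = [
    "user id - XXX XXX",
    "user id- XXX XXX",
    "user id XXX XXX",
    "user id: XXX XXX",
    "user id : XXX XXX",
    "user id is XXX XXX",
]
for line in variants:
    print(rest_of_line(line))  # XXX XXX each time

print(rest_of_line("user id: abcd efg\nRegards"))   # abcd efg
print(rest_of_line("user id: abcd efg, Regards"))   # abcd efg, Regards
```

As the note says, anything that shares the line with the name is captured too, so this approach depends on the name being the last thing on its line.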
