Extracting only numbers of specific length from a string

I need to extract a 5-digit number from text. Can you help me with a general regular expression for that?

E.g.:
Name: ABCXYZ
Customer ID: 12345
Tel No.: 1234-435-543

I need to get the highlighted one, “12345” as output

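Use \d{5}. If the text can also contain longer runs of digits, use \b\d{5}\b so that only standalone 5-digit numbers match.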
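To show how this behaves on the sample text from the question, here is a minimal sketch in Python's re module (Python is used only for illustration; the regex itself is the same in other engines). The word boundaries \b are an addition that assumes the number should stand alone rather than be part of a longer digit run.

```python
import re

# Sample text from the question.
text = """Name: ABCXYZ
Customer ID: 12345
Tel No.: 1234-435-543"""

# \b on both sides keeps the match from landing inside a longer run of digits.
five_digit = re.compile(r"\b\d{5}\b")

matches = five_digit.findall(text)
print(matches)  # ['12345']

# A 7-digit number is not reported as a 5-digit one.
print(five_digit.findall("ID 1234567"))  # []
```

The telephone number is skipped because none of its hyphen-separated groups is five digits long.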


Thanks!

Can you please also give the regex for extracting a 5-character string that contains only digits and letters?

E.g. 12A5F, 15AT5, 3789A

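I can’t guarantee this will work in all cases (it will also match 5-letter words), but you could use [A-Za-z\d]{5}. Note that [A-z] is not a safe shortcut for letters: in ASCII that range also covers the punctuation characters between Z and a, such as [, ^ and _.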
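A rough Python sketch of the difference between matching any five letters or digits and insisting on a mix of both. The second pattern is only an assumption about what the asker wants (at least one digit and at least one letter); it is not something stated in the thread.

```python
import re

text = "IDs 12A5F, 15AT5 and 3789A belong to HELLO world 12345"

# Exactly five letters or digits, delimited by word boundaries.
loose = re.compile(r"\b[A-Za-z\d]{5}\b")

# Assumed stricter rule: the first lookahead requires at least one digit,
# the second requires at least one letter, then five letters/digits follow.
strict = re.compile(r"\b(?=[A-Za-z]*\d)(?=\d*[A-Za-z])[A-Za-z\d]{5}\b")

loose_matches = loose.findall(text)
strict_matches = strict.findall(text)

print(loose_matches)   # ['12A5F', '15AT5', '3789A', 'HELLO', 'world', '12345']
print(strict_matches)  # ['12A5F', '15AT5', '3789A']
```

The lookaheads filter out plain 5-letter words and plain 5-digit numbers, but they cannot tell a real identifier from any other mixed token of the same length.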

Hi @Anthony_Humphries

Thanks for the replies.

That doesn’t work as expected. As you said, it also extracts words that contain only letters.

Can you please give a regex that extracts strings of the following form:

2020031ER0162
AV2020031NB0162

To extract something like that without extracting other words, you’d need to have more information about what’s around the identifier you’re trying to extract. It’s possible to pinpoint the value if there are set strings before and/or after this identifier, or if this identifier is always on the same line.

@Anthony_Humphries Thanks for the reply

The position of the part that needs to be extracted is not fixed and it can be at multiple places.

Can we not have two regex to extract such 2 strings?

You need to know the set patterns of the identifiers. If you know that much, then yes, you can. For example, if you know the id is always #######LL#### or LL#######LL####, where # is a digit and L is a letter, then you can make a regex that captures either one. But if the length and the letter/number positions are completely random, there isn’t a regex that will capture the identifier.
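As a sketch of what such a fixed-layout regex could look like, here is a Python version of the two example layouts mentioned above. It assumes the letters are uppercase and that the identifier is delimited by word boundaries; neither is confirmed in the thread.

```python
import re

# Layouts from the reply above: # is a digit, L is a letter (assumed uppercase).
#   #######LL####    -> \d{7}[A-Z]{2}\d{4}
#   LL#######LL####  -> [A-Z]{2}\d{7}[A-Z]{2}\d{4}
# The optional group (?:[A-Z]{2})? lets one pattern cover both layouts.
identifier = re.compile(r"\b(?:[A-Z]{2})?\d{7}[A-Z]{2}\d{4}\b")

text = "Ref 2020031ER0162 was replaced by AV2020031NB0162 on request."
ids = identifier.findall(text)
print(ids)  # ['2020031ER0162', 'AV2020031NB0162']
```

Ordinary words in the sentence are ignored because they do not fit the digit/letter layout.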

Got an update on this. The criteria now are that it should be 10 characters long and start with the current year as the first four digits. The rest can be anything.

2020031ER0162

and the other form has these 4 characters at the 3rd, 4th, 5th and 6th positions. The rest can be any digits or letters, but the length will be 10.

AV2020031NB0162

When you say “current year”, will that be this year, or is it possible that you will be dealing with any documents from, say, last year?

Current year only.

In that case, try this regex: "(" + Now.Year.ToString + "([A-Z]|\d){6})|(([A-Z]|\d){2}" + Now.Year.ToString + "([A-Z]|\d){4})".

It combines the current year into the regex using Now, and the number of characters is determined here based on a 10-character code. It can be no more or less.

Thanks for all the help! Sorry for taking such a long time.

I tried that. If the string contains the following:

2020031ER0162
AV2020031NB0162

then the output is:

2020031ER0162
AV2020031NB01

Expected output is:

2020031ER0162
AV2020031NB0162

Or, if that is not possible, can this be altered to get only 2020031ER0162?

It looks like the number of characters in your ids is 13, rather than 10. Try using "(" + Now.Year.ToString + "([A-Z]|\d){9})|(([A-Z]|\d){2}" + Now.Year.ToString + "([A-Z]|\d){7})" instead.
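A quick length check on the two sample ids (Python used only for illustration) shows why a pattern that matches exactly 13 characters returns the first id whole but cuts the second one short:

```python
sample_ids = ["2020031ER0162", "AV2020031NB0162"]

# Length of each sample id from the thread.
lengths = {s: len(s) for s in sample_ids}
print(lengths)  # {'2020031ER0162': 13, 'AV2020031NB0162': 15}

# A 13-character match on the second id stops two characters early.
truncated = sample_ids[1][:13]
print(truncated)  # AV2020031NB01
```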

Sorry, I stated it incorrectly in my earlier post.

Got an update on this. Case 1: it should be 13 characters long and start with the current year as the first four digits. The rest can be anything, as long as the total length is 13.

2020031ER0162

and
Case 2: the 4 characters of the current year are at the 3rd, 4th, 5th and 6th positions. The rest can be any digits or letters, but the length will be 15 in this case.

AV2020031NB0162

In that case, use "(" + Now.Year.ToString + "([A-Z]|\d){9})|(([A-Z]|\d){2}" + Now.Year.ToString + "([A-Z]|\d){9})".
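For anyone who wants to try the same idea outside VB.NET, here is a rough Python equivalent. The helper name build_id_pattern is made up for this sketch, and the \b word boundaries are an added assumption that is not in the expression above; they prevent a match inside a longer alphanumeric token.

```python
import re
from datetime import date

def build_id_pattern(year):
    """Build a pattern for both id layouts using the given year.

    Case 1: year + 9 letters/digits            (13 characters)
    Case 2: 2 letters/digits + year + 9 more   (15 characters)
    """
    y = str(year)
    return re.compile(
        rf"\b(?:{y}[A-Z\d]{{9}}|[A-Z\d]{{2}}{y}[A-Z\d]{{9}})\b"
    )

text = "2020031ER0162\nAV2020031NB0162"

# A fixed year is used here so the sample ids from the thread still match.
found = build_id_pattern(2020).findall(text)
print(found)  # ['2020031ER0162', 'AV2020031NB0162']

# In real use the year would come from the clock, like Now.Year in VB.NET.
current_pattern = build_id_pattern(date.today().year)
```

Because the year is baked into the pattern, ids from any other year are not matched, which is what the "current year" requirement asks for.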

Thanks a lot for the help. That is working as expected. :)
