Extracting house number from address including characters - 1A, 1B etc

Hi,

I have a task where I need to extract the house number from an address. I also need to extract the character immediately after the number, because the address may be something like 1a Example Road or 1b Example Road, and I need to make sure I get the correct address every time.

I found a solution on the forums that uses regex to extract all of the digits from a string. It works well, but it does not pick up the character after the number (if one exists).

Example Inputs:

234 Test Road, Testville
456b Tester Road, Testville

Required Outputs:

234
456b

I am open to any and all suggestions on how to do this, thanks in advance.

Use regex

\d+\w

Please remove Orchestrator tag, as it has nothing to do with the topic.

Thanks for the reply, I have edited out the Orchestrator tag.

I am unfamiliar with regex, the code I am using is:

System.Text.RegularExpressions.Regex.Replace(Address,"\d+\w","")

When I use this with the variable ‘Address’ set as ‘1a Test Test’ I get ‘Test Test’ as an output. How can I get only ‘1a’?

I can use System.Text.RegularExpressions.Regex.Replace(Address,"\D","") to return ‘1’.

Use the Matches activity to extract what you are looking for, rather than replacing: https://activities.uipath.com/docs/matches
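To illustrate the difference between replacing and matching, here is a minimal sketch in Python rather than a UiPath/VB.NET expression (that choice is an assumption made purely for illustration; for this simple pattern the regex syntax should behave the same way). Here re.sub plays the role of Regex.Replace, and re.search plays the role of Regex.Match or the Matches activity.

```python
import re

address = "1a Test Test"
pattern = r"\d+\w"

# Replacing deletes the text the pattern matched and keeps everything else.
replaced = re.sub(pattern, "", address)

# Matching returns only the text the pattern matched.
match = re.search(pattern, address)
house_number = match.group(0) if match else ""

print(repr(replaced))
print(repr(house_number))
```

The replace call throws away exactly the part that is wanted, which is why the earlier attempt returned the street name instead of the house number.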

@PeterHeyburn

Try below one:

Str = "234 Test Road, Testville"

RequiredString = Str.Trim.Split(" ".ToCharArray)(0)

The above expression will give you the output: 234

@lakshman

Thank you, this works well when the house number is at the start of the string. I also have the case where there may be a house name at the start of the string e.g. ‘House Name, 1a Test Road’ - I need only 1a here. How can I do this?

Thank you for this suggestion, I had never used the ‘Matches’ activity before but have managed to customise it so that I can return the correct address for quite a few scenarios.

The pattern I have ended up with is "(\d+)(\w*)". It looks for at least one digit and then, immediately following the digits, any number (0 or more) of word characters.
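As a rough check of how this pattern behaves on the inputs mentioned in the thread, here is a small Python sketch. The function name extract_house_number is hypothetical and not part of UiPath; in a workflow the same pattern would go into the Matches activity or Regex.Match.

```python
import re

# At least one digit, followed by zero or more word characters.
PATTERN = re.compile(r"(\d+)(\w*)")

def extract_house_number(address: str) -> str:
    """Return the first digit run plus any word characters directly after it."""
    match = PATTERN.search(address)
    return match.group(0) if match else ""

samples = [
    "234 Test Road, Testville",
    "456b Tester Road, Testville",
    "House Name, 1a Test Road",
    "1 a Test Road",
]
for sample in samples:
    print(sample, "->", extract_house_number(sample))
```

Because only the first match is returned, a house name that itself contains digits would be picked up instead of the house number, so that case would need separate handling.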

The only issue I can see now is if the input is something like ‘1 a’ but I think I will just raise this as a business exception. Thanks again for your help.

Check this one, which also takes ‘1 a’ into account: https://regex101.com/r/QFo4IV/1

\d+[ \w]{1,2} 

Note that there is a space after the } so the match ends at a space.

@PeterHeyburn try this

System.Text.RegularExpressions.Regex.Match(Add,"\d+\w*").ToString

P.S. Add is my variable name.

@c.ciprian thanks, this works for cases where the house number is e.g. 1a, 11a, 111a, 11 or 111, but it does not work for a single digit e.g. 1, 2, 3.

I will keep this regex in case I do need to work with an address in the format 11 a, but I think for now I will stick with (\d+)(\w*).
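For reference, one possible pattern that covers single digits, attached letters and the ‘11 a’ format is sketched below in Python. This is an assumption for illustration, not necessarily a pattern posted in this thread, and the function name extract_house_number_with_gap is hypothetical. The optional group accepts a single letter, with or without one space before it, but only when that letter stands alone as a word.

```python
import re

# Digits, then optionally: an optional space plus one letter that ends at a
# word boundary (so "1 a" and "456b" qualify, but the "T" of "Test" does not).
PATTERN = re.compile(r"\d+(?: ?[A-Za-z]\b)?")

def extract_house_number_with_gap(address: str) -> str:
    """Return the house number, allowing one space before a trailing letter."""
    match = PATTERN.search(address)
    return match.group(0) if match else ""

samples = [
    "1 Test Road",
    "1 a Test Road",
    "456b Tester Road, Testville",
    "House Name, 11 a Test Road",
]
for sample in samples:
    print(sample, "->", extract_house_number_with_gap(sample))
```

A street name that starts with a single-letter word (for example ‘5 A Road’) would still be read as part of the number, so raising that case as a business exception may remain the safer choice.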

I have corrected the pattern

image

@c.ciprian This works! Thank you very much.

I have never really used regex before, although I understand what it is. If you have time, could you please explain exactly how your solution works? I know that \d matches digits and \w matches word characters, but that’s about it!

https://regex101.com/ provides a full reference (bottom right) and allows you to test your expressions against your own text.
Even if an expression was built for you by someone else, you can paste it in there and the site explains it clearly and concisely (top right).

Very useful, thank you!
