Cut String using specific rules

Hello again! This is already my second post today! :smiley:

So I have many input strings that look like this:

John Doe Blue street, no. 12, Blue building, first floor
Marry Jane White street, no 198, White building, second floor
Jill Mcfin Red avenue, no4, Green building, last floor
Lolly Poppy Magenta lane no. 8, Magenta building, backyard entrance pass the gate
... and so on

After some UiPath magic, I want these strings to look like this:

Blue street, no. 12
White street, no 198
Red avenue, no4
Magenta lane no. 8

This result is achievable by extracting the text between "   " (3 consecutive spaces) and the street number (i.e. the first number in the string). Does anyone have any ideas how this could be done?

Thanks a lot for taking the time to read this!

Vlad

It does not show here, but in my file there are actually 3 consecutive spaces ("   ") between the person's name and the street name, which can be used to identify the point in the string where the cutting should start.

Hi @VladM,
What works for me is the sequence below, where “addressString” is your address line and “addressSplit” is an array of strings which equals:
addressString.Split({" "}, StringSplitOptions.None)
and “outcome” is:
addressSplit(2) + " " + addressSplit(3) + " " + addressSplit(4) + " " + addressSplit(5).Replace(",", "")

Let me know if this helps :slight_smile:
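
For anyone who wants to try the idea outside UiPath, here is a minimal Python sketch of the same fixed-position approach. The function name `fixed_index_cut` is made up for illustration, and it assumes single spaces between all words, as in the samples shown in the post. It also shows the limit of the approach: it only works when the name is exactly two words and the street part is exactly four tokens.

```python
def fixed_index_cut(address_string: str) -> str:
    # Split on single spaces, like Split({" "}, StringSplitOptions.None).
    parts = address_string.split(" ")
    # Glue tokens 2..5 back together, removing commas from the last one.
    return " ".join(parts[2:5] + [parts[5].replace(",", "")])


# Two-word name, four-token street part: works as intended.
print(fixed_index_cut("John Doe Blue street, no. 12, Blue building, first floor"))

# "no4," is a single token here, so token 5 already belongs to the building.
print(fixed_index_cut("Jill Mcfin Red avenue, no4, Green building, last floor"))
```

The first call gives "Blue street, no. 12", while the second gives "Red avenue, no4, Green", so the positions would have to be adjusted per line.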


Hello @VladM,
can you also post what you have tried so far?

Thanks @PAD, yeah, I tried what you suggested and it does work for this example, but what I need is something more general. For example, the street name could be several words long (“Very blue with orange dots street”), or the person's name could be several words long (“Ludovic Iulius Caesar the First”), and so on.

So that’s why the rule for finding the street name should be a split that takes the text between “...” (those are 3 consecutive spaces; I used dots to represent them) and the first number that appears (e.g. 264).
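
The rule described above (start after the 3 spaces, stop at the end of the first number) can be sketched with a regex. This is only an illustration in Python, not the regex that was eventually used in the robot; the names `STREET_RE` and `cut_street` are hypothetical, and it assumes the separator is exactly three spaces and that the street number is the first run of digits after it.

```python
import re

# Assumption: exactly three spaces separate the name from the street,
# and the street number is the first run of digits after that separator.
STREET_RE = re.compile(r" {3}(.*?\d+)")


def cut_street(line: str):
    """Return the text between the 3-space separator and the first number."""
    match = STREET_RE.search(line)
    # No separator or no number after it: nothing to extract.
    return match.group(1) if match else None


print(cut_street("John Doe   Blue street, no. 12, Blue building, first floor"))
print(cut_street("Ludovic Iulius Caesar the First   Very blue with orange dots street no. 264, backyard"))
```

Because the lazy `.*?` stops at the first digits it can reach, the length of the name and of the street name no longer matters. The same pattern should also be usable from UiPath through .NET, for example `System.Text.RegularExpressions.Regex.Match(addressString, " {3}(.*?\d+)").Groups(1).Value`, though I have only sketched it here in Python.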

@AkshaySandhu this is what I’ve done so far: Main.xaml (13.5 KB)
It is a robot that takes as input some delivery information from a table on a website, compares some of that data with the data from a Clients Excel file, and outputs another Excel file with the final manipulated data.

I still haven’t found a way to make UiPath recognize consecutive spaces, but I am thinking of using regex. Also, I wonder if you could make UiPath search a string and split it after it finds a number (any number in the range 1-999). What do you think? Anyway, still looking for solutions :stuck_out_tongue:

You can use this regex "^(.+?)[0-9]*[0-9]" to get the text up to and including the first sequence of digits. But you will still be left with the employee name and the first line of the address,
e.g. John Doe Blue street, no. 12 or John Johnson Doe Blue street, no. 12

If you are getting just the names of the employees from somewhere else, i.e. from the client file or from the website you are extracting data from, you can just replace the name with an empty string.
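
As a rough illustration of these two steps, here is a small Python sketch. The helper names `up_to_first_number` and `strip_known_name` are invented for the example, and it assumes the person's name is already known from another source, as suggested above.

```python
import re

# The regex from this post: lazily consume text, then the first run of digits.
PREFIX_RE = re.compile(r"^(.+?)[0-9]*[0-9]")


def up_to_first_number(line: str):
    """Return everything up to and including the first sequence of digits."""
    match = PREFIX_RE.search(line)
    # group(0) is the whole match, digits included; group(1) stops before them.
    return match.group(0) if match else None


def strip_known_name(text: str, name: str) -> str:
    """Replace the first occurrence of a known name with an empty string."""
    return text.replace(name, "", 1).strip()


prefix = up_to_first_number("Jill Mcfin Red avenue, no4, Green building, last floor")
print(prefix)
print(strip_known_name(prefix, "Jill Mcfin"))
```

The first print shows the name together with the street, and the second shows only the street part once the known name is removed.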


Hi @VladM,
Sorry to hear that your issue is more complicated.
I’d say that since spaces are the delimiters everywhere, if you don’t know exactly how many words the name and the address line may consist of (it might be 3-4 words each), it could be hard to distinguish between the name and the address line even for a human being. E.g. if you have “Mary Hazel White Chapel Street, no.14”, how can you tell whether this is “Mary Hazel White” living on “Chapel Street, no.14” or “Mary Hazel” living on “White Chapel Street, no.14”? :thinking: Perhaps there is something in that table from the website which would help you split the data? Any column names?
If you decide to play with regex, here is a regex tester and debugger website I find useful: https://regex101.com/
Sorry that I couldn’t help further.


Thank you @AkshaySandhu, in the end I managed to make a regex which extracts only the street address. Indeed, the final part of the regex, which stops after finding a number, is similar to what you described.

@PAD thank you :), regex101.com is actually the website I used to test the regex! What you say in your example about “Mary Hazel or Mary Hazel White” is true, but as I said in the first posts, my file (i.e. the one I extract this data from) has 3 consecutive spaces ("   ") between the end of the person's name and the beginning of the street name. So it was easy to use this particularity to give the regex a point where to start, and what @AkshaySandhu said to give the regex a point where to end.

I managed to solve this and extract the street names with a 99% success rate (the remaining 1% comes from some very special cases, which I will just add to the regex as “or” alternatives). I also found out that regex is a very powerful tool. Big thanks for the help! :slight_smile:
