Regular Expression for Inconsistent Multiple Spaces

Good day fellow UiPath RPA Developers,

Does anyone know how to parse a string with an inconsistent number of spaces using a hard-coded regex (an expression such as Regex.Replace, because I will apply it inside a LINQ query)?
These are the strings I want to parse:

  • “MCGRAB PH - TRANSPORT PH|249385938291233455345500227170398130320220709090010,12345678901234567890123456789012345678901234567890”
  • “MCGRAB PH - GRABPAY/TOPUP SG|249385938291233455345500559127015506520220709090038,12345678901234567890123456789012345678901234567891”
  • “MCXSOLLA USA, INC US|249385938291233455345500559128814806020220709090210,12345678901234567890123456789012345678901234567892”

This is the text file, since multiple spaces do not show here in the forum:
Sample Text.txt (439 Bytes)

If you open the text file:

  • First string has 19 spaces after the MC phrase
  • Second string has 15 spaces after the MC phrase
  • Third string has 24 spaces after the MC phrase

As you will notice, the three (3) strings have different numbers of spaces.

I’m trying to parse the MC phrase and these are the expected outputs:

  • MCGRAB PH - TRANSPORT
  • MCGRAB PH - GRABPAY/TOPUP
  • MCXSOLLA USA, INC

Can anyone help me apply this in a hard-coded regex?

Any help would be appreciated. Thank you in advance!

Best Regards,
Anthony

Hi,

Can you try the following?

System.Text.RegularExpressions.Regex.Matches(strData,"^.*?(?=\s{2,})",System.Text.RegularExpressions.RegexOptions.Multiline)

Sample20220815-3.zip (2.6 KB)
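
For anyone who wants to check the pattern outside UiPath, here is a rough Python sketch of the same expression. The spacing in the sample lines is rebuilt from the space counts stated in the question (19, 15 and 24), the long digit strings are shortened for readability, and the names `LINES`, `PATTERN` and `extract_phrases` are illustrative only. Python's `re` module stands in for .NET's `Regex` here; for this particular pattern the two should behave the same, but that is an assumption worth confirming in Studio.

```python
import re

# Sample lines rebuilt from the question; the run of spaces after each
# phrase uses the counts given there (19, 15, 24). Digits are shortened.
LINES = [
    "MCGRAB PH - TRANSPORT" + " " * 19 + "PH|2493859,1234567890",
    "MCGRAB PH - GRABPAY/TOPUP" + " " * 15 + "SG|2493859,1234567891",
    "MCXSOLLA USA, INC" + " " * 24 + "US|2493859,1234567892",
]

# Lazily match from the start of each line up to (not including)
# the first run of two or more whitespace characters.
PATTERN = re.compile(r"^.*?(?=\s{2,})", re.MULTILINE)


def extract_phrases(text):
    """Return the leading phrase of every line in a multi-line string."""
    return PATTERN.findall(text)


# Whole-file style: one call over the joined text.
phrases = extract_phrases("\n".join(LINES))

# Per-item style, similar in spirit to a LINQ Select over each string.
per_line = [PATTERN.search(line).group(0) for line in LINES]

print(phrases)
```

The single spaces inside a phrase (for example in "USA, INC") do not end the match, because the lookahead only succeeds on two or more consecutive whitespace characters.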

Regards,

It works and gives the expected output.

Could you also handle this scenario?
Sample Text 2.txt (1.1 KB)

If you will open the text file, there are strings before the MC phrases.
The expected output is still the same as in the first case.

  • MCGRAB PH - TRANSPORT
  • MCGRAB PH - GRABPAY/TOPUP
  • MCXSOLLA USA, INC

Thank you so much sir @Yoichi

Hi,

If your target string always starts with “MC” (and comes just after a comma), the following expression will work.

System.Text.RegularExpressions.Regex.Matches(strData,"(?<=,)MC.*?(?=\s{2,})",System.Text.RegularExpressions.RegexOptions.Multiline)

Sample20220815-3v2.zip (3.2 KB)
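
As a way to see what each part of this expression does, here is a small Python sketch. The contents of the second sample file are not shown in the thread, so the prefixes below ("REC001,20220709,") are invented placeholders, and the digit strings are shortened. Python's `re` is used in place of .NET's `Regex`; a one-character lookbehind like `(?<=,)` is supported by both.

```python
import re

# Hypothetical prefixes ("REC001,20220709,") stand in for the unknown
# fields that precede the MC phrase in the second sample file.
TEXT = "\n".join([
    "REC001,20220709,MCGRAB PH - TRANSPORT" + " " * 19 + "PH|2493859",
    "REC002,20220709,MCGRAB PH - GRABPAY/TOPUP" + " " * 15 + "SG|2493859",
    "REC003,20220709,MCXSOLLA USA, INC" + " " * 24 + "US|2493859",
])

# (?<=,)     : the match must be preceded by a comma (not included)
# MC.*?      : the phrase itself, starting with "MC", as short as possible
# (?=\s{2,}) : stop right before the first run of 2+ whitespace characters
PATTERN = re.compile(r"(?<=,)MC.*?(?=\s{2,})", re.MULTILINE)

phrases = PATTERN.findall(TEXT)
print(phrases)
```

Because this pattern contains no `^` or `$` anchor, the Multiline option does not change the result; it is kept only to mirror the expression above.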

Regards,

You are a legend, sir @Yoichi!

Thank you so much, your solutions are working perfectly!

Regards,
Anthony

I have one more scenario, sir @Yoichi. Sorry for the trouble; I carelessly missed one of the business requirements.

Instead of starting with MC, the correct output is the phrase after the 5th comma and before the multiple spaces.

This is the text file:
Sample Text 3.txt (691 Bytes)

Expected output:

  • MCGRAB PH - TRANSPORT
  • MCGRAB PH - GRABPAY/TOPUP
  • MCXSOLLA USA, INC
  • PAYPAL

Your expertise and help would be deeply appreciated!

Best Regards,
Anthony

Hi,

If there is always a CN number (like CN0000375190XXXXX7346) just before your target string, the following will work.

System.Text.RegularExpressions.Regex.Matches(strData,"(?<=CN\w+,).*?(?=\s{2,})",System.Text.RegularExpressions.RegexOptions.Multiline)
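
A hedged Python sketch of the same idea follows. The third sample file is not shown in the thread, so the layout below (four filler fields, then the CN number as the fifth field, and 10 spaces after PAYPAL) is an assumption made for illustration. Two things differ from the .NET expression: Python's `re` does not accept a variable-width lookbehind such as `(?<=CN\w+,)`, so a capture group is used instead, and a second, position-based pattern is included as one possible reading of "after the 5th comma". The names `AFTER_CN` and `AFTER_5TH_COMMA` are made up for this example.

```python
import re

# Hypothetical layout: four filler fields, then a CN number as the fifth
# field, then the target phrase. The real file's fields are not shown in
# the thread, so these values are invented for illustration.
CN = "CN0000375190XXXXX7346"
TEXT = "\n".join([
    f"A,B,C,D,{CN},MCGRAB PH - TRANSPORT" + " " * 19 + "PH|2493859",
    f"A,B,C,D,{CN},MCGRAB PH - GRABPAY/TOPUP" + " " * 15 + "SG|2493859",
    f"A,B,C,D,{CN},MCXSOLLA USA, INC" + " " * 24 + "US|2493859",
    f"A,B,C,D,{CN},PAYPAL" + " " * 10 + "US|2493859",
])

# Python's re rejects the variable-width lookbehind (?<=CN\w+,), so the
# CN prefix is matched normally and the phrase is taken from group 1.
AFTER_CN = re.compile(r"CN\w+,(.*?)(?=\s{2,})")

# Variant keyed on position instead of content: skip exactly five
# comma-terminated fields from the start of the line, then capture.
AFTER_5TH_COMMA = re.compile(r"^(?:[^,\n]*,){5}(.*?)(?=\s{2,})", re.MULTILINE)

by_cn = AFTER_CN.findall(TEXT)
by_position = AFTER_5TH_COMMA.findall(TEXT)
print(by_cn)
```

The position-based variant does not depend on the CN number being present, but it does assume that none of the first five fields contains a comma of its own.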

Regards,
