REGEX - extract words in Upper Case

Hello. I need to extract the bolded words in Upper case from the following text:

"- Nicușor Dan, cotat cu 42.03%

  • Gabriela Firea: pierde importante procente, având doar 36.8%
  • Traian Băsescu, surpriza săptămânii, cu 15,3%
  • Octavian Bădescu, intenție de vot 2,7%
  • C.P. Tăriceanu: 1.005%
  • Ilan Laufer, Ioan Sîrbu și Alexandru Coita, fiecare câte 0,7216% . Au multe rude, pare-se."

This is my best attempt so far, but it does not match everything. (Other steps: I split the text into lines and use a For Each to check each line for a match.)

System.Text.RegularExpressions.Regex.Match(item.ToString,"([A-Z]\p{L}\P{M}\p{M}[a-z]+\p{L}\P{M}\p{M})+").ToString

and my result (one match per line) is the following:
"Nicușo "
"Gabriela Firea"
"Traian Băsescu"
"Octavian Bădescu"
"Tăriceanu"
"Laufer"

www.regex101.com was used for testing.

I am sure the regex needs a little tweaking, but I can't pinpoint what. New to RPA :slight_smile:
Thanks in advance.

Hi @Radu_Calin

Welcome to the UiPath Community

Try this expression: ([A-Z]\w+\s[A-Z]\w+)|(([A-Z]\.)+\s*[A-Z]\w+)

Also, you do not need to split the text into lines. Run the expression over the whole text with Matches.

System.Text.RegularExpressions.Regex.Matches(text, "([A-Z]\w+\s[A-Z]\w+)|(([A-Z]\.)+\s*[A-Z]\w+)")

Matches returns a collection of Match objects.

You can also use the Matches activity available in UiPath.

After that, use a For Each activity to iterate through the matches.
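
To make the Matches plus For Each idea concrete outside a workflow, here is a minimal sketch as a plain VB.NET console module. The module name `ExtractNames`, the variable names, and the shortened three-line sample are my own assumptions for illustration; only the pattern and the `Regex.Matches` call come from the suggestion above. It also assumes the accented letters in the text are stored as single precomposed characters.

```vb
Imports System
Imports System.Text.RegularExpressions

Module ExtractNames
    Sub Main()
        ' Shortened sample of the text from the question (assumption: three lines are enough to illustrate).
        Dim input As String =
            "- Nicușor Dan, cotat cu 42.03%" & Environment.NewLine &
            "  • C.P. Tăriceanu: 1.005%" & Environment.NewLine &
            "  • Ilan Laufer, Ioan Sîrbu și Alexandru Coita, fiecare câte 0,7216% ."

        ' First alternative: two capitalized words separated by one whitespace character.
        ' Second alternative: dotted initials such as "C.P." followed by a capitalized word.
        Dim pattern As String = "([A-Z]\w+\s[A-Z]\w+)|(([A-Z]\.)+\s*[A-Z]\w+)"

        ' Matches scans the whole string, so no manual split into lines is needed.
        For Each m As Match In Regex.Matches(input, pattern)
            Console.WriteLine(m.Value)
        Next
    End Sub
End Module
```

On this sample the pattern should pick up Nicușor Dan, C.P. Tăriceanu, Ilan Laufer, Ioan Sîrbu and Alexandru Coita. One limitation to keep in mind: `[A-Z]` only covers the ASCII capitals, so a name that starts with a letter such as Ș or Î would be missed. In .NET, `\p{Lu}` matches any uppercase letter, so swapping it in for `[A-Z]` is one possible adjustment if such names can appear in your data.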

Extract Names.xaml (5.7 KB)

Thank you!
