Basics Of Regular Expressions [RegEx] - A String Manipulation Technique

How to use Regular Expressions [RegEx] in UiPath Activities?

Regular expressions are useful in extracting information from any text by searching for one or more matches of a specific search pattern.


Starting with version 19.2.0 of UiPath.System.Activities, a RegEx builder wizard is available to ease RegEx integration. However, it is still essential to know the characters used in patterns, so that you can tweak a pattern until it produces the required output.


The following are some of the operators and characters used to build a RegEx pattern:

[Image: RegEx operators and characters]



Using Regular Expressions in UiPath Activities

The following activities in UiPath Studio accept a RegEx pattern to manipulate a string and extract the required information:

Activities that are part of UiPath.System.Activities:

  1. IsMatch - Returns a Boolean indicating whether the specified regular expression finds a match in the specified input string, using the specified matching options.

  2. Matches - Searches an input string for all occurrences of a regular expression and returns all the successful matches.
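
The difference between the two activities can be sketched outside UiPath. The snippet below is a rough analogue in Python's `re` module, not the UiPath or .NET implementation; the helper names `is_match` and `matches` and the sample string are hypothetical and exist only for illustration.

```python
import re
from typing import List


def is_match(pattern: str, text: str, flags: int = 0) -> bool:
    """Rough analogue of IsMatch: True if the pattern matches anywhere in text."""
    return re.search(pattern, text, flags) is not None


def matches(pattern: str, text: str, flags: int = 0) -> List[str]:
    """Rough analogue of Matches: the text of every non-overlapping match."""
    return [m.group(0) for m in re.finditer(pattern, text, flags)]


sample = "Order 66 shipped with 3 items"  # hypothetical input string
print(is_match(r"\d+", sample))  # True
print(matches(r"\d+", sample))   # ['66', '3']
```

IsMatch answers a yes/no question, so it suits conditions such as an If activity, while Matches returns a collection that is typically iterated over.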

Activity that is part of UiPath.IntelligentOCR.Activities:

  1. RegEx based Extractor - Extracts relevant information from a document based on the configured RegEx pattern. Useful in Document Understanding scenarios.


Activity Properties

  • Pattern - The regular expression pattern to match, built from quantifiers, assertions, and special characters such as \s, \d, and \w.

  • Input - The string to be searched for matches.

  • RegexOption - A bitwise combination of the RegexOptions enumeration values that specify options for matching, as documented on MSDN. By default, IgnoreCase and Compiled are selected.
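
To illustrate what a "bitwise combination" of options means, here is a minimal sketch in Python's `re` module, used as a stand-in for the .NET options referenced above. The flag names are Python's, not UiPath's, and the pattern and input text are hypothetical.

```python
import re

# Options are combined with bitwise OR, in the same spirit as the
# .NET RegexOptions enumeration (Python flag names are used here).
options = re.IGNORECASE | re.MULTILINE

# Pre-compiling the pattern is loosely comparable to the Compiled option.
pattern = re.compile(r"^uipath\b", options)

text = "UiPath Studio\nuipath robot\nAbout UiPath"  # hypothetical input

# IGNORECASE lets "UiPath" match; MULTILINE lets ^ match at each line start.
found = [m.group(0) for m in pattern.finditer(text)]
print(found)  # ['UiPath', 'uipath']

# With IGNORECASE alone, ^ only matches at the very start of the text.
first_line_only = re.findall(r"^uipath\b", text, re.IGNORECASE)
print(first_line_only)  # ['UiPath']
```

Each option changes the matching behaviour independently, so the combination selected in the activity should be checked against the expected output.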

Examples

  1. Extract only the digits from the string

  • Input string: My phone number is 123456789
  • Expected output: Only the digits in the given input string, i.e. 123456789
  • RegEx pattern: \d+ or [0-9]+
  • Explanation
    • \d matches any single digit

    • [] defines a character class; here it holds the range 0 to 9

    • + matches one or more occurrences of the preceding element.
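
The digit example can be checked quickly outside UiPath. This is a small sketch using Python's `re` module as a stand-in for the engine behind the activities; the variable names are illustrative only.

```python
import re

text = "My phone number is 123456789"

# Both patterns from the example: \d+ and the explicit range [0-9]+.
by_shorthand = re.search(r"\d+", text)
by_range = re.search(r"[0-9]+", text)

print(by_shorthand.group(0))  # 123456789
print(by_range.group(0))      # 123456789
```

For this input the two patterns return the same match, because the string contains only ASCII digits.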

  2. Extract a string between two words

  • Input string: I am working at UiPath, that is a leader in the RPA space.
  • Expected output: The company name in the given input string, i.e. UiPath
  • RegEx pattern: (?<=(working at))(.*)(?=(, that is))
  • Explanation
    • (?<=(working at)) is a look-behind: the match must be immediately preceded by "working at"

    • (.*) matches any sequence of characters other than the newline character

    • (?=(, that is)) is a look-ahead: the match must be immediately followed by ", that is".
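
The look-behind and look-ahead example can be tried with the sketch below, which uses Python's `re` module as a stand-in rather than the UiPath activity itself. It uses the look-behind text "working at" from the input string. One detail worth noting: the match starts right after "working at", so it includes the space before the company name; trimming the result afterwards is an assumption about how the value would be cleaned up, not something the activity does on its own.

```python
import re

text = "I am working at UiPath, that is a leader in the RPA space."
pattern = r"(?<=(working at))(.*)(?=(, that is))"

m = re.search(pattern, text)

# The raw match keeps the space that follows "working at".
raw = m.group(0)
print(repr(raw))      # ' UiPath'

# Trim surrounding whitespace to get the company name on its own.
company = raw.strip()
print(company)        # UiPath
```

An alternative would be to add the trailing space to the look-behind text, in which case no trimming is needed.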