Regular Expression explanation

Hi,
Can someone please help me understand what the part of this regular expression after (?<=Paid Amount)\s* is doing?

“(?<=Paid Amount)\s*((?:[-$, ]+[\d,]+(?:.\d{2})?\s*)+)”

Thank you,

@A_Learner

  1. (?<=Paid Amount): This is a positive lookbehind assertion, which means it checks if the text
    preceding the current position matches “Paid Amount” without including it in the match.
  2. \s*: This matches any whitespace characters (spaces, tabs, line breaks) zero or more times.
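  3. ((?:[-$, ]+[\d,]+(?:.\d{2})?\s*)+): This is a capturing group (group 1) that matches the paid amount pattern. The trailing + lets it capture one or more amounts in a row, each with a leading sign, currency symbol, or space, optional thousands separators, and optional cents. Items 4 to 8 break down its parts.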
  4. (?:...): This is a non-capturing group, used for grouping but not capturing the content.
  5. [-$, ]+: This matches one or more characters from the set: hyphen, dollar sign, comma, or space.
  6. [\d,]+: This matches one or more occurrences of digits or commas.
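  7. (?:.\d{2})?: This is an optional group intended to match a decimal point followed by exactly two digits
    (cents). Because the . is not escaped, it matches any character (except a line break by default), not just a decimal point. It should be written as \. instead.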
  8. \s*: This matches any whitespace characters zero or more times.
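
To see the pieces above working together, here is a minimal sketch in Python. It assumes Python's re module treats this pattern the same way as the engine used in the original workflow, which should hold for these constructs. The sample strings and the helper name paid_amounts are made up for illustration.

```python
import re

# The pattern from the question, unchanged (note the unescaped dot).
PATTERN = re.compile(r"(?<=Paid Amount)\s*((?:[-$, ]+[\d,]+(?:.\d{2})?\s*)+)")

def paid_amounts(text):
    """Return capture group 1 with surrounding whitespace stripped, or None."""
    match = PATTERN.search(text)
    return match.group(1).strip() if match else None

# Hypothetical sample inputs, not taken from the original thread.
print(paid_amounts("Paid Amount $1,234.56 Balance"))  # $1,234.56
print(paid_amounts("Paid Amount -$50.00"))            # -$50.00
print(paid_amounts("Paid Amount $10.00 $20.00"))      # $10.00 $20.00
print(paid_amounts("Total Due $99.00"))               # None
```

Group 1 keeps any trailing whitespace matched by the inner \s*, which is why the sketch strips the result. The third call shows the outer + collecting several amounts into a single capture.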

Hope it helps!!

@A_Learner

The regular expression (?<=Paid Amount)\s*((?:[-$, ]+[\d,]+(?:.\d{2})?\s*)+) is used to find and capture paid amounts in a text string that follow the words “Paid Amount.” It allows for different formats, such as currency symbols, commas for thousands, and optional decimal cents. The matched amounts are captured in group 1 for further processing.

  1. (?<=Paid Amount): This is a positive lookbehind assertion, ensuring that the matched pattern must be preceded by the words “Paid Amount” without including “Paid Amount” in the match.
  2. \s*: This part matches any number of whitespace characters (spaces, tabs, etc.).
  3. ((?:[-$, ]+[\d,]+(?:.\d{2})?\s*)+): This is a capturing group that matches the paid amount pattern. The pattern allows for different formats of a paid amount.
  • (?:...): This is a non-capturing group. It groups the pieces of a single amount so that the trailing + can repeat the whole unit without creating an extra capture.
  • [-$, ]+: This matches one or more occurrences of characters ‘-’, ‘$’, ‘,’, or space. It allows for symbols used in representing currency amounts.
  • [\d,]+: This matches one or more occurrences of digits and commas. It matches the integer part of the paid amount, allowing for commas as thousand separators.
  • (?:.\d{2})?: This is an optional non-capturing group intended to match a decimal point followed by exactly two digits (cents). Note that the unescaped . matches any character, not only a decimal point; use \. to match a literal point.
  • \s*: This matches any number of whitespace characters (spaces, tabs, etc.) after the paid amount.
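
The first reply points out that the . in (?:.\d{2})? is not escaped. The sketch below compares the original pattern with a variant that uses \. instead. It is written in Python on the assumption that the re module behaves like the original engine for these constructs. The ESCAPED pattern, the capture helper, and the sample string are illustrative and not part of the original thread.

```python
import re

# Original pattern: the dot before \d{2} matches any character.
ORIGINAL = re.compile(r"(?<=Paid Amount)\s*((?:[-$, ]+[\d,]+(?:.\d{2})?\s*)+)")
# Suggested variant: the dot is escaped, so only a literal "." is accepted.
ESCAPED = re.compile(r"(?<=Paid Amount)\s*((?:[-$, ]+[\d,]+(?:\.\d{2})?\s*)+)")

def capture(pattern, text):
    """Return capture group 1 with surrounding whitespace stripped, or None."""
    match = pattern.search(text)
    return match.group(1).strip() if match else None

sample = "Paid Amount $12x34"  # hypothetical malformed input
print(capture(ORIGINAL, sample))  # $12x34
print(capture(ESCAPED, sample))   # $12
```

On well-formed input such as "Paid Amount $1,234.56" both patterns capture the same text. The difference only shows up when some other character sits where the decimal point should be.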

Thank you so much @vrdabberu

@A_Learner

If you found a solution, please mark it as the solution to close the loop.

Happy Automation!!

Thank you. Both are good explanations.
I marked the first answer as the solution.
