Seek Help for Regex to Extract from PDF

Hi all, I would like to extract the invoice number (SA-3077857) and the amount after “DDU MALAYSIA” (which is 2,686.80) from the attached PDF invoice and from the text file produced by the Read PDF Text activity.
I would appreciate help with the regex to extract these values. I am confused because the matches differ between an online regex tester and the Regex Builder (UiPath) when I use the following:

  1. ^NO.: *(?[^\s]+)
  2. ^DDU MALAYSIA +(?[^\s]+)

These two expressions are not my own; I got them as help from #Regex, as I am new to regex.
Any help will be much appreciated.

SM-3077857-1.pdf (69.6 KB)
Invoices.txt (1.1 KB)

Hope this solves your issue.
image (16.1 KB)


Hi Ashley,

Thank you! You have helped me get the first part, the invoice number, exactly as I needed.

  1. ^NO.: *(?[^\s]+)
  2. ^DDU MALAYSIA +(?[^\s]+)

I would appreciate any help on the second part
2. ^DDU MALAYSIA +(?[^\s]+)

to extract “2,686.80”.

Use the below regex to get the amount value:


Senthil V.

^DDU MALAYSIA\s+(?<amount>[\d,\.]+)



If the PDF format is fixed, then you can also use a string split.
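A minimal sketch of the string split idea, written as a standalone VB.NET console program rather than UiPath activities. The sample text and the module name SplitSketch are assumptions for illustration; the real Invoices.txt layout may differ, and this only works if “DDU MALAYSIA” always starts the line that carries the amount.

```vb
Imports System
Imports Microsoft.VisualBasic

Module SplitSketch
    Sub Main()
        ' Hypothetical sample text; the real Invoices.txt layout may differ.
        Dim myText As String = "NO.: SA-3077857" & vbCrLf & "DDU MALAYSIA   2,686.80" & vbCrLf
        Dim label As String = "DDU MALAYSIA"
        Dim amountText As String = Nothing

        ' Split the text into lines, accepting both Windows and Unix line endings.
        Dim textLines As String() = myText.Split(New String() {vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)

        For Each textLine As String In textLines
            If textLine.StartsWith(label, StringComparison.Ordinal) Then
                ' Take the first whitespace-separated token after the label.
                Dim rest As String = textLine.Substring(label.Length)
                Dim tokens As String() = rest.Split(New Char() {" "c, ControlChars.Tab}, StringSplitOptions.RemoveEmptyEntries)
                If tokens.Length > 0 Then amountText = tokens(0)
                Exit For
            End If
        Next

        Console.WriteLine(amountText)   ' 2,686.80
    End Sub
End Module
```

In a UiPath workflow the same logic would likely sit in Assign activities; the regex approach tends to hold up better if the spacing or line order of the PDF text changes.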

Hi msan, the regex

gave full matches when tested online. However, when I implemented these regexes in UiPath, no matches were found.

Ashley11 has mentioned that Regex101 allows creation, debugging and testing for PHP, PCRE, Python, Golang and JavaScript. How do these flavors differ from the regex used in UiPath (or Regex Builder)?


I use the PCRE flavor on regex101 when testing patterns for UiPath. You have to pay attention to the options, however, as the defaults are not necessarily the same. I’ll use Multiline below.
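To illustrate why the options matter, here is a small sketch in plain VB.NET (a console program, not a UiPath workflow). The sample text and the module name MultilineSketch are assumptions. In .NET, without RegexOptions.Multiline the ^ anchor matches only at the very start of the whole string, so a pattern that starts with ^ can fail on any line other than the first. That is one plausible reason a pattern matches in an online tester that has a multiline flag switched on, yet finds nothing in UiPath.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module MultilineSketch
    Sub Main()
        ' Hypothetical sample text: the label is not on the first line.
        Dim myText As String = "INVOICE" & vbLf & "DDU MALAYSIA   2,686.80" & vbLf
        Dim amountPattern As String = "^DDU MALAYSIA\s+(?<amount>[\d,\.]+)"

        ' Default options: ^ anchors only at the start of the whole string.
        Dim plain As Match = Regex.Match(myText, amountPattern)

        ' Multiline: ^ also anchors at the start of each line.
        Dim multi As Match = Regex.Match(myText, amountPattern, RegexOptions.Multiline)

        Console.WriteLine(plain.Success)                  ' False
        Console.WriteLine(multi.Success)                  ' True
        Console.WriteLine(multi.Groups("amount").Value)   ' 2,686.80
    End Sub
End Module
```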


Please consider the following example assignments (all Strings, except Amount, which is a Decimal), with MyText holding the invoice content.

InvoicePattern = "^NO\.:\s+(?<invoice>.+?)\s*$"
AmountPattern = "^DDU MALAYSIA\s+(?<amount>[\d,\.]+)"

Invoice = System.Text.RegularExpressions.Regex.Match(MyText, InvoicePattern, RegexOptions.Multiline).Groups("invoice").ToString

Amount = Convert.ToDecimal(System.Text.RegularExpressions.Regex.Match(MyText, AmountPattern, RegexOptions.Multiline).Groups("amount").Value)
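The assignments above assume that a match is always found. As a hedged sketch of a more defensive version, the console program below checks Match.Success before converting, and parses the amount with the invariant culture so that "2,686.80" is read with a comma as the thousands separator regardless of the robot machine's regional settings. The sample text and the module name AmountSketch are assumptions, not taken from the actual invoice file.

```vb
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module AmountSketch
    Sub Main()
        ' Hypothetical sample text; the real invoice content may differ.
        Dim myText As String = "NO.: SA-3077857" & vbCrLf & "DDU MALAYSIA   2,686.80" & vbCrLf
        Dim amountPattern As String = "^DDU MALAYSIA\s+(?<amount>[\d,\.]+)"

        Dim m As Match = Regex.Match(myText, amountPattern, RegexOptions.Multiline)

        If m.Success Then
            ' NumberStyles.Number allows thousands separators and a decimal point.
            Dim amount As Decimal = Decimal.Parse(m.Groups("amount").Value, NumberStyles.Number, CultureInfo.InvariantCulture)
            Console.WriteLine(amount.ToString(CultureInfo.InvariantCulture))   ' 2686.80
        Else
            ' Without this check, converting the empty group value would throw a FormatException.
            Console.WriteLine("No amount found")
        End If
    End Sub
End Module
```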
