Regex-([A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6} and Fixed Label as anchor

The above regex matches the following:

Invoice No: 12456

I am working with PDF invoices that do not have a fixed position for the invoice number. It can be at the top or bottom, left or right. “Invoice” is a fixed label and I want to use it to locate the number, but I do not want “Invoice” to appear in the results, just the number. Any guidance would be appreciated. Thanks.

How about using the Replace activity on the results?
Input : results
Pattern : "Invoice No: "
Replacement : “”

Thank you. I will give it a try.

Input = the entire scanned PDF?
Pattern to match = ([A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6}, for "Invoice No. 12456"?

The Replace activity is used for replacing text strings,
so the data type of the input is String.

Input : “Invoice No: 12456”
Pattern : “Invoice No:”
Replacement : (Blank)

Output is “12456”
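As a rough illustration of the replace approach above, here is a minimal sketch in Python (the thread itself uses UiPath's Replace activity; Python's `re` module is only a stand-in, and the function name `strip_label` is made up for this example). It also shows what happens if the input is a whole page of text rather than the already-matched string: only the label is removed, and everything else stays.

```python
import re

def strip_label(text: str) -> str:
    # Remove the fixed label and any whitespace right after it.
    # Everything that is not the label is left untouched.
    return re.sub(r"Invoice No:\s*", "", text)

# Input is just the matched string: only the number remains.
print(strip_label("Invoice No: 12456"))

# Input is a whole page: the label is gone, but the other lines remain,
# so this alone does not isolate the number.
print(strip_label("Acme Corp\nInvoice No: 12456\nTotal: 10.00"))
```

So the replace step fits best after the "Invoice No: ..." string has already been extracted from the PDF text.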

See the official guide for the Replace activity.
You can also use a regular expression in this activity,
so if you want to keep only the digits of the invoice number, set the following (with a blank replacement):
Pattern : [^0-9]


Can you try the following?



Does this get entered into the replace activity?

Would this remove the “Invoice No.” string and leave the number?


You can get the result directly as follows.

strResult = System.Text.RegularExpressions.Regex.Match(YourEntireData,"(?<=[A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6}").Value


Does this get entered into an Assign activity?



Yes, as the following image.


If your anchor is fixed as “Invoice No”, perhaps you should use the following.
strResult = System.Text.RegularExpressions.Regex.Match(YourEntireData,"(?<=Invoice\sNo:\s)[0-9]{6}").Value
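A minimal sketch of the same lookbehind idea, written in Python as a stand-in for the .NET call above (the helper name `find_invoice_number` and the sample text are assumptions for illustration). The lookbehind `(?<=...)` requires the label to be present immediately before the digits but keeps it out of the returned value, and because the search scans the whole string, the position of the label on the page does not matter.

```python
import re

# The label is required immediately before the digits but is not
# part of the match, because it sits inside a lookbehind.
PATTERN = re.compile(r"(?<=Invoice\sNo:\s)[0-9]{6}")

def find_invoice_number(text: str) -> str:
    # Mirror .NET's Match(...).Value, which is an empty string
    # when nothing matches.
    m = PATTERN.search(text)
    return m.group(0) if m else ""

top = "Invoice No: 123456\nAcme Corp\nTotal: 10.00"
bottom = "Acme Corp\nTotal: 10.00\nInvoice No: 123456"
print(find_invoice_number(top))
print(find_invoice_number(bottom))
```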


No luck. It is not passing data; the WriteLine output is blank.

It is a difficult one. I have been working on this all day.



How about the following? (Added the IgnoreCase option.)

strResult = System.Text.RegularExpressions.Regex.Match(YourEntireData,"(?<=[A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6}", System.Text.RegularExpressions.RegexOptions.IgnoreCase).Value


strResult = System.Text.RegularExpressions.Regex.Match(YourEntireData,"(?<=Invoice\sNo:\s)[0-9]{6}", System.Text.RegularExpressions.RegexOptions.IgnoreCase).Value
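To show why the ignore-case option can matter for the `[A-Z]` version of the pattern, here is a small sketch in Python (a stand-in for the .NET call; `re.IGNORECASE` plays the role of `RegexOptions.IgnoreCase`, and the sample strings are made up). The class `[A-Z]{7}` only accepts uppercase letters, so a mixed-case label such as "Invoice" is rejected unless case is ignored.

```python
import re

LOOKBEHIND = r"(?<=[A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6}"

# Without the flag, [A-Z] matches uppercase letters only.
CASE_SENSITIVE = re.compile(LOOKBEHIND)
# With the flag, [A-Z] also accepts lowercase letters.
CASE_INSENSITIVE = re.compile(LOOKBEHIND, re.IGNORECASE)

mixed = "Invoice No: 123456"
upper = "INVOICE NO: 123456"

print(CASE_SENSITIVE.search(mixed))            # no match for "Invoice"
print(CASE_SENSITIVE.search(upper).group(0))   # all-caps label matches
print(CASE_INSENSITIVE.search(mixed).group(0)) # flag makes "Invoice" match
```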



Try this

(?<=(Invoice No: ))(.*)


Hi @jmcentee1488

Try this in Assign activity and assign it to a string variable.

System.Text.RegularExpressions.Regex.Match(Str,"(?<=Invoice No:\s)\d+").ToString
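One difference between this pattern and the `[0-9]{6}` versions earlier in the thread is the digit count: `\d+` accepts any number of digits, while `{6}` needs exactly six in a row. The sketch below uses Python as a stand-in for the .NET call (the helper `first_match` is a made-up name) and runs both patterns on the five-digit sample from the question.

```python
import re

SIX_DIGITS = r"(?<=Invoice No:\s)[0-9]{6}"
ANY_DIGITS = r"(?<=Invoice No:\s)\d+"

def first_match(pattern: str, text: str) -> str:
    # Return the matched text, or an empty string when nothing matches
    # (the same behaviour as .NET's Match(...).Value).
    m = re.search(pattern, text)
    return m.group(0) if m else ""

sample = "Invoice No: 12456"  # five digits, as in the original question

print(repr(first_match(SIX_DIGITS, sample)))  # {6} needs six digits
print(repr(first_match(ANY_DIGITS, sample)))  # \d+ takes what is there
```

Whether this was the reason for the blank output reported above is not confirmed in the thread; it is only one plausible explanation.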

For your understanding, I have attached a file (1.5 KB).


you da man…It worked…I can now go to sleep:)

thanks much.


strResult = System.Text.RegularExpressions.Regex.Match(YourEntireData,"(?<=[A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6}", System.Text.RegularExpressions.RegexOptions.IgnoreCase).Value

The above regex works fine when there is no text after the end of the string. It does not work if there is text after the string. I do not want that text. How can I disregard any text following the regex match?


That seems strange. The above regex returns only the 6-digit number even if some text exists after it.

Can you share your string data if possible? Dummy data that reproduces the problem is also OK.

There might be another cause.
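The point about trailing text can be checked with a short sketch (Python as a stand-in for the .NET call; the sample line is dummy data invented for this example). A digits-only pattern stops at the first non-digit, while the `.*` variant suggested earlier in the thread keeps going to the end of the line. If the trailing text was showing up in the result, one possibility, purely an assumption here, is that the `.*` pattern was the one in use.

```python
import re

# Stops after six digits, whatever follows.
DIGITS_ONLY = re.compile(r"(?<=Invoice No:\s)[0-9]{6}")
# Captures everything after the label up to the end of the line.
REST_OF_LINE = re.compile(r"(?<=Invoice No: ).*")

line = "Invoice No: 123456 Due on receipt"

print(DIGITS_ONLY.search(line).group(0))
print(REST_OF_LINE.search(line).group(0))
```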

