Regex-([A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6} and Fixed Label as anchor

The above regex matches the following:

Invoice No: 12456

I am working with PDF invoices that do not have a fixed position for the invoice number. It can be at the top or bottom, left or right. "Invoice" is a fixed label and I want to use it to locate the number, but I do not want "Invoice" to appear in the results, just the number. Any guidance would be appreciated. Thanks.

How about using the Replace activity on the results?
Input : results
Pattern : "Invoice No: "
Replacement : ""

Thank you. I will give it a try.

Input= Entire Scanned PDF?
Pattern to Match=([A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6} for "Invoice No. 12456"?
Replacement=?

The Replace activity replaces text within a string,
so the data type of the input is String.

Input : "Invoice No: 12456"
Pattern : "Invoice No: " (including the trailing space)
Replacement : (Blank)

Output is "12456"

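The official guide is here: https://docs.uipath.com/activities/docs/replace
Note that you can use a regular expression in this activity,
so if you want to keep only the digits of the invoice number, you can set:
Pattern : [^0-9] (with a blank Replacement, this removes every non-digit character)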
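
To illustrate the replace approach outside UiPath, here is a minimal Python sketch. It is only an illustration: the thread uses the UiPath Replace activity, not Python, and the sample string is the one from the example above. It shows two options, removing the fixed label and stripping everything that is not a digit.

```python
import re

# Sample value taken from the example in the thread.
text = "Invoice No: 12456"

# Option 1: remove the fixed label, including its trailing space.
without_label = re.sub(r"Invoice No: ", "", text)

# Option 2: remove every character that is not a digit.
digits_only = re.sub(r"[^0-9]", "", text)

print(without_label)
print(digits_only)
```

Note that option 2 keeps every digit in the input, so it only makes sense on a short string that contains nothing but the label and the number, not on the entire PDF text.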

Hi,

Can you try the following?

System.Text.RegularExpressions.Regex.Match(s,"(?<=[A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6}").Value

Regards,

Does this get entered into the replace activity?

Would this remove the “Invoice No.” string and leave the number?

Hi,

You can get the result directly, as follows.

strResult = System.Text.RegularExpressions.Regex.Match(YourEntireData,"(?<=[A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6}").Value

Regards,

Does this get entered into an Assign activity?

thanks

Hi,

Yes, as in the following image.

[image: img20191121-1]

If your anchor is fixed as "Invoice No", perhaps you should use the following.
strResult = System.Text.RegularExpressions.Regex.Match(YourEntireData,"(?<=Invoice\sNo:\s)[0-9]{6}").Value

Regards,

No luck. It is not passing data. The Write Line output is blank.

It is a difficult one. I have been working on this all day.

thanks

Hi,

How about the following? (I added the IgnoreCase option.)

strResult = System.Text.RegularExpressions.Regex.Match(YourEntireData,"(?<=[A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6}", RegexOptions.IgnoreCase).Value

or

strResult = System.Text.RegularExpressions.Regex.Match(YourEntireData,"(?<=Invoice\sNo:\s)[0-9]{6}" , RegexOptions.IgnoreCase).Value

Regards
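
A minimal Python sketch of why the IgnoreCase option can matter here. This is an illustration only: the thread uses VB.NET expressions in UiPath, and the sample text below is hypothetical. The lookbehind syntax shown is the same in both regex flavors. `[A-Z]{7}` matches only uppercase letters, so it does not match "Invoice" unless case is ignored, and `[0-9]{6}` requires exactly six digits, so a five-digit number such as 12456 does not match either.

```python
import re

# Hypothetical text, standing in for the extracted PDF content.
sample = "Total due in 30 days. Invoice No: 123456 Thank you"

upper_only = r"(?<=[A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6}"
literal = r"(?<=Invoice\sNo:\s)[0-9]{6}"

# Case-sensitive: "Invoice" contains lowercase letters, so there is no match.
m_case_sensitive = re.search(upper_only, sample)

# Case-insensitive: the same pattern now finds the number.
m_ignore_case = re.search(upper_only, sample, re.IGNORECASE)

# A literal label works without the option when the case matches exactly.
m_literal = re.search(literal, sample)

# Six digits are required, so a five-digit number is not matched.
m_five_digits = re.search(literal, "Invoice No: 12456")

print(m_case_sensitive, m_ignore_case.group(), m_literal.group(), m_five_digits)
```

If the real invoice numbers vary in length, a quantifier such as `\d+` may be a better fit than `{6}`.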


Try this

(?<=(Invoice No: ))(.*)

Thanks

Hi @jmcentee1488

Try this in Assign activity and assign it to a string variable.

System.Text.RegularExpressions.Regex.Match(Str,"(?<=Invoice No:\s)\d+").ToString

For reference, I have attached a workflow: Regex_Test.zip (1.5 KB)
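
The same idea as a minimal Python sketch, for readers who want to try the pattern outside UiPath. This is an illustration only: `invoice_number` is a hypothetical helper and the sample inputs are invented. The lookbehind anchors on the fixed label without including it in the result, and `\d+` accepts a number of any length.

```python
import re

def invoice_number(text):
    """Return the digits that follow 'Invoice No: ', or '' if not found."""
    match = re.search(r"(?<=Invoice No:\s)\d+", text)
    return match.group() if match else ""

# Hypothetical inputs: the label may appear anywhere in the text.
print(invoice_number("Invoice No: 12456"))
print(invoice_number("Ship to: ACME\nInvoice No: 1234567 Date: today"))
print(invoice_number("No label in this text"))
```

Returning an empty string when nothing matches keeps the caller simple, but it also means a blank result has to be checked explicitly.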


you da man…It worked…I can now go to sleep:)

thanks much.


strResult = System.Text.RegularExpressions.Regex.Match(YourEntireData,"(?<=[A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6}", RegexOptions.IgnoreCase).Value

The above regex works fine when no text follows the invoice number. It does not work when there is text after the number. I do not want that text. How can I disregard any text following the match?

Hi,

That seems strange. The above regex returns only the 6-digit number even if some text follows it.

Can you share your string data if possible? Dummy data that reproduces the problem is also fine.

There might be another cause.

Regards,
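
A minimal Python sketch that can be used to check this behavior. It is an illustration only: the thread uses VB.NET in UiPath, and all sample strings are hypothetical. The fixed-length pattern returns six digits whether or not text follows, while a greedy `.*` pattern, like the one suggested earlier in the thread, returns everything up to the end of the line. That difference is one thing worth checking, though it is not confirmed as the cause here.

```python
import re

fixed = re.compile(r"(?<=[A-Z]{7}\s[A-Z]{2}:\s)[0-9]{6}", re.IGNORECASE)

# Hypothetical inputs with and without trailing text.
samples = [
    "Invoice No: 123456",
    "Invoice No: 123456 Due on receipt",
    "Invoice No: 1234567890",  # longer number: only the first six digits match
]
results = [fixed.search(s).group() for s in samples]

# A greedy pattern keeps the trailing text on the same line.
greedy = re.search(r"(?<=Invoice No: ).*", samples[1]).group()

print(results)
print(greedy)
```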
