Additional RegEx Help

I have the following sample text. I am trying to pull the vendor (the name on the line after INVOICE), the N C # value, the N C DA date, the number on the line after the Customer ID line, the text after that, and the second dollar amount.

I started with INVOICE\s+(?<vendor>.*\S), but it does not grab the full vendor name.

INVOICE
Intuit [Quickbooks]
N C # 1000146489469
N C DA : 05/13/2020
2800 E. Commerce Center Place,
Tucson, Arizona 85706
Phone:
B
Elroy Taulton

Taulton Enterprise, LLC
1910 Bertram Dr
Mansfield, Texas 76063
817-900-6881
elroy.taulton@gmail.com
Customer ID: 9130348287658756
1

QuickBooks Online
Essentials
$40.00 $40.00
SUBTOTAL
DISCOUNT
$40.00
($28.00)
NET 30
TAX $0.79
6/15/2020
$12.79
Make All Checks Payable To: Intuit [Quickbooks]
THANK YOU FOR YOUR BUSINESS!

INVOICE
CAMERA READY Studios
N C # 12001
N C DA : 05/01/2020
North Dallas Rental
Dallas, Texas 75206
Phone: 214-390-7690
B
Elroy Taulton

Taulton Enterprise, LLC
1910 Bertram Dr
Mansfield, Texas 76063
817-900-6881
elroy.taulton@gmail.com
Customer ID: 9130348287658756
4

Studio Hourly
Rental
$30.00 $120.00
SUBTOTAL
DISCOUNT
$120.00

NET 15
TAX $0.00
5/16/2020
$120.00
Make All Checks Payable To: Camera Ready Studios
THANK YOU FOR YOUR BUSINESS!

Try this piece of regex for the invoice:
(?<=INVOICE\s)(.*\S)
View it on Regex101.com with this link
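
To show what the lookbehind is doing, here is a minimal sketch in Python. Python's re module is only a stand-in for the .NET engine that UiPath uses, and the shortened sample text is an assumption made for illustration.

```python
import re

# Shortened copy of the first sample invoice from the question.
text = "INVOICE\nIntuit [Quickbooks]\nN C # 1000146489469\nN C DA : 05/13/2020\n"

# (?<=INVOICE\s) only allows a match that starts right after "INVOICE"
# plus exactly one whitespace character (here, the line break).
# (.*\S) then takes the rest of that line up to its last non-space character.
vendor_match = re.search(r"(?<=INVOICE\s)(.*\S)", text)
vendor = vendor_match.group(1) if vendor_match else None

print(vendor)
```

Because "." stops at a line break, the capture cannot run past the vendor line.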

N C #:
(?<=N C # )(.*)
Regex 101 Link

NCDA:
(?<=N C DA : )\d+/\d+/20\d+
Regex 101 link

Customer ID:
(?<=Customer ID: )\d+
Regex 101 link
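
The three lookbehind patterns above can be tried together. This is a rough sketch in Python (again only a stand-in for the .NET flavor), and the field names in the dictionary are made up for this example.

```python
import re

# Shortened copy of the second sample invoice from the question.
text = (
    "INVOICE\n"
    "CAMERA READY Studios\n"
    "N C # 12001\n"
    "N C DA : 05/01/2020\n"
    "elroy.taulton@gmail.com\n"
    "Customer ID: 9130348287658756\n"
    "4\n"
)

# Patterns copied from the answer; the dictionary keys are hypothetical labels.
patterns = {
    "nc_number": r"(?<=N C # )(.*)",
    "ncda_date": r"(?<=N C DA : )\d+/\d+/20\d+",
    "customer_id": r"(?<=Customer ID: )\d+",
}

fields = {}
for name, pattern in patterns.items():
    match = re.search(pattern, text)
    # group(0) is the whole match; the lookbehind text is not part of it.
    fields[name] = match.group(0) if match else None

print(fields)
```

Each lookbehind checks the label without including it in the result, so only the value itself is returned.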

The number on the new line after the customer ID:
(?<=Customer ID: )(\d+\n)(\d+)
Regex101 Link
Note: The value you want is in Group 2 of the match.
In an Assign activity, use INSERTVARIABLE(0).Groups(2).ToString
Replace INSERTVARIABLE with your output variable.
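
For comparison, the same group lookup sketched in Python, where group(2) plays the role of Groups(2) in the expression above. The sample text is a shortened assumption.

```python
import re

# Shortened copy of the lines around the Customer ID in the first sample.
text = "elroy.taulton@gmail.com\nCustomer ID: 9130348287658756\n1\n\nQuickBooks Online\n"

matches = list(re.finditer(r"(?<=Customer ID: )(\d+\n)(\d+)", text))

# matches[0] is the first match, like INSERTVARIABLE(0) in the Assign activity.
first = matches[0]
customer_id_line = first.group(1)  # the ID digits plus the line break
quantity = first.group(2)          # the number on the following line

print(quantity)
```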

Not 100% sure what you mean by “the text after that”.
The expected output/s would be helpful.

The second dollar amount… I am not 100% sure, but this might be it:
\$\d+\.\d+\n(?=SUBTOTAL)
Regex101 Link
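
For the dollar amount, the "$" and "." need a backslash in front of them to be read as literal characters. A small Python sketch of the difference, using a shortened copy of the second sample as an assumed input:

```python
import re

# Shortened copy of the line items from the second sample invoice.
text = "Studio Hourly\nRental\n$30.00 $120.00\nSUBTOTAL\nDISCOUNT\n$120.00\n"

escaped = r"\$\d+\.\d+\n(?=SUBTOTAL)"   # literal "$" and literal "."
unescaped = r"$\d+.\d+\n(?=SUBTOTAL)"   # "$" is an end-of-text anchor here

match = re.search(escaped, text)
# The match includes the trailing line break, so strip it off.
amount = match.group(0).strip() if match else None

# Digits can never follow an end anchor, so this search finds nothing.
no_match = re.search(unescaped, text)

print(amount, no_match)
```

Only the amount that sits directly before the SUBTOTAL line is matched, which is the second one on that line.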

Thanks Steven.

That did not give the desired results. I have adjusted the initial string to the following:
INVOICE\s+(?<vendor>.*\s\S[^\n])

Now working on next group that will bring the number after N C #

INVOICE\s+(?<vendor>.*\s\S[^\n])

Is this not what you needed? If not, please provide an expected output sample for each of your requests.

Yes, but I am pretty sure UiPath doesn’t interpret PHP regex the same way. I think it uses the .NET flavor, which has some differences. I could be wrong, but I believe this is correct.

I believe UiPath is built on the .NET Framework, yes. The regex101 patterns show the flavor as PHP, and they always work for me in the UiPath environment, so maybe it’s something else?

Sorry, I don’t know enough to respond fully, but it would explain the differences.

I’m sure someone else who knows more can answer this question…

I did find one issue that was causing some problems. I have been able to grab the first two pieces of information with
INVOICE\s+(?<vendor>.*\s\S[^\n]).*(?<=N C # )(?<invoice>\d+)
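
The idea of one pattern with a named group per field can be sketched as follows. This is not the pattern posted above but a simplified, hypothetical variant that captures the vendor as "everything up to the line break". It is written for Python, which spells named groups (?P<name>...) where .NET uses (?<name>...).

```python
import re

# Shortened copies of both sample invoices from the question.
text = (
    "INVOICE\nIntuit [Quickbooks]\nN C # 1000146489469\nN C DA : 05/13/2020\n"
    "THANK YOU FOR YOUR BUSINESS!\n\n"
    "INVOICE\nCAMERA READY Studios\nN C # 12001\nN C DA : 05/01/2020\n"
)

# Hypothetical combined pattern: vendor is the whole line after INVOICE,
# invoice is the run of digits after the literal "N C # " label.
pattern = re.compile(r"INVOICE\s+(?P<vendor>[^\r\n]+)\s+N C # (?P<invoice>\d+)")

# One dictionary per invoice found in the text.
invoices = [m.groupdict() for m in pattern.finditer(text)]

for invoice in invoices:
    print(invoice["vendor"], invoice["invoice"])
```

Using [^\r\n]+ for the vendor keeps the capture on a single line whether the text uses \n or \r\n line endings.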

I didn’t, sorry.

Are you using Regex within UiPath?

I would try cleaning your string first. There might be some invisible characters, or at least UiPath thinks there are.

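To clean your characters, use an Assign activity with:
System.Text.RegularExpressions.Regex.Replace(INSERTVARIABLE, "[^a-z A-Z 0-9]", "")

This replaces every character with nothing EXCEPT characters in the ranges a-z, A-Z and 0-9, plus the space. Line breaks and symbols such as #, :, / and $ are removed as well, and the patterns above rely on them, so modify the character class as needed.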
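
To see what the cleaning step does, here is a small Python sketch. It assumes the text arrives with Windows line endings (\r\n), which nothing in this thread confirms. The gentler "normalize line endings only" step is a suggestion of this example, not something tested in UiPath.

```python
import re

# Assumed input: the sample text with \r\n line endings.
raw = "INVOICE\r\nIntuit [Quickbooks]\r\nN C # 12001\r\n"

vendor_pattern = r"(?<=INVOICE\s)(.*\S)"

# With \r\n, the single \s in the lookbehind only covers the \r,
# so the vendor pattern finds nothing.
before = re.search(vendor_pattern, raw)

# The cleaning regex from the post: keep only letters, digits and spaces.
cleaned = re.sub(r"[^a-z A-Z 0-9]", "", raw)

# Gentler alternative: change the line endings and nothing else.
normalized = raw.replace("\r\n", "\n")
after = re.search(vendor_pattern, normalized)
vendor = after.group(1) if after else None

print(repr(cleaned))
print(vendor)
```

Printing with repr makes invisible characters such as \r visible, which helps decide what actually needs removing.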

Did cleaning your variable work?

I have made a mega post and think you will find it helpful…

Regex help tutorial MEGAPOST – Making your first Regex post, Reusable Regex Patterns, Regex Troubleshooting and more