I am new to UiPath and regular expressions, and I would really appreciate any help.
Customers send their orders as PDFs, which I convert to text so that I can extract the PO numbers. Each customer has their own way of writing the PO number.
Below is my RegEx:
(P.O. NUMBER|P/O Number|Purchase Order No.|P.O. NUMBER|PURCHASE ORDER NO.|PURCHASE ORDER NO|PO Number|Purchase Order|P.O.|PO\W)(?!.Box|BOX)(?!.Total|TOTAL)(\s?#?:?\s?)(.+)
1st Group – (P.O. NUMBER|P/O Number|...|PO\W) - matches the label in front of the PO number, e.g. "P.O. NUMBER" or "P/O Number".
Lookaheads – (?!.Box|BOX)(?!.Total|TOTAL) - should prevent "PO Box" and "PO Total" from being selected. These are negative lookaheads, so they do not count as capturing groups.
2nd Group - (\s?#?:?\s?) - any special character after the label, as in "PO Number #" or "PO Number :".
3rd Group - (.+) - everything after the special characters, i.e. the actual PO number, e.g. 1234.
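The breakdown above can be checked with a short sketch. It uses Python's re module rather than UiPath's .NET engine, on the assumption that both treat this particular pattern the same way. The sample line is made up for illustration. Note that the engine numbers only three capturing groups, because lookaheads do not capture.

```python
import re

# The pattern from the post, unchanged (split across lines only for readability).
PATTERN = re.compile(
    r"(P.O. NUMBER|P/O Number|Purchase Order No.|P.O. NUMBER|PURCHASE ORDER NO."
    r"|PURCHASE ORDER NO|PO Number|Purchase Order|P.O.|PO\W)"
    r"(?!.Box|BOX)(?!.Total|TOTAL)(\s?#?:?\s?)(.+)"
)

# Negative lookaheads are not capturing groups, so this is 3, not 5.
group_count = PATTERN.groups

sample = "PO Number: 1234"  # hypothetical input line, not from a real order
match = PATTERN.search(sample)
label, separator, value = match.groups()

print(group_count)       # number of capturing groups
print(repr(label))       # the label that was matched
print(repr(separator))   # the special characters after the label
print(repr(value))       # the PO number itself
```

In a UiPath Matches activity, the PO number would therefore be read from group 3.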
The issue is that this regular expression also matches a few unrelated words: PROX / PADUCAH / PAOUCAH.
These words are present in the PDF files, but I am not sure why they are being selected.
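One possible explanation, sketched below in Python (assuming Python's re behaves like the .NET engine here): an unescaped "." matches any character, so the "P.O." alternative also matches four-letter runs such as "PROX" or "PAOU". The word list is taken from the post; the escaped variant is only a suggestion.

```python
import re

UNESCAPED = re.compile(r"P.O.")    # as written in the post: "." means any character
ESCAPED = re.compile(r"P\.O\.")    # suggested variant: "\." means a literal dot

words = ["PROX", "PAOUCAH", "PADUCAH", "P.O."]

# Which words does each variant match at the start of the string?
unescaped_hits = [w for w in words if UNESCAPED.match(w)]
escaped_hits = [w for w in words if ESCAPED.match(w)]

print(unescaped_hits)
print(escaped_hits)
```

In this sketch the correctly spelled "PADUCAH" does not match either variant, so the hits reported for it may come from extracted text that reads "PAOUCAH". Escaping the dots in every alternative that contains "P.O." or "No." should be enough to rule out the other two words.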
Thanks in advance if somebody can help.