Supplier : Deliver To : Invoice To :
ABC House DEF Manor Department 1
123 Street 456 Road Building Name
ATown BCity 789 Terrace
ARegion EF3 4GH Another Street
AB1 2CD BRegion
IJ5 6KL
I am trying to extract the Deliver To address from the extracted text above using the Regex Based Extractor in the Document Understanding activity.
As you can see, the issue I am having is that the delivery address is sandwiched between the supplier and invoice addresses on every line.
Do any Regex experts have an idea of how I can extract the required data from this text sample?
Hi @LewisHenderson, since you mentioned Document Understanding: an address can be easily extracted with the Form Extractor or the Intelligent Form Extractor by selecting that area. I mostly use the Regex Based Extractor for set values.
I can still extract the values you are looking for. Just one question: I believe you have created 3 or 4 variables for this address, right? Address Line 1, Line 2, City, State and Zip?
Thanks for responding. Due to licensing, we are prioritising Regex over other methods at this time.
Currently my extraction has a single address variable, because in the taxonomy I have set the field type to Address (which suits most cases), so the extraction attempts to break that single variable down. But you are right, I could extract each line of the address separately and build the address back up.
Hi @LewisHenderson, here you go. You can use it in the Regex Based Extractor as shown below. Do not click Edit; just add the expression when you open the Regex Based Extractor. Clicking Edit will crash Studio, since there is a bug.
Note: I have tried my best to make it generic, but I am still not sure that this same regex will work on other invoices, because address lines vary a lot (Suite #, Apt #, and so on). Any change in the Supplier address will also affect it.
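For anyone following along, here is a minimal sketch of the line-by-line idea in Python. It is not the expression posted above, only an illustration that assumes the layout of this exact sample: the supplier column contributes two words on the first two address lines and one word on the next two, and the delivery address ends in a UK-style postcode. The function name `extract_delivery_address` and the group names are made up for the example. Python writes named groups as `(?P<name>...)`; in a .NET regex the same groups are written `(?<name>...)`.

```python
import re

# The extracted text from the question, one string with newline separators.
TEXT = """Supplier : Deliver To : Invoice To :
ABC House DEF Manor Department 1
123 Street 456 Road Building Name
ATown BCity 789 Terrace
ARegion EF3 4GH Another Street
AB1 2CD BRegion
IJ5 6KL"""

# Assumption: the word counts per column match this sample exactly.
# Each row skips the supplier words, captures the delivery words,
# then ignores the rest of the row (the invoice column).
PATTERN = re.compile(
    r"Deliver To :.*\n"                       # header row
    r"\S+ \S+ (?P<line1>\S+ \S+) .*\n"        # ABC House | DEF Manor | ...
    r"\S+ \S+ (?P<line2>\S+ \S+) .*\n"        # 123 Street | 456 Road | ...
    r"\S+ (?P<city>\S+) .*\n"                 # ATown | BCity | ...
    r"\S+ (?P<postcode>[A-Z]{1,2}\d[A-Z\d]? \d[A-Z]{2})\b"  # ARegion | EF3 4GH | ...
)


def extract_delivery_address(text):
    """Return the delivery address as one comma-separated string, or None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    parts = (match["line1"], match["line2"], match["city"], match["postcode"])
    return ", ".join(parts)


if __name__ == "__main__":
    print(extract_delivery_address(TEXT))
```

Capturing each line in its own group and joining the groups afterwards is the "extract each line separately and build the address back up" approach discussed earlier. As the note above warns, the pattern breaks as soon as the supplier address has a different number of words on any line.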