Hello all,
I have been experimenting with the Regex Based Extractor component in the Intelligent OCR package, and I have run into strange behavior with certain regex patterns. The extractor does not return the value I expect when the pattern contains a capturing group (a group 1). For example, using
(\d|\.)*(?=(\s|\n)*PLEASE)
on
"143.05
PLEASE PAY FROM
CUSTOMER"
returns a full match of “143.05”
a group1 of “5”
and a group2 of “”
For some reason the Regex Based Extractor always returns the group 1 value rather than the full match, not just in this example but in the other expressions I have tried as well. I can work around this by wrapping the part I want to capture in an extra pair of parentheses, like so:
((\d|\.)*)(?=(\s|\n)*PLEASE)
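To show what I mean outside of UiPath, here is a minimal sketch in Python. Python's re module is only a stand-in here (I am assuming the extractor's engine treats repeated groups the same way), and the newline placement in the sample text is my guess at how the document is read. A capturing group under a `*` keeps only its last repetition, which is why group 1 is "5" in the first pattern and the whole number in the second.

```python
import re

# Sample text from the example above; the newline positions are an assumption.
text = "143.05\nPLEASE PAY FROM\nCUSTOMER"

# Original pattern: the capturing group sits directly under the * quantifier.
original = re.compile(r"(\d|\.)*(?=(\s|\n)*PLEASE)")
# Workaround: an outer pair of parentheses wraps the whole repeated run.
wrapped = re.compile(r"((\d|\.)*)(?=(\s|\n)*PLEASE)")

m1 = original.search(text)
m2 = wrapped.search(text)

print(repr(m1.group(0)))  # '143.05' (full match)
print(repr(m1.group(1)))  # '5' (only the last repetition is kept)
print(repr(m2.group(1)))  # '143.05' (outer group spans the whole run)
print(repr(m2.group(2)))  # '5' (the inner group has moved to group 2)
```

So the full match is the same in both cases; only the numbering and contents of the groups change.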
But the extra parentheses feel more like a band-aid than a real fix. If this were a Matches activity, I could specify exactly which match and group I wanted, but I did not see any option for this in the extractor. Did I miss something, and is there some way to specify which group to extract? I am also wondering whether what I am running into is the intended functionality. Thank you all for any insights!