I am receiving a system-generated PDF and need to pull two different numbers out of each row, but only when the report does not flag that row as an error. I think the best way is probably to use two regular expressions: one to pull out number1 and another to pull out number2.
I am struggling to find a regex that, given the sample rows below, returns only the matches listed after them:
6056810 P 11282017 11272017 00000000 $ 666.26 POLICY MASTER SUPPLEMENT (AP012) MIN-DIST-INDICATOR DOES NOT = 'Y' REFERENCE DATE: 11/28/2017
7061977 P 11282017 11272017 00000000 $ 2,265.09
CV1 AMOUNT TO LOW TO PROCESS MINIMUM DISRTIBUTION
314283 P 11282017 11272017 00000000 $ 3,003.40 REFERENCE DATE: 11/28/2017
Match(0) = 7061977
Match(1) = 314283
The words “CV1 Amount…” or “Reference Date” will always follow the amount, so I am hoping to use them as the anchor and look back the correct number of words to pull out only the numbers I’m looking for. I was able to pull out the $ amount this way by simply using a positive lookahead for that exact text. However, I’m not sure whether it’s possible to look back (n) words. I want to grab the word that is 7 words before the lookahead text. Is that possible?
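To make the "7 words prior" idea concrete, here is a minimal sketch in Python's `re` module. Python is an assumption, since the question does not say which regex engine is in use. Rather than looking back from the anchor, it starts at the number and looks ahead across exactly six more words before requiring the anchor text. The names `SAMPLE`, `ID_PATTERN`, and `find_ids` are hypothetical. The sample is the text from the question, and real PDF extraction may break lines differently.

```python
import re

# Sample rows copied from the question (assumed to be how the PDF text arrives).
SAMPLE = """6056810 P 11282017 11272017 00000000 $ 666.26 POLICY MASTER SUPPLEMENT (AP012) MIN-DIST-INDICATOR DOES NOT = 'Y' REFERENCE DATE: 11/28/2017
7061977 P 11282017 11272017 00000000 $ 2,265.09
CV1 AMOUNT TO LOW TO PROCESS MINIMUM DISRTIBUTION
314283 P 11282017 11272017 00000000 $ 3,003.40 REFERENCE DATE: 11/28/2017
"""

ID_PATTERN = re.compile(
    r"(?<!\S)(\d+)"              # digits that begin a whitespace-delimited word
    r"(?="                       # lookahead: consumes nothing
    r"(?:\s+\S+){6}"             # exactly six more words (P, dates, zeros, $, amount)
    r"\s+(?:CV1\s+AMOUNT|REFERENCE\s+DATE)"  # then one of the two anchors
    r")"
)

def find_ids(text):
    """Return every number that sits seven words before an anchor phrase."""
    return ID_PATTERN.findall(text)

print(find_ids(SAMPLE))  # ['7061977', '314283']
```

The first row is skipped because the seventh word after 6056810 is POLICY, not an anchor. Because `\s+` also matches line breaks, the anchor may sit on the following line, as it does for the CV1 row. The pattern is case-sensitive as written, so `re.IGNORECASE` would be needed if the report's capitalization varies.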