I’m reading a PDF document to find a number that could be 6, 8 or 9 digits long. I use Regex to check for an 8-digit number first; if there isn’t one, it looks for a 9-digit one, then a 6-digit one:
The result kept coming up blank, so I added Write Lines to see which path it was following. Evidently, it thinks there’s an 8-character match, but the result it writes back is blank. In the PDF, there’s a 9-character match, not an 8-character one.
Am I doing something wrong? I don’t understand why, when there’s a 9-character match, it thinks there’s an 8-character one but the result is blank.
If a 9-digit string is present, your first Regex, the one looking for 8 digits, will return the first 8 digits of that 9-digit string.
That in turn causes your IF condition to evaluate as False, because there are matches found.
This ends up with your IF going to Else.
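To illustrate the point about the 8-digit pattern matching inside a 9-digit number, here is a minimal Python sketch. The sample text is made up, and the thread's own workflow is not Python, but the pattern syntax used here is also accepted by .NET's Regex.

```python
import re

# Hypothetical sample text containing a 9-digit number.
text = "Reference 123456789 on page 1"

# Unbounded: \d{8} matches the first 8 digits of the longer run.
loose = re.search(r"\d{8}", text)
print(loose.group())

# Bounded: the lookarounds reject any match with a digit on either side,
# so an 8-digit pattern no longer matches inside a 9-digit number.
strict = re.search(r"(?<!\d)\d{8}(?!\d)", text)
print(strict)
```

With the bounded version, the 8-digit check finds nothing for this text, so a 9-digit check can be tried next.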
I have a question first, because it can all be simplified if the number occurs only once.
For example, this will find either 6 digits, 8 digits or 9 digits and it will return it as a single match:
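A minimal Python sketch of that idea, assuming the number is never part of a longer run of digits. The pattern and the `find_number` helper are illustrative, not necessarily the exact expression posted in this thread.

```python
import re
from typing import Optional

# Try 9 digits, then 8, then 6; the lookarounds make sure the match
# is not a slice of a longer digit run.
NUMBER_PATTERN = re.compile(r"(?<!\d)(?:\d{9}|\d{8}|\d{6})(?!\d)")

def find_number(text: str) -> Optional[str]:
    """Return the first 6-, 8- or 9-digit number in text, or None."""
    match = NUMBER_PATTERN.search(text)
    return match.group() if match else None

print(find_number("Ref 123456789 issued"))  # 9-digit number is returned whole
print(find_number("Ref 1234567 issued"))    # 7 digits: no match, so None
```

Because a single pattern covers all three lengths, only one check for "match found or not" is needed afterwards.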
A match might not always be found though, as sometimes the PDF won’t have the correct number on there so I’d still need an IF statement thrown in there.
Also, the match (if in the PDF) will always be formatted like this - XXX/XX00/NUMBERHERE/0000000 - would there be a way to narrow down the search so it looks for the number in between /'s?
One tiny thing I can’t work out, though: in XXX/XX00/NUMBERHERE/123456, the number it’s looking for is “NUMBERHERE”. How would I get it to look between two slashes rather than after just one?
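One way to require a slash on both sides is a lookbehind plus a lookahead. The sketch below is in Python, with a hypothetical `extract_number` helper and made-up sample strings; it assumes the wanted number is always followed by another slash, as in the format described above. It returns `None` when nothing matches, which covers the case where the PDF does not contain the number.

```python
import re
from typing import Optional

# (?<=/) requires a slash immediately before the digits,
# (?=/) requires a slash immediately after them.
BETWEEN_SLASHES = re.compile(r"(?<=/)(?:\d{9}|\d{8}|\d{6})(?=/)")

def extract_number(text: str) -> Optional[str]:
    """Return the 6-, 8- or 9-digit number that sits between two slashes."""
    match = BETWEEN_SLASHES.search(text)
    return match.group() if match else None

result = extract_number("ABC/DE00/123456789/123456")
if result is None:
    print("No number found")
else:
    print(result)
```

The trailing `123456` is skipped here because nothing follows it, so the lookahead for a closing slash fails.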