Problems with OCR - Now trying to compare two words char by char


Long story short, I’m trying to read PDF files that contain only pictures. Read PDF doesn’t give any output, so I tried OCR: first Tesseract, then Microsoft, and I ended up using Google Vision OCR as it gives me the best result. But when scanning car VIN numbers, 0 is usually recognized as O and vice versa.

So I figured that if I compared the two VIN strings character by character, I could end up with 14, 15 or maybe 16 characters that sit in the same position in both strings, giving me a match likely enough to proceed.
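One way to reduce the 0/O mismatches before comparing is to normalize both strings first. Modern 17-character VINs do not use the letters I, O or Q, so mapping them to the digits they resemble is usually safe. The sketch below is in Python rather than a workflow, and the names `OCR_FIXES` and `normalize_vin` are made up for illustration:

```python
# Hypothetical lookup: letters that are not valid in a 17-character VIN,
# mapped to the digit the OCR most likely misread.
OCR_FIXES = str.maketrans({"O": "0", "Q": "0", "I": "1"})


def normalize_vin(text: str) -> str:
    """Trim whitespace, upper-case, and replace look-alike letters with digits."""
    return text.strip().upper().translate(OCR_FIXES)
```

Running both the OCR result and the reference VIN through the same function means an O read in place of a 0 no longer counts as a mismatch.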

I’m trying to compare these two strings


As the format is always a combination of letters and numbers, but in a very different order each time, I guess that OCR will always be somewhat inaccurate.

So I tried doing this:

Convert the two strings into arrays.
Do a For Each on string 1, looping through the letters one by one.
In the loop, do an If: when the letter matches the array element of the second string at the same position, add 1 to Match.
Finally, check Match against the length of the first string, giving me a % match.
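For reference, the steps above can be sketched in a few lines of Python. This is only an illustration of the counting logic, not the workflow itself, and `position_match` is a made-up name:

```python
def position_match(first: str, second: str) -> float:
    """Fraction of positions in `first` whose character equals the
    character at the same index in `second`."""
    if not first:
        return 0.0
    # zip stops at the shorter string, so extra characters never match.
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return matches / len(first)


# Made-up example: the two zeros were read as the letter O.
score = position_match("1HGCM82633A004352", "1HGCM82633AOO4352")
print(f"{score:.0%}")
```

Dividing by the length of the first string mirrors the last step above, so a second string that is too short simply lowers the score.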

But my loop through array 2 isn’t behaving as expected. Indexing with a variable, StringArr(int iteration), is not working, but StringArr(0), StringArr(1) and StringArr(2) work fine.


Do you get any exception such as “Index was outside the bounds of the array”?


No. I put in two Write Lines to check the output of the two array items. The first array is looped through fine, but the second always returns StringArr(0) as the output.


Can you check the scope of ArrayIndex in the Variables panel?
You probably need to widen its scope (not the Body but the outer Sequence).
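As a rough analogy in Python (not how the workflow is actually built): a counter that is created inside the loop body starts from 0 again on every iteration, so the second array is always read at index 0. The function names here are made up for illustration.

```python
def count_matches_narrow_scope(first: str, second: str) -> int:
    """Mimics an index scoped to the loop body. Assumes `second` is non-empty."""
    matches = 0
    for ch in first:
        index = 0  # re-created every iteration, so it is always 0 here
        if ch == second[index]:
            matches += 1
        index += 1  # this increment is lost when the next iteration starts
    return matches


def count_matches_wide_scope(first: str, second: str) -> int:
    """Mimics an index scoped to the outer sequence."""
    matches = 0
    index = 0  # created once, outside the loop, so it keeps its value
    for ch in first:
        if index < len(second) and ch == second[index]:
            matches += 1
        index += 1
    return matches
```

With "ABC" compared against "ABC", the first version only ever checks against "A", while the second walks through both strings together.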


You are the man :) That did the trick. I changed the scope to be the entire Sequence instead of the Body, so now it applies to the If too.

Thanks for the help.
