Relative screen scraping does not return the correct string


#1

When I try to get text using the relative screen scraping method, I don’t get the correct string. For example:

In the image above, I am trying to get the value “moe001”, but the scraping method returns “moeOOl”: it reads each 0 as an uppercase O and the 1 as a lowercase l, so 001 comes back as OOl. The same relative screen scrape works correctly when I retrieve the text “moerb499”.
Any suggestions would be appreciated.


#2

Hi,

(Note: I don’t use Microsoft OCR or have it installed, so the available options might differ a little. This suggestion is based on the GoogleOCR engine.)

If your usernames consist only of lowercase letters and numbers, you could set AllowedCharacters to “abcdefghijklmnopqrstuvwxyz0123456789” (optionally combined with a denied-characters string for special characters that look too similar to letters or numbers). This should help with the O -> 0 and | -> 1 confusions. A remaining issue might be that | is read as l (lowercase L). In that case, try changing the scale, and if that doesn’t help, I’d add a conditional that checks the last character and replaces it, provided you know it is always a number.
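A minimal sketch of that last-character check, written in Python purely for illustration (inside a workflow the same logic would be an If/Assign on the scraped string). The function name and the look-alike mapping are my own assumptions, and it only makes sense if the value always ends in a digit:

```python
# Characters that OCR tends to confuse with digits.
# Assumed mapping for illustration; extend it for your own data.
DIGIT_LOOKALIKES = {"O": "0", "o": "0", "l": "1", "I": "1", "|": "1"}


def fix_last_char(text: str) -> str:
    """Replace the final character with its digit look-alike, if it has one.

    Assumes the value is known to end in a digit.
    """
    if not text:
        return text
    last = text[-1]
    return text[:-1] + DIGIT_LOOKALIKES.get(last, last)


print(fix_last_char("moe00l"))    # moe001
print(fix_last_char("moerb499"))  # moerb499 (already ends in a digit, unchanged)
```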

If the usernames follow an even more consistent format, you could check them with regex/string operations and replace characters accordingly. That would require some kind of translation table, but it should also work as a reusable component if you have other projects using OCR in the future.
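As a rough sketch of the translation-table idea, in Python for illustration: it assumes every username is lowercase letters followed by exactly three digits (a guess based on “moe001” and “moerb499”; adjust it to your real format). The names `TO_DIGIT`, `SUFFIX_LEN` and `normalize` are hypothetical.

```python
import re

# Assumed format: lowercase letters followed by exactly SUFFIX_LEN digits.
SUFFIX_LEN = 3
PATTERN = re.compile(r"[a-z]+[0-9]{%d}" % SUFFIX_LEN)

# Translation table: letter-like characters that should be digits in the suffix.
TO_DIGIT = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "|": "1"})


def normalize(raw: str) -> str:
    """Fix digit look-alikes in the numeric suffix, then validate the result."""
    prefix, suffix = raw[:-SUFFIX_LEN], raw[-SUFFIX_LEN:]
    fixed = prefix + suffix.translate(TO_DIGIT)
    if not PATTERN.fullmatch(fixed):
        # Still not in the expected format: flag it rather than guess.
        raise ValueError(f"unexpected OCR result: {raw!r}")
    return fixed


print(normalize("moeOOl"))    # moe001
print(normalize("moerb499"))  # moerb499
```

Only the suffix is translated, so a genuine lowercase l or o in the letter part is left alone. Anything that still fails the pattern raises an error instead of being silently "corrected".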

A completely different approach would be to take a screenshot of that region instead of running OCR on it directly, save it to an image file, and apply a filter (e.g. using any open-source image manipulation library) to remove the grey background. That would probably be overkill, though, so I wouldn’t suggest going that route unless everything else fails.
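To show what such a filter does, here is a tiny thresholding sketch in plain Python. It works on a nested list of grayscale values (0 to 255) rather than a real image file, and the threshold of 128 and the sample values are made up; in practice you would load the screenshot with an image library and pick the threshold by experiment.

```python
def binarize(pixels, threshold=128):
    """Map each grayscale pixel (0-255) to pure black (0) or pure white (255)."""
    return [[0 if value < threshold else 255 for value in row] for row in pixels]


# Made-up 2x3 patch: dark text (about 40) on a grey background (about 180).
patch = [
    [180, 40, 180],
    [180, 40, 180],
]

print(binarize(patch))  # [[255, 0, 255], [255, 0, 255]]
```

The grey background becomes plain white and the text becomes plain black, which usually gives the OCR engine cleaner character shapes to work with.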


#3

Hi,

I tried all the other scales too, but it works for some words and not for others. I then tried adding logic to replace ‘O’ and ‘l’ with ‘0’ and ‘1’ respectively.

My logic worked and I got the value I wanted. But now I am seeing something else strange: the relative scraping method reads ‘U’ as ‘LJ’, lowercase L ‘l’ as capital I ‘I’, and ‘O’ as ‘Q’. All of these issues occur only when I use the relative scrape method in Citrix. What should I do in this case?

Please also tell me where to set the AllowedCharacters property.