Need ideas on how to handle selecting the highest 2 digit number, using OCR etc

I'm building an automation in a Citrix environment where we cannot use the UiPath connector, so everything must be done with images, OCR, etc.

Most of it is easy. Clicking and typing. But there is this one window…


In that box in the lower right, there could be one row or multiple rows. Each row starts with a 2-digit number. I have to click the row with the highest 2-digit number (they're always ordered top down). So if there are rows starting with 01, 02, 03, 04, and 05, I have to click the 05 row.

There could be up to 99 rows which means I also have to deal with there being a scrollbar if it goes over (I think) 5 rows.


So far the only idea I have is a loop where I use Get OCR Text on the last visible row and, if it's not blank, click the scroll bar and check again. At some point the bottom visible row of that box will be blank, and we'll know we are scrolled all the way down. Then I move up one row at a time, checking the text, until I get a non-blank value, and click there.
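A rough sketch of that loop in Python, just to pin down the logic (the real thing would be UiPath activities). `read_row` and `scroll_down` are hypothetical stand-ins for "Get OCR Text on row slot N" and "click the scroll bar", and `visible_rows=5` is my guess from above. I also added one guard that isn't part of the original idea: stop if the bottom row's text does not change after a scroll, in case the box never shows a blank row at the bottom.

```python
def find_last_row_slot(read_row, scroll_down, visible_rows=5, max_scrolls=99):
    """Scroll to the bottom, then return the visible slot (0-based) of the
    last non-blank row, or None if the box is empty.

    read_row(slot) -> str and scroll_down() are hypothetical callables that
    wrap the OCR read and the scroll-bar click.
    """
    bottom = visible_rows - 1
    previous = None

    # Keep scrolling while the bottom slot still shows text.
    for _ in range(max_scrolls):
        text = read_row(bottom).strip()
        # Blank bottom row: we are past the last entry.
        # Unchanged text: the last scroll did nothing, so we are at the end.
        if not text or text == previous:
            break
        previous = text
        scroll_down()

    # Walk upward from the bottom slot until a non-blank row is found.
    for slot in range(bottom, -1, -1):
        if read_row(slot).strip():
            return slot
    return None
```

The returned slot is the row to click. The `max_scrolls` cap keeps the loop from running forever if OCR keeps returning noise.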

Hi @postwick,

Not sure if this is of any help, as it is more of a shot in the dark.


Here I would just do some string manipulation from the OCR output. First split at newline and thereafter split at space and get the first element in the array, which will be the digit. This is the easy part.
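To make that concrete, here is a minimal sketch in Python (in UiPath you would do the same with string methods in an Assign activity). It assumes the OCR output is one string with one row per line, and the function name `highest_row` and the sample row text are made up for illustration.

```python
def highest_row(ocr_text):
    """Return (line_index, number) for the line with the highest leading
    2-digit number, or None if no line starts with one."""
    rows = []
    # Split at newline, then split each line at whitespace.
    for index, line in enumerate(ocr_text.splitlines()):
        tokens = line.split()
        if not tokens:
            continue  # blank line from the OCR output
        first = tokens[0]
        # Keep only tokens that are exactly two ASCII digits, e.g. "05".
        if len(first) == 2 and first.isascii() and first.isdigit():
            rows.append((index, int(first)))
    if not rows:
        return None
    # If a number appears twice, max() returns the first occurrence.
    return max(rows, key=lambda row: row[1])
```

For example, `highest_row("01 Room fee\n02 Supplies\n\n05 Lab\n")` returns `(3, 5)`: the highest number is 5, found on the fourth OCR line. The line index tells you which visible row to click.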


Here you could minimize the number of scrollbar clicks if the application layout can be changed so that the bottom right panel (Facility Fees) is enlarged or rearranged in the GUI. Then the OCR can read more lines before scrolling down.
If this is not possible, then your scrolling approach may be the only sane option. Another question to ask the business user is whether the numbers can contain duplicates, for example,


Hidden application shortcuts

Last but not least, the system you are interacting with may well have obscure keyboard shortcuts. So do try out some shortcut options to see if you can open the Facility Fees panel as its own tab or window, or find a way to show all the text it contains. Anything that helps you get a better OCR extraction (for example, zooming the text) should help in this scenario.


The OCR is luckily very accurate on that box, so finding the highest displayed number turned out to be pretty simple.

The next trick will be handling scrolling, but first I have to find a test record that has enough lines in that box.

It’s not possible to rearrange the screen in any way.