What a Robot Sees: Using OCR in RPA

In our post “How to Train Your Robot,” we briefly touched on the basic ideas behind robotic process automation (RPA) and how UiPath Desktop makes RPA easy for anyone to use.  It’s worth digging a little deeper into some of the ways UiPath brings automation within reach of any computer user, whether you’re a coding master or a self-proclaimed novice.

Perhaps the most useful and versatile tool in the UiPath platform is the Record function. Rather than mapping out a process step-by-step, which can be very time-consuming, any UiPath user can teach a robot to “do as I do” by recording the process as it happens.  The software robot follows your clicks and actions on the screen (aka the presentation layer) and then turns them into an editable workflow.  If you’re working entirely in local programs, that’s as much as you’d need to know.  It’s when accessing remote systems, such as applications delivered through Citrix or the open web, that a UiPath robot can really show off its abilities.

With remote applications, UiPath’s Record function has a difficult time distinguishing things like buttons and text fields.  A remote session typically reaches your machine as a picture of the screen rather than as individual controls, so the whole application window looks like one big button to the robots.  However, these robots are equipped with optical character recognition (OCR), which allows a computer to distinguish a ‘B’ from a ‘D’, for example, even if the size or font is different.  While recording, a UiPath user can run OCR and select the appropriate text within the window, and the robot will be able to locate that text every time afterward.  This works even if the text later appears in a different place, which makes OCR a much more reliable way to automate remote applications than relying on fixed screen positions.
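
To make the “different place” point concrete, here is a minimal Python sketch of the general idea, not UiPath’s actual implementation. It assumes an OCR engine has already returned a list of recognized words with bounding boxes; the `OcrWord` type, the `find_text` helper, and the sample coordinates are all hypothetical names and values invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class OcrWord:
    """One recognized word and its bounding box in pixels (hypothetical format)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def find_text(words: List[OcrWord], phrase: str) -> Optional[Tuple[int, int]]:
    """Return the center (x, y) of the first run of words matching `phrase`, or None."""
    target = phrase.lower().split()
    if not target:
        return None
    for start in range(len(words) - len(target) + 1):
        run = words[start:start + len(target)]
        if [w.text.lower() for w in run] == target:
            # Merge the boxes of the matched words into one box, then take its center.
            left = min(w.left for w in run)
            top = min(w.top for w in run)
            right = max(w.left + w.width for w in run)
            bottom = max(w.top + w.height for w in run)
            return ((left + right) // 2, (top + bottom) // 2)
    return None


# The same label shows up at different positions in two made-up screenshots.
screen_a = [OcrWord("File", 10, 5, 30, 12), OcrWord("Submit", 200, 300, 60, 20)]
screen_b = [OcrWord("Submit", 40, 80, 60, 20), OcrWord("File", 10, 5, 30, 12)]

click_a = find_text(screen_a, "submit")
click_b = find_text(screen_b, "submit")
```

Because the lookup is driven by the recognized text rather than by stored coordinates, the click point simply follows the label wherever it appears.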

And it’s not just text that UiPath can recognize, but also images.  Again, in remote applications, everything can look the same to an RPA robot, but UiPath solves this problem with excellent image recognition software.  You simply indicate the image you want your robot to identify in the application window, like a “Create expense report” button, and no matter where it appears on the screen in later processes, the UiPath robots can find it.
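
One simple way to picture image recognition of this kind is template matching: slide the small image over the screenshot and keep the position where the pixels agree best. The Python sketch below is only an illustration under that assumption; the source does not say which technique UiPath uses, and `find_template`, `screen_with_button`, and the tiny pixel grids are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale pixel values, row by row


def find_template(screen: Image, template: Image, max_diff: int = 0) -> Optional[Tuple[int, int]]:
    """Return (row, col) of the top-left corner of the best match, or None.

    Each position is scored by the sum of absolute pixel differences;
    positions scoring above `max_diff` are rejected.
    """
    th, tw = len(template), len(template[0])
    sh, sw = len(screen), len(screen[0])
    best = None
    best_score = max_diff + 1
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            score = sum(
                abs(screen[r + i][c + j] - template[i][j])
                for i in range(th)
                for j in range(tw)
            )
            if score < best_score:
                best, best_score = (r, c), score
    return best


button = [[9, 9], [9, 1]]  # stand-in for a small button image


def screen_with_button(row: int, col: int) -> Image:
    """Build a blank 4x5 screen and paste the button at (row, col)."""
    screen = [[0] * 5 for _ in range(4)]
    for i, line in enumerate(button):
        for j, pixel in enumerate(line):
            screen[row + i][col + j] = pixel
    return screen


first = find_template(screen_with_button(0, 0), button)
moved = find_template(screen_with_button(2, 3), button)
```

Raising `max_diff` above zero would tolerate small rendering differences, at the cost of a higher risk of false matches.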

If you’d like to see OCR in action, watch this tutorial video.  UiPath can do a lot more than just recognize letters and numbers! 

Sophisticated character and image recognition software is really at the heart of why RPA works today.  You could say that robots have become more perceptive in recent years, though we’re still years away from robots that can make complex decisions based on those perceptions.  Then again, similar perception software has helped make self-driving cars at Google a reality, so maybe that future is not so far away.

This is a companion discussion topic for the original entry at https://www.uipath.com/blog/automation/what-a-robot-sees-using-ocr-in-rpa