How to OCR this accurately

sample_font_OCR

I am automating a Citrix app where I must use OCR to recognize and scrape text from the screen.

In the past I have used both Microsoft and Google OCR with UiPath - but for this font - I can’t get ANY kind of accuracy.

I haven’t used OCR for a few months - and now the scrape wizard no longer shows Microsoft and when I try to use Microsoft engine - I get an error.

Even with Google - there are several undocumented options - “Screen”, “Legacy”, “Scan” etc - these do all kinds of bizarre things to the OCR - with no explanation. I tried using the scale attribute of the OCR engine - but that also does not help with this font.

What are my options?? Will ABBYY do a better job? Do I have to write my own OCR engine?? How do I call a private OCR engine??

I get 100% accuracy on your example image with the Microsoft engine and a scaling factor of 2. The “Profile” option has the following help text in Studio:

Choose a preprocessing profile for the specified image or UI element to achieve a better OCR read. The following options are available:

  • None - does not apply a preprocessing profile;
  • Screen - preprocessing suitable for remote desktop applications;
  • Scan - preprocessing suitable for scanned files;
  • Legacy - uses the engine’s default settings for preprocessing images, this is the default option.

I used Legacy. Screen gave me the worst accuracy on your image, but going by the help text above it may still be worth trying Screen when working directly in Citrix.
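To make the "scaling factor of 2" concrete: the idea is simply that the engine gets an image with twice as many pixels per glyph in each direction. The thread does not say how UiPath resamples, so the sketch below uses nearest-neighbour scaling purely for simplicity; the function name `upscale` and the tiny 2x2 "glyph" are made up for illustration.

```python
def upscale(pixels, factor):
    """Nearest-neighbour upscale of a 2D grid of grayscale values.

    Assumption: this only illustrates what a scale factor means;
    it is not how any particular OCR engine resamples.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    scaled = []
    for row in pixels:
        # Repeat every pixel `factor` times horizontally...
        wide = [value for value in row for _ in range(factor)]
        # ...and repeat the widened row `factor` times vertically.
        scaled.extend(list(wide) for _ in range(factor))
    return scaled


# A hypothetical 2x2 grayscale patch (0 = black, 255 = white).
glyph = [
    [0, 255],
    [255, 0],
]
big = upscale(glyph, 2)
for row in big:
    print(row)
```

With a factor of 2 each source pixel becomes a 2x2 block, so thin strokes in a small screen font end up wide enough for the engine to segment.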

However, I can’t reproduce your issue with the scraping wizard, could you post the exact error text?

Thanks … I used to have Microsoft as an option in the scrape wizard but haven’t used it in many months … now Microsoft is simply not in the drop-down list for the OCR engine.

Can someone please explain what happened to the Microsoft OCR engine in UiPath???

Several months ago - I did a test project which needed OCR. Back then UiPath provided an option in the wizard to select either Google or Microsoft.

That is no longer the case. Now there is only one option: Google

When I tried just dragging the Microsoft OCR engine into the OCR activity - I just got an exception.

Out of desperation - I tried using ABBYY cloud - that does work - but they charge for each OCR. I need to send small snippets containing a single 6-digit number - but ABBYY charges for an entire document for just one snippet.
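Since each snippet is supposed to be exactly one 6-digit number, whatever engine ends up being used, the result can at least be sanity-checked before it is trusted. A minimal sketch, assuming the usual letter/digit look-alikes are the main misreads; the substitution table and the name `clean_six_digits` are illustrative, not taken from any engine:

```python
import re

# Assumed look-alike substitutions; adjust after inspecting real misreads.
CONFUSIONS = str.maketrans(
    {"O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "B": "8"}
)


def clean_six_digits(raw):
    """Return the 6-digit string from raw OCR text, or None if it cannot be trusted."""
    # Drop all whitespace, then map look-alike letters to digits.
    candidate = re.sub(r"\s+", "", raw).translate(CONFUSIONS)
    # Accept only exactly six ASCII digits; anything else is rejected.
    return candidate if re.fullmatch(r"[0-9]{6}", candidate) else None


print(clean_six_digits(" 12O45l "))
print(clean_six_digits("12345"))
```

Returning `None` rather than a best guess lets the workflow retry with another profile or scale instead of silently passing on a wrong number.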

What about Google and Microsoft Cloud OCR??

What are my options here - is Microsoft no longer free?? Where can I get information on that???

Unfortunately, I’m not sure. As I said, I had no issue using Microsoft OCR; this applies to the current trial and community versions. I’ve now put your topic in the #issues category. To help resolve your issue, could you post the complete error you get when trying to use Microsoft OCR?

If you’re not on Win 10 you’ll need to install it:

Heh, then it is a How To after all. :stuck_out_tongue: Thanks, I am on Win 10 and was unaware of this.

I followed the link … downloaded SharePoint Designer and tried to install it. I get a one-line message from the installer:

THE INSTALLATION OF THIS PACKAGE FAILED.

That’s it, nothing more.

What about Microsoft Cloud OCR??? How do I use that??

What about OCR.Space - how do I use that?

If I have to - I will build my own OCR - how do I connect that to UiPath???

For Microsoft Cloud OCR you need to register for Microsoft Cloud Services and request an API key for OCR from Microsoft, then use that API key to configure the activity.
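For anyone who wants to check the key outside UiPath first, here is a rough sketch of what a direct call to Microsoft's cloud OCR looks like. It assumes the Azure Computer Vision REST endpoint with `v3.2` in the path and the `regions` / `lines` / `words` response layout; the version your resource supports may differ, and the endpoint and key values are placeholders. The helper names `build_ocr_request` and `extract_text` are made up for this example.

```python
import json
import urllib.request


def build_ocr_request(endpoint, api_key, image_bytes):
    """Build (but do not send) a POST request carrying raw image bytes."""
    # Assumption: the resource exposes the v3.2 OCR operation at this path.
    url = endpoint.rstrip("/") + "/vision/v3.2/ocr"
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={
            "Ocp-Apim-Subscription-Key": api_key,  # the key from your Azure resource
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )


def extract_text(response_json):
    """Join the recognised words of each line; one output line per OCR line."""
    lines = []
    for region in response_json.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(word["text"] for word in line.get("words", [])))
    return "\n".join(lines)


# Sending it would look like this (needs a real endpoint, key and image):
#   req = build_ocr_request("https://<your-resource>.cognitiveservices.azure.com",
#                           "<your-key>", open("snippet.png", "rb").read())
#   with urllib.request.urlopen(req) as resp:
#       print(extract_text(json.load(resp)))
```

If this returns the right digits for a snippet, the same key should be usable in the activity's API key field.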

If you want to build your own OCR, you can create a custom activity and use that in UiPath Studio. Here is a guide on how to do that: https://www.uipath.com/kb-articles/how-to-create-a-custom-activity

How do I get a Microsoft Cloud API key for OCR?

When I search for Microsoft cloud ocr - I find ocr.space - can I use that instead ???

Does UiPath document ANYTHING???

Thanks … but the question was not how to create a custom activity - but how to interface to my own OCR engine.

Where is the documentation on the OCR engine interface?

How do I get a Microsoft Cloud OCR API key??