Image Scraping using NewOCR


#1

Hi,

Found this free online OCR (http://www.newocr.com/), which extracted text from my images with 100% accuracy. By comparison, UiPath’s internal Google OCR had accuracy issues when extracting numbers, strings, and special characters. Could you consider including its API as part of UiPath for better results? Also, let me know if there is a way to use the API available on the OCR’s website from within UiPath.

Also, I would like to know how to enable the Google Cloud OCR engine.

Thanks!
Sumit


#2

Hi,
First of all, no OCR engine is 100% accurate.
No hard feelings, pal, but please check this list. All of the top 10 are paid software and still not 100% accurate, and you want to compare them with a freeware site (http://www.newocr.com/). :smiley:

Regarding the Google Cloud OCR activity:
You need an API key to access it.
Please follow this tutorial to get an API key:
https://cloud.google.com/functions/docs/tutorials/ocr


#3

Yeah, absolutely agreed, I take my words back… It worked for me, though. I’m now looking for ways to enable OCRs like Abbyy and Google Cloud OCR in order to leverage the capabilities that UiPath provides. Looking forward to suggestions on how to enable paid OCRs, especially Abbyy.


#4

Hey Sumit, you mentioned that you compared NewOCR with our own OCR engines and got better results. Could you please post some samples of the documents you tested with? We’d like to test them ourselves; who knows, there might be something there.


#5

Hi Cosin, the image that I tested was official and confidential, so I won’t be able to share it. However, it was of poor quality (210 dpi), and I tested it with three applications, with the following results:

  1. UiPath built-in Google OCR: couldn’t detect the image properly (~80% accuracy).
  2. NewOCR: accuracy was high (>95%), but the output was plain text, so automation couldn’t be achieved. Only the online version was free; when I explored the API, it was a pay-per-use model.
  3. Abbyy FineReader 14: >95% accuracy, with scope for improvement through pattern training.

I would recommend trying it out with low-quality images, especially for Citrix environments and applications.
I was interested in how I could use Abbyy within UiPath and found a useful thread on the forum itself. I will try it soon and share the results.
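
The post does not say how the accuracy percentages were measured. One common way, sketched here purely as an assumption, is character-level accuracy: one minus the edit distance between the OCR output and the known-correct text, divided by the length of the correct text. The function names and the sample strings are illustrative only.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def character_accuracy(expected, ocr_output):
    """Fraction of the expected text that the OCR got right (0.0 to 1.0)."""
    if not expected:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(expected, ocr_output)
    return max(0.0, 1.0 - errors / len(expected))


# Illustrative strings only: the OCR confused two zeros with the letter O.
score = character_accuracy("Total: 1,250.00", "Total: 1,25O.0O")
print(f"{score:.1%}")
```

Running the same reference text against each engine's output would give figures that can be compared directly.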

#6

You can create a custom activity that makes an API call to WhateverOCR.
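
As a rough sketch of what such an API call could look like: the endpoint URL, the `file` form field, the bearer-token header, and the `{"text": ...}` response shape below are all assumptions for illustration, not the real API of NewOCR or any other service, so check the provider's documentation for the actual contract. A UiPath custom activity would be written in .NET; Python is used here only to show the shape of the request.

```python
import json
import urllib.request
import uuid


def build_ocr_request(image_bytes, api_key,
                      endpoint="https://ocr.example.com/v1/recognize"):
    """Build a multipart/form-data POST for a hypothetical OCR service."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        b"--" + boundary.encode() + b"\r\n",
        b'Content-Disposition: form-data; name="file"; filename="image.png"\r\n',
        b"Content-Type: application/octet-stream\r\n\r\n",
        image_bytes,
        b"\r\n--" + boundary.encode() + b"--\r\n",
    ])
    headers = {
        "Content-Type": "multipart/form-data; boundary=" + boundary,
        "Authorization": "Bearer " + api_key,  # assumed auth scheme
    }
    return urllib.request.Request(endpoint, data=body, headers=headers,
                                  method="POST")


def parse_ocr_response(raw_json):
    """Extract the text from an assumed {"text": "..."} JSON response."""
    return json.loads(raw_json).get("text", "")


def ocr_image(path, api_key):
    """Send an image file to the (hypothetical) service and return its text."""
    with open(path, "rb") as f:
        request = build_ocr_request(f.read(), api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_ocr_response(response.read().decode("utf-8"))
```

Keeping the request building and response parsing separate from the network call makes it easier to swap in a different OCR provider later.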


#7

If Google OCR is failing to find the image, you might need a different accuracy parameter. If it’s inconsistent because the size of the text changes, you might need a dynamic accuracy that is adjusted whenever a match fails.

But, yeah, most OCRs are not that great if the text is too pixelated.
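
The dynamic accuracy idea could be sketched like this, assuming the search takes an accuracy value between 0 and 1 and that lowering it makes matching more lenient. `try_find` is a hypothetical callback standing in for whatever performs the actual search; the starting value, floor, and step are arbitrary choices.

```python
def find_with_dynamic_accuracy(try_find, start_pct=90, floor_pct=50, step_pct=10):
    """Retry a search with progressively lower accuracy until it succeeds.

    try_find(accuracy) should return a result, or None when nothing matched.
    Returns (result, accuracy_used), or (None, None) if every attempt failed.
    """
    # Work in whole percentages to avoid floating-point drift in the loop.
    for pct in range(start_pct, floor_pct - 1, -step_pct):
        accuracy = pct / 100
        result = try_find(accuracy)
        if result is not None:
            return result, accuracy
    return None, None


# Usage with a stand-in search that only matches at 0.7 or below.
def fake_search(accuracy):
    return "match" if accuracy <= 0.7 else None

print(find_with_dynamic_accuracy(fake_search))
```

The floor matters: if the accuracy drops too far, the search may start matching the wrong region instead of failing cleanly.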