Reading text from an image

Hi everyone

I have an image and I want to extract a particular piece of text from it. It always starts with “ICN - _ _ _ _ _ _”. The image changes every time. The image is quite big, so I am sharing only a small part of it.
image

You can use OCR to read the whole text, then use string functions such as Substring and Split to extract the desired string.
Let me know if you would like an example.
Cheers!
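
To illustrate the idea above, here is a minimal Python sketch of pulling the ICN out of text that an OCR engine has already returned. The pattern is an assumption (the literal "ICN", a hyphen with optional spaces, then a run of non-space characters), and the sample text is made up; adjust both to the real format.

```python
import re

# Assumed pattern: "ICN", a hyphen (optionally padded with spaces),
# then a run of non-space characters. Adjust to the real ICN format.
ICN_PATTERN = re.compile(r"ICN\s*-\s*(\S+)")

def extract_icn(ocr_text):
    """Return the token following 'ICN -' in the OCR text, or None if absent."""
    match = ICN_PATTERN.search(ocr_text)
    return match.group(1) if match else None

# Made-up OCR output, for illustration only.
sample = "Sheet 1 of 3\nICN - AB12CD\nScale 1:1"
print(extract_icn(sample))
```

The same regular expression should also work in a UiPath Matches activity or with .NET `Regex`, since it only uses basic syntax.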

Hi @Krishna_Sanghi,
Since this is a big image file and you are interested only in the bottom part, it is best to do some image manipulation first (clipping to a certain region) before sending it to UiPath.

Once the image is processed, the workflow in UiPath is simple. Since the attached image is not high resolution, I get the text as: “'TFE_NTxvæe-77Z12-a-K0630-2n1+A-m2-01”
ReadingImageText.xaml (5.6 KB)

Hi @jeevith

I tried your workflow with a different, complete image saved directly from a PDF, but the result is not accurate. It deviates from the actual number.

Hi @ziga.hanzic

I tried OCR, but the result is not accurate.

Hi @Krishna_Sanghi,

Sadly, that is the limitation of free OCR engines.
You can build your own with pytesseract, which is better than the free OCR engines: set up a local server running pytesseract that takes input images and extracts the text. I have experimented with it and would recommend it.
Or
Get API access to the Microsoft Vision API, a strong deep learning model that outperforms traditional OCR engines. UiPath Document Understanding may even be a better option for this than OCR.

Whichever you choose, you will need to perform these steps by trial and error until you get reliable results:

  1. Clip to the part of the image that actually contains the text
  2. Convert the resulting image to grayscale
  3. Increase the contrast a bit
  4. Run the OCR engine or deep learning API in UiPath
  5. If the result is still not optimal, repeat from Step 3

These are the same best practices used for OCR, image recognition and image segmentation problems.
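
The trial-and-error loop in Steps 3 to 5 could look roughly like the sketch below. It is only an outline: the contrast adjustment and the OCR call are passed in as functions, and the two `fake_*` functions are hypothetical stand-ins so the example runs without any image library or OCR engine.

```python
import re

def first_reliable_result(image, contrast_levels, adjust_contrast, run_ocr, looks_valid):
    """Try each contrast level in turn and return (level, text) for the first
    OCR result that passes the validity check, or (None, None) if none does."""
    for level in contrast_levels:
        processed = adjust_contrast(image, level)  # Step 3: change the contrast
        text = run_ocr(processed)                  # Step 4: run the OCR engine
        if looks_valid(text):                      # Step 5: stop once it looks right
            return level, text
    return None, None

# Hypothetical stand-ins. In a real workflow adjust_contrast would call an
# image library and run_ocr would call an OCR engine.
def fake_adjust_contrast(image, level):
    return (image, level)

def fake_run_ocr(processed):
    _, level = processed
    # Pretend the text only becomes readable at higher contrast.
    return "ICN - AB12CD" if level >= 1.5 else "lCN _ A8l2C0"

def looks_like_icn(text):
    # Assumed validity check: the text contains "ICN -" followed by a token.
    return re.search(r"ICN\s*-\s*\S+", text) is not None

level, text = first_reliable_result(
    "clipped-grayscale-image", [1.0, 1.25, 1.5, 2.0],
    fake_adjust_contrast, fake_run_ocr, looks_like_icn,
)
print(level, text)
```

Checking the OCR output against the expected "ICN -" prefix gives the loop a concrete stopping condition instead of eyeballing each result.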

Hi @jeevith

I tried Tesseract as well, but it also gives the wrong output. Do you think I need to train it, since the number always appears in the bottom-right corner? I searched online and the results suggest training Tesseract.

Hi @Krishna_Sanghi,

I do not think you need to retrain the Tesseract model. It works really well even for quite unclear text (I used pytesseract). But I cannot stress enough the importance of pre-processing the image before sending it to UiPath or Tesseract (Steps 1 to 3).

You could also include the image pre-processing in your automation pipeline by using Python activities in UiPath: an OpenCV Python script does the pre-processing, and then you either use pytesseract or send the processed image to UiPath OCR to test the outputs.

Check these pre-processing options using OpenCV. I usually use the first two options whenever I have an image-related automation or recognition project.
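
A rough sketch of such a script is below. It assumes the `opencv-python` and `pytesseract` packages and the Tesseract binary are installed, and that the ICN sits somewhere in the bottom-right corner. The crop fractions, the contrast factor `alpha` and the page segmentation mode are guesses to tune, not values taken from this thread.

```python
def bottom_right_box(width, height, frac_w=0.4, frac_h=0.2):
    """Return (x0, y0, x1, y1) covering the bottom-right corner of an image.

    frac_w and frac_h are the fractions of the width and height to keep
    (assumed defaults; tune them to where the ICN actually appears).
    """
    x0 = width - int(width * frac_w)
    y0 = height - int(height * frac_h)
    return x0, y0, width, height

def read_corner_text(image_path, alpha=1.5):
    """Clip, grayscale, boost contrast, then OCR the bottom-right corner."""
    # Imported here so bottom_right_box stays usable without these packages.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    height, width = image.shape[:2]
    x0, y0, x1, y1 = bottom_right_box(width, height)
    region = image[y0:y1, x0:x1]                      # Step 1: clip
    gray = cv2.cvtColor(region, cv2.COLOR_BGR2GRAY)   # Step 2: grayscale
    # Step 3: scale pixel values by alpha to increase contrast.
    boosted = cv2.convertScaleAbs(gray, alpha=alpha, beta=0)
    # "--psm 6" asks Tesseract to treat the region as one block of text.
    return pytesseract.image_to_string(boosted, config="--psm 6")

print(bottom_right_box(1000, 500, 0.5, 0.25))
```

`read_corner_text` returns the raw OCR string, so the ICN still has to be picked out of it afterwards, and the processed image could equally be saved to disk and passed to a UiPath OCR activity instead.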

@Krishna_Sanghi - Please try the Google Cloud Vision OCR… I have used it to extract data from a claim processing form; in my case it was the only one that worked well compared to all the other OCR engines.

For this, go to your Google account, enable that particular API and get the key to use in UiPath…

Hi @prasath17
I tried it, but it asks me to enable billing. Does that mean it is chargeable?

@Krishna_Sanghi - I think it's free for 60 days… Are you going to use it in production? If yes, and it works well for your case, then you may have to talk to your management about the pricing…

Hi @prasath17

Yeah, I can talk to the management, if they allow it, but I was looking for something that costs no money.

Hi @jeevith

That's quite a complicated thing you explained. I have to search more on this, and there's no pytesseract activity, even in packages.

@Krishna_Sanghi - Try converting the image to text, or the image to PDF --> PDF to text…

If none of the options work, you have to try Document Understanding. In a DU process you can use either intelligent OCR or regex on the text extracted from your image.

@Krishna_Sanghi - Here is another option: extracting the text using CV Extract Text inside the CV Screen Scope…

[Screenshots: XAML workflow and its output]

Note: I just tried it with the image you provided and it worked… I would suggest exploring this option too…


Hi @prasath17,
Cool, it definitely seems to perform much better than the stock Tesseract or Microsoft OCR engines. Just to double-check with you, does this (UiPath Screen OCR) require a dedicated API key?

Hi @jeevith - Yes, you have to get the Computer Vision API key from your cloud.uipath… UiPath Screen OCR comes along with that and doesn't require any dedicated API key… When you first use CV Screen Scope you will get the pop-up below…

image


Hi @prasath17

I was looking for this activity, but it requires the latest version, and currently I am using 19.something. So I am waiting for the management team to upgrade it. Only then will I be able to use it.