Need help with UiPath OCR image recognition



Hi there,

I need to extract 4 fields from an image and copy them into an Excel table. The fields are:

  • Name – (Nombre)
  • Surnames – (Apellidos)
  • Leaving date – (Fecha de la BAJA)
  • Discharge date – (Fecha de la ALTA)

The main problems that I find are:

  • All of them are pictures (some scanned into a .PDF, some as .jpg files)
  • All of them are structured into sections (Rectangles)
  • The quality of the pictures varies a lot (some have shadows, others were taken at an angle, with misaligned margins, etc.)
  • The structure of the document differs depending on the region of my country it comes from, so the 4 fields are located in different positions. (But that is a problem to deal with later; for now I’m sticking to a single document type.)

I’m aware that the first step is to get the documents at the highest possible quality (I’m working on that). But from there, I’ve tried everything I know about image and PDF OCR recognition within UiPath, and the quality of the output .txt files is very poor. Identifying the fields to extract, and an anchor, is complicated: sometimes it recognizes “name”, sometimes “mame”, ”n@me”, “nane”… you get the idea. Maybe it will require further text processing through programming; I honestly don’t know.
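For the garbled-label problem specifically, fuzzy string matching on the raw OCR output can still locate an anchor even when the label comes back as “mame” or “n@me”. Below is a minimal sketch in Python using the standard library’s `difflib` (it could run via an Invoke Code activity or an external script); the 0.75 similarity threshold is an assumption you would tune against your own scans:

```python
from difflib import SequenceMatcher

# Labels we expect on the form (the Spanish field names from the post).
EXPECTED_LABELS = ["Nombre", "Apellidos", "Fecha de la BAJA", "Fecha de la ALTA"]

def best_label(token: str, threshold: float = 0.75):
    """Return the expected label most similar to a garbled OCR token,
    or None if nothing is close enough."""
    best, best_score = None, 0.0
    for label in EXPECTED_LABELS:
        score = SequenceMatcher(None, token.lower(), label.lower()).ratio()
        if score > best_score:
            best, best_score = label, score
    return best if best_score >= threshold else None
```

So a misread like “Nonbre” or “Apell1dos” still maps back to the right field, while unrelated tokens are rejected.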

Considering all the above, does anyone know of another powerful OCR engine that can be integrated with UiPath? (OneNote seems to work pretty well, but maybe there is something more advanced, so we don’t have to improve the image manually in PowerPoint or Photoshop before the OCR.) Also, how can I identify the different sections? (Maybe by zooming into them?)
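On identifying the sections: since you are sticking to a single document type for now, one common approach is to store each rectangle as fractions of the page size, convert those fractions to pixel boxes at whatever resolution the scan came in at, then crop and OCR each box separately (effectively “zooming into” a section). A sketch of the coordinate bookkeeping; the field names are from the post, but the fractional coordinates are illustrative placeholders you would measure from your own template:

```python
# Each section of the template as fractions of the page size
# (left, top, right, bottom), so the boxes scale with resolution.
# These coordinate values are made-up placeholders, not real ones.
TEMPLATE_SECTIONS = {
    "Nombre":           (0.05, 0.10, 0.50, 0.16),
    "Apellidos":        (0.05, 0.18, 0.50, 0.24),
    "Fecha de la BAJA": (0.55, 0.10, 0.95, 0.16),
    "Fecha de la ALTA": (0.55, 0.18, 0.95, 0.24),
}

def pixel_boxes(width: int, height: int) -> dict:
    """Convert the fractional boxes into integer pixel rectangles for a
    scan of the given size; each rectangle can then be cropped and OCR'd."""
    return {
        name: (round(l * width), round(t * height),
               round(r * width), round(b * height))
        for name, (l, t, r, b) in TEMPLATE_SECTIONS.items()
    }
```

OCR on a small crop containing one field tends to be far more reliable than OCR on the whole page, and it also makes the anchor problem easier because you already know which field a crop belongs to.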

I’ve attached an example of an actual discharge document to illustrate the problems described above.

Thank you very much in advance.

Kind regards,



I’m not sure which OCR tools can be used alongside UiPath, but I can share some ideas to help you get better results from the tools already provided.

So basically, the Scale parameter sets up a sequence of boxes, and the engine looks inside each box. If every character aligns perfectly with its box, your accuracy will be high. So the trick is finding the sweet spot between zooming in/out of the document and a Scale that doesn’t cut off parts of characters. For example, an 8 could be read as a 3 if the Scale is wrong and part of the character gets clipped.

I’m not an expert, though, and I’m not sure how static that Scale alignment is. If it’s static, you need to adjust the placement of the text so the characters are aligned consistently each time; and if the text comes in different sizes, you need a dynamic Scale that changes for certain areas.

So you can find the sweet spot, but the challenge is carrying it from document to document. I think it’s doable, but it requires additional logic and checks.
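One way to add that logic: sweep a few candidate Scale values, OCR the page at each, and keep the value whose output contains the most recognizable label words. A sketch of the scoring loop; `ocr_at_scale` stands in for whatever OCR activity you actually call (it is passed in as a parameter here so the logic is self-contained and testable), and the candidate values are placeholders:

```python
from difflib import SequenceMatcher

# Words we expect somewhere in a good OCR pass of this form.
EXPECTED_WORDS = ["Nombre", "Apellidos", "BAJA", "ALTA"]

def score_text(text: str) -> float:
    """Score OCR output by how closely its tokens match expected labels
    (sum of the best similarity ratio found for each expected word)."""
    tokens = text.lower().split()
    total = 0.0
    for word in EXPECTED_WORDS:
        best = max((SequenceMatcher(None, t, word.lower()).ratio()
                    for t in tokens), default=0.0)
        total += best
    return total

def best_scale(ocr_at_scale, candidates=(1, 2, 3)):
    """Run OCR at each candidate Scale value and return the value whose
    output best matches the labels we expect on the form."""
    return max(candidates, key=lambda s: score_text(ocr_at_scale(s)))
```

It is brute force, but since you only do it once per document (or once per document type), the extra OCR passes are usually an acceptable cost.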

Hopefully we’ll see some improvements in this technology. I’ve heard good things about ABBYY’s OCR engine, which you might look into if you haven’t already.