Document Understanding - How to read colors from a (scanned) table

Hey,

this might be an unusual problem but maybe you’re able to help me / point me in the right direction.

We’re working a lot with document understanding and have (re)trained a fair number of models to fit our needs.
But now we’ve got a new kind of document, and so far I have no idea how to process it without writing my own Python model.

Imagine people use printed plans/tables where the “data” is simply the color of each field. They color a field either red or green, or leave it blank.

Here’s an example of what this might look like:

image

Just imagine it being a scanned document, not a screenshot.

How/where would I start processing this document?

I basically need the output to be a table (datatable/excel/whatever) which I can use for further processing.

Questions:

  1. Is this even remotely possible with UiPath?
  2. If I were to write my own model/logic, where would I start, do you know of any libraries doing anything similar?

So far I find it pretty hard to even google this problem…

Thanks in advance,
T0Bi

Can you rely on the pixel positions of the coloured boxes? If so, this may be quite easy. If the positions can change, it is not so easy any more. Think about where the data comes from and ask at the source. Either way, this will not make for a stable and elegant process.
Now to my approach: I’d search the image for the coloured boxes and replace them with words in the image. Then run OCR to get the content as usual and remove the spaces.
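
For the easy case, where the table always sits at the same place, reading the colours could look roughly like the sketch below. It is only an illustration under assumptions: the image is already cropped to the table and available as a nested list of (r, g, b) tuples (for example loaded with an imaging library such as Pillow or OpenCV, not shown here), the function names `classify_pixel`, `classify_cell` and `read_grid` are made up, and the hue/saturation thresholds are guesses that would need tuning on real scans.

```python
import colorsys


def classify_pixel(rgb, min_saturation=0.35, min_value=0.3):
    """Return 'red', 'green' or None for one (r, g, b) pixel in 0..255."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    if s < min_saturation or v < min_value:
        return None  # white paper, grey grid lines, black text
    hue = h * 360.0
    if hue < 20 or hue > 340:
        return "red"
    if 80 <= hue <= 160:
        return "green"
    return None


def classify_cell(pixels, min_fraction=0.3):
    """Label a cell by the share of red/green pixels; thresholds are assumptions."""
    counts = {"red": 0, "green": 0}
    total = 0
    for rgb in pixels:
        total += 1
        label = classify_pixel(rgb)
        if label is not None:
            counts[label] += 1
    if total == 0:
        return "blank"
    best = max(counts, key=counts.get)
    return best if counts[best] / total >= min_fraction else "blank"


def read_grid(image, rows, cols, margin=0.15):
    """Split the image into rows x cols equal cells and label each one.

    `margin` skips the outer part of every cell so grid lines and slightly
    shifted scans disturb the result less.
    """
    height, width = len(image), len(image[0])
    cell_h, cell_w = height / rows, width / cols
    table = []
    for i in range(rows):
        row = []
        for j in range(cols):
            top, bottom = int((i + margin) * cell_h), int((i + 1 - margin) * cell_h)
            left, right = int((j + margin) * cell_w), int((j + 1 - margin) * cell_w)
            pixels = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
            row.append(classify_cell(pixels))
        table.append(row)
    return table
```

The result is a plain list of rows containing "red", "green" or "blank", which can then be written to a datatable or Excel sheet.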

As the documents are scanned, nothing’s reliable unfortunately.

That was one of my ideas as well. Do you know of any library/framework which can search/replace colored boxes? :smiley:

Because it’s not as simple as it sounds.

No, you will have to write the code yourself, and how reliably it works depends on the quality of the input data.
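
As a possible starting point for such code, here is a rough sketch of the "search" part that does not rely on fixed positions. It marks pixels as coloured when their channels differ strongly and then groups neighbouring coloured pixels into bounding boxes with a simple flood fill. Everything here is an assumption for illustration: the names `is_coloured` and `find_coloured_boxes`, the thresholds, and the image being a nested list of (r, g, b) tuples. Real scans would probably need extra steps such as deskewing and noise removal.

```python
from collections import deque


def is_coloured(rgb, min_spread=60):
    """Crude test: a pixel is 'coloured' if its channels differ strongly.

    White, grey and black pixels have nearly equal channels and are skipped.
    """
    return max(rgb) - min(rgb) >= min_spread


def find_coloured_boxes(image, min_area=25):
    """Return (top, left, bottom, right) boxes of connected coloured regions.

    Coordinates are inclusive. Regions smaller than `min_area` pixels are
    treated as scan noise and dropped.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or not is_coloured(image[y][x]):
                continue
            # Breadth-first flood fill over 4-connected coloured neighbours.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right, area = y, x, y, x, 0
            while queue:
                cy, cx = queue.popleft()
                area += 1
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and is_coloured(image[ny][nx])):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes
```

Note that directly touching red and green fields would be merged into one box by this sketch; it relies on grid lines or white gaps separating them. The colour of each found box would still have to be decided afterwards, for example by looking at the average hue inside the box.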