My scenario is:
1- Automation Cloud with Enterprise plan and Document Understanding license
2- Training phase of a Document Understanding Modern project, with “UiPath Document OCR” as the OCR method to extract data from PDFs and OCR applied to all PDFs
3- A custom document type (a structured Italian privacy policy) with one language in settings
4- I created all the fields needed to extract the data and the signatures (for the signatures I only need to know whether they are present)
5- The blue boxes hide sensitive information and are irrelevant to the problem
The extraction engine extracts all the necessary fields which I then validate, including the signatures (green circle).
In some cases a signature (the document contains multiple signatures) is not pre-annotated by the extraction engine (red circle), and it is not possible to draw a box around the signature to link it to the correct field.
I am therefore forced to mark the field as missing, thus obtaining a false negative regarding the presence of the signature!
I am at the beginning of my experience with Document Understanding and with training a project/model, but I believe that if the project behaves this way during training, I will have the same problem when I use the project in an automation with Studio: a false negative for the presence of the signature!
Or am I missing something?
When I draw a box with the mouse pointer around the red-circled signature from the first post, the dotted red box and the related pop-up for linking it to a field do not appear.
This signature IS NOT pre-annotated by the engine.
When I draw a box with the mouse pointer around the green-circled signature from the first post, the box appears and I can link it to the field.
This signature IS pre-annotated by the engine.
There is only one type of document: a structured one with ten pages, with two to three signatures on page one, two to four signatures on page two and one to two signatures on page nine.
When I set up the data set (fields) with the first uploaded document, I created all nine fields for all the possible signatures, using a sample document in which all nine signatures are present (in the real documents not all signatures are always present).
Some other information, in case it is useful.
The documents are not scanned; they are “digital” PDFs containing only objects.
Even the signatures, when present, are objects placed in the correct position.
@Luca_Spaccatrosi try drawing a larger box around the signature. Some signatures might have a large background area, and the OCR might take that extra space into account as well.
Hi @Darshan_Sable.
I’m waiting for permission from my manager to share the documents with you.
Just in case, how can I privately share a number of documents with you (the documents contain sensitive data that I can’t share publicly)?
Interesting and yes, I think so too.
I’ll try converting this document into ten JPG/PNG images, building a new PDF from them and uploading it to the document type so the model can process it.
I will let you know.
@Luca_Spaccatrosi I was able to select all the signatures except the ones that overlap the text or are too close to it, which is expected. The solution is to ask the user to sign properly in free space instead of overlapping the text or signing too close to it.
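To illustrate what "overlapping or too close to the text" means, here is a minimal Python sketch. It is not a UiPath API and says nothing about how the engine works internally. It assumes you already have bounding boxes for the signature objects and for the nearby text (for example from whatever composes the PDF). The `Box` class, the `gap` and `crowded_signatures` helpers, the field names, the coordinates and the `min_gap` threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in page coordinates (hypothetical units, e.g. points)."""
    left: float
    top: float
    right: float
    bottom: float


def gap(a: Box, b: Box) -> float:
    """Shortest distance between two boxes; 0.0 when they touch or overlap."""
    dx = max(a.left - b.right, b.left - a.right, 0.0)
    dy = max(a.top - b.bottom, b.top - a.bottom, 0.0)
    return (dx * dx + dy * dy) ** 0.5


def crowded_signatures(signatures, text_boxes, min_gap=5.0):
    """Return the names of signatures closer than min_gap to any text box."""
    return [
        name
        for name, box in signatures.items()
        if any(gap(box, text) < min_gap for text in text_boxes)
    ]


# Made-up example: one line of text and three signature objects.
text_boxes = [Box(50, 100, 300, 115)]
signatures = {
    "signature_1": Box(320, 90, 420, 130),   # 20 units to the right of the text
    "signature_2": Box(280, 105, 380, 140),  # overlaps the text
    "signature_3": Box(302, 100, 400, 130),  # only 2 units away from the text
}

print(crowded_signatures(signatures, text_boxes))
# ['signature_2', 'signature_3']
```

A check like this could only flag the documents where a signature is likely to be hard to select. The right `min_gap` value is a guess and would have to be found by trial on real documents.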
Yes, that PDF is one of the cases where the signatures are not recognized because of overlapping.
And it is not possible to ask the user to sign properly: the signatures are captured digitally on a tablet, and the PDF is then composed with the signature object.
What about the first signature (the first on page 1) of PDF 31131970?