Identify whether a document (.pdf) contains handwritten text or is digital

Hi all,

I am working on a Document Understanding use case where some vendors send digital invoices and some send handwritten invoices.

I am using the Invoices model (the out-of-the-box ML model) to extract the mandatory fields, but sometimes UiPath OCR reads a value incorrectly and still reports high confidence.

Is there any way to detect whether an invoice contains handwritten text? If it does, we could park that invoice directly in Action Center, and the other, digital invoices would go through the ML model.

Classification through a keyword-based classifier is not possible, because a current vendor that sends digital invoices can also send invoices with handwriting (not in every field, but in one or two). Also, if a new vendor is added and sends an invoice, the keyword classifier will return a null value.

Hi @Abhinavpandey

You can try using document classifiers, such as the ML Model Classifier or the Intelligent Keyword Classifier, to classify the different types of documents.

Please have a look.

Hope this helps,

Thanks.

With a keyword-based classifier it is not possible. If today we add a few keywords for the vendors that send handwritten invoices, and tomorrow a new vendor is added, the classifier will return a null value.

I have not tried the ML classifier. Thanks, I will take a look and update you.

Hi @Abhinavpandey: Have you completed your prototype? Which type of classifier did you use in your design?

hi there,

In the digitization step, set “ApplyOcrOnPdf” to Yes.

When it is set to Yes, it digitizes both native and scanned PDFs with OCR, which can give you higher extraction quality.

You can check this by sending the same documents once with “ApplyOcrOnPdf” set to Auto and once with it set to Yes, then comparing the results.
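If it helps, here is a rough sketch of how that Auto vs. Yes comparison could be turned into a routing decision. It is plain Python, not a UiPath API, and it assumes you have already exported the digitized text from both runs as strings. The function names (`normalize`, `ocr_adds_text`, `route`), the route labels, and the 0.85 threshold are all hypothetical and would need tuning on real invoices. The idea is that handwriting added to a native PDF is not in the text layer but may show up in the OCR output, so a low similarity between the two texts hints at handwritten or scanned content.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences are ignored."""
    return " ".join(text.lower().split())


def ocr_adds_text(native_text: str, ocr_text: str, min_ratio: float = 0.85) -> bool:
    """Return True when the OCR text differs enough from the native text layer
    to suggest content (handwriting, a scan) that the text layer does not contain.

    min_ratio is an assumed threshold, not a recommended value.
    """
    native, ocr = normalize(native_text), normalize(ocr_text)
    if not native:
        # No text layer at all: any OCR text means a scanned or handwritten page.
        return bool(ocr)
    ratio = SequenceMatcher(None, native, ocr).ratio()
    return ratio < min_ratio


def route(native_text: str, ocr_text: str) -> str:
    """Pick a hypothetical route label for one invoice."""
    if ocr_adds_text(native_text, ocr_text):
        return "action_center"
    return "ml_extraction"


if __name__ == "__main__":
    typed = "Invoice No 123 Total 500"
    with_note = typed + " received by John on 5 March, amount paid in cash, see attached note"
    print(route(typed, typed))
    print(route(typed, with_note))
```

Keep in mind that ordinary OCR noise on a clean digital page also lowers the similarity, so this is only a heuristic for deciding what to send for manual review, not a handwriting detector.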
