Need to check if document is handwritten or typed

I have used Document Understanding with UiPath OCR to extract information from a form. The same form can be submitted either typed or handwritten. Is there a way for me to find out whether a given form is typed or handwritten in the first place?


Is there any word or text that distinguishes the typed document from the handwritten one?

If there is, we can read the text into a string variable
and then validate it with an IF condition like this:

var_string.ToString.Contains("your unique word")

Cheers @Vishal_Kalra

Thanks @Palaniyappan for your reply. Unfortunately, users can fill in the same form either typed or handwritten, so I do not have any word to help me segregate them. I want to present the Validation Station only if it is a handwritten form. Is there a way for me to do that?

Well, then you can send it to the Validation Station before processing.

Cheers @Vishal_Kalra

Hey @Vishal_Kalra

Is it possible for you to share the form image? After looking at it, I can try to suggest a solution.

Hi @Palaniyappan, I want to present the Validation Station only for handwritten forms (due to the large number of forms). I don't need it when the forms are digitally typed, so I need a way to segregate them.

There? @Vishal_Kalra

Is it possible to share the form image?

PFA the sample data. This same data can be typed as well!

Segmenting text this way is still a long way off. It cannot be done without a pre-check qualifier, at least not by preprocessing the data alone. If there is a way to segment based on a qualifier, classification is possible.

Now, to answer your question: there is a way out, which is to create a model and train it to do the classification, but this cannot achieve a 100% result by any means.
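Since the model will not be right every time, one option is to treat uncertain predictions the same as handwritten forms so they still reach a human. Here is a minimal sketch in Python, assuming the classifier returns a label and a confidence between 0 and 1. The function name, the label strings, and the 0.8 threshold are hypothetical and not part of any UiPath API:

```python
def needs_validation(label: str, confidence: float, min_confidence: float = 0.8) -> bool:
    """Return True when a form should be shown in the Validation Station.

    Assumptions (hypothetical): `label` is "handwritten" or "typed",
    and `confidence` is the classifier's score in the range [0, 1].
    """
    if label == "handwritten":
        return True
    # A typed form skips validation only when the classifier is confident enough.
    return confidence < min_confidence
```

In a workflow, the same rule would be an IF condition placed in front of the Validation Station; the threshold would need tuning on real forms.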

Did you try using Document Classification? Instead of field-value pairs, use keyword identification, and identify the same word in both formats. An ML classifier can be of some help.

You can create a data set of handwritten and printed forms, then train the model on both types. Use at least 6 samples each for printed and handwritten.

To confirm, that won't be 100% either way; you will just get better accuracy if you train it well.
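To make the "train on both types" idea concrete, here is a rough sketch of a nearest-centroid classifier in plain Python. It is not the UiPath ML classifier, only an illustration of the principle. The two features per form (mean OCR confidence and variation in character height) are assumptions about what might separate the classes, and the numbers are made up for illustration, not measured from real forms:

```python
from math import dist
from statistics import mean


def train_centroids(samples):
    """Average the feature vectors of each label into one centroid per label."""
    return {
        label: [mean(column) for column in zip(*vectors)]
        for label, vectors in samples.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest to `features`."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical features: (mean OCR confidence, variation in character height).
# Made-up illustrative values, six samples per class as suggested above.
training = {
    "typed": [(0.97, 0.02), (0.95, 0.03), (0.98, 0.01),
              (0.96, 0.02), (0.94, 0.04), (0.97, 0.03)],
    "handwritten": [(0.71, 0.20), (0.65, 0.25), (0.78, 0.18),
                    (0.60, 0.30), (0.74, 0.22), (0.69, 0.27)],
}

centroids = train_centroids(training)
print(classify(centroids, (0.93, 0.05)))  # prints "typed"
```

With only a handful of samples the class boundary is crude, which is one reason the result will never be 100%; more and more varied samples per class should help.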

Can you send a filled-in, scanned form? @Vishal_Kalra

@Parth_Doshi : It is the same form digitally filled.

Thanks @rahulsharma . Let me check on this.

Hi @rahulsharma, I tried the Keyword Based Classifier, but the classification result is always null. Do you have a sample project already?