The Document Understanding feature released in the new update has a problem extracting data from handwritten forms: the data is not extracted properly. In the previous version a change was specified and we were asked to use the Form Extractor, which gave us fairly good accuracy. I was expecting greater accuracy from the ML Extractor, but it seems to be less accurate than the earlier Intelligent Form Extractor approach.
I have attached an image for reference. #uipathcommunity #documentunderstanding #uipath #uipathacademy
These documents seem to have overlapping text, which makes them harder to recognize. It would help if you could share 10 of these samples so we can try to reproduce and troubleshoot the issue. Feel free to reach out to me over direct message.
Thanks @Kunal_Jain, these are pretty challenging, for two reasons:
First, the handwritten text overlaps a bit with the surrounding printed text, which causes the OCR to get confused and miss some words completely.
Second, even when the text is detected correctly, the overlap with the surrounding text also confuses Forms AI. Forms AI is aimed at relatively straightforward, clean forms without a lot of variation, and this text overlap introduces more entropy into the document text lines than Forms AI can handle.
In these kinds of situations, ML is the way to go. I recommend labelling 50-100 of these pages in a Document Manager session and then training a model using the DocumentUnderstanding ML package in AI Center.
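If it helps to put a number on the comparison between two extractors, before and after retraining, here is a minimal Python sketch of a field-level accuracy check. It is not a UiPath API: it assumes you have exported each extractor's results and your hand-checked values as plain field-name-to-text dictionaries, one per page. The function names, the extractor labels, and the sample data are all made up for illustration.

```python
from typing import Dict, List


def normalize(value: str) -> str:
    """Collapse whitespace and ignore case so trivial differences are not counted as errors."""
    return " ".join(value.split()).casefold()


def field_accuracy(predicted: List[Dict[str, str]], labelled: List[Dict[str, str]]) -> float:
    """Return the fraction of labelled fields whose extracted value matches the label."""
    if len(predicted) != len(labelled):
        raise ValueError("need one prediction per labelled page")
    total = 0
    correct = 0
    for pred_page, true_page in zip(predicted, labelled):
        for field, true_value in true_page.items():
            total += 1
            # A field the extractor missed entirely counts as wrong.
            if normalize(pred_page.get(field, "")) == normalize(true_value):
                correct += 1
    return correct / total if total else 0.0


# Hypothetical data: hand-checked values for two pages, plus the output
# of two unnamed extractors (A and B) on the same pages.
labelled = [
    {"name": "Asha Rao", "date": "12/03/2022"},
    {"name": "John Smith", "date": "01/04/2022"},
]
extractor_a_output = [
    {"name": "Asha Rao", "date": "12/03/2022"},
    {"name": "John Smith"},  # date field missed
]
extractor_b_output = [
    {"name": "asha rao", "date": "12/03/2022"},
    {"name": "John Smith", "date": "01/04/2022"},
]

print("Extractor A:", field_accuracy(extractor_a_output, labelled))
print("Extractor B:", field_accuracy(extractor_b_output, labelled))
```

Running the same check on a held-out set of pages that were not used for labelling gives a fairer picture than checking the pages the model was trained on.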