Document Understanding Data extraction

I am extracting a particular table using the Form Extractor.
One column value in the table is “Propylene”, but the extracted value comes out as “Ropylene”.
How can I handle this situation?


After extracting the data, you can review the extracted values in the Document Understanding Present Validation Station. If you find inaccuracies, correct the extracted value manually. Where a trainable extractor is in use, these corrections can serve as feedback to improve its extraction accuracy in the future.

If OCR (Optical Character Recognition) is involved in the extraction process, make sure the text is being recognized accurately. Poor OCR quality can lead to incorrect extractions. If possible, improve the quality of the source document or experiment with different OCR settings. Try OmniPage OCR, which may recognize the word correctly.

Alternatively, instead of the Form Extractor, use the ML Extractor or the Regex Extractor.
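If the column only ever holds values from a known set, a dictionary check after extraction can catch a dropped character such as “Ropylene” before the value reaches downstream steps. The following is a minimal sketch in plain Python, not a UiPath activity or API: the `KNOWN_VALUES` list, the `correct_value` helper, and the 0.85 cutoff are all assumptions made for illustration, and the same idea would need to be ported to whatever scripting step your workflow uses.

```python
from difflib import get_close_matches

# Hypothetical vocabulary: the values this table column is expected to contain.
KNOWN_VALUES = ["Propylene", "Ethylene", "Butadiene", "Benzene"]

# Compare case-insensitively, but return the canonical spelling.
_LOOKUP = {value.lower(): value for value in KNOWN_VALUES}


def correct_value(extracted, cutoff=0.85):
    """Return (value, status) where status is 'ok', 'corrected' or 'unknown'."""
    key = extracted.strip().lower()
    if key in _LOOKUP:
        return _LOOKUP[key], "ok"
    # Find the single closest known value above the similarity cutoff.
    matches = get_close_matches(key, list(_LOOKUP), n=1, cutoff=cutoff)
    if matches:
        return _LOOKUP[matches[0]], "corrected"
    # Nothing close enough: leave the value alone and flag it for review.
    return extracted, "unknown"


if __name__ == "__main__":
    for raw in ["Propylene", "Ropylene", "Xyz"]:
        print(raw, "->", correct_value(raw))
```

Values that come back as `corrected` or `unknown` are good candidates to route to the Validation Station for a human check rather than accepting them silently.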

Hey @Ritaman_Baral,

  1. Try OmniPage OCR or another, better OCR engine so that the word is recognized properly.
  2. Check whether the whole word is inside the bounding box that you drew when creating the template.
  3. Validate your extracted data using the Present Validation Station activity and correct the values there as needed.

