Document Understanding Data extraction

Ritaman_Baral · August 17, 2023, 8:30am

I am extracting a particular table using the Form Extractor.
One column value in the table is “Propylene”, but the extracted value comes out as “Ropylene”.
How can I handle this situation?

rlgandu · August 17, 2023, 8:35am

@Ritaman_Baral

After extracting the data, you can review the extracted values in the Document Understanding Present Validation Station. If you find inaccuracies, correct the extracted value manually. The model will use this feedback to learn and improve its extraction accuracy in the future.

If OCR (Optical Character Recognition) is involved in the extraction process, make sure the text is being recognized accurately. Poor OCR quality can lead to incorrect extractions. If possible, improve the quality of the source document or experiment with different OCR settings. Try OmniPage OCR, which may recognize the word correctly.

Alternatively, use the ML Extractor or the Regex Extractor instead of the Form Extractor.
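
If the column can only contain a known set of values, another option is to clean up the extracted text after the fact by snapping it to the closest known value. Here is a minimal sketch of that idea in Python using the standard library's `difflib`. The list `KNOWN_VALUES`, the function name `correct_ocr_value`, and the 0.8 cutoff are all assumptions made up for illustration; none of this is a UiPath API, and inside a workflow you would need an equivalent step after extraction.

```python
from difflib import get_close_matches

# Hypothetical list of values that are valid for the column (assumption).
KNOWN_VALUES = ["Propylene", "Ethylene", "Butadiene", "Benzene"]


def correct_ocr_value(raw, vocabulary=KNOWN_VALUES, cutoff=0.8):
    """Return the closest known value, or the raw text if nothing is close enough."""
    # Compare case-insensitively, but return the canonical spelling.
    canonical = {value.lower(): value for value in vocabulary}
    matches = get_close_matches(raw.strip().lower(), list(canonical), n=1, cutoff=cutoff)
    return canonical[matches[0]] if matches else raw


print(correct_ocr_value("Ropylene"))
```

A stricter cutoff reduces the risk of snapping an unrelated string to a known value, so anything that does not match is returned unchanged and can still be reviewed in the Validation Station.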

Vikas_M · August 17, 2023, 8:38am

Hey @Ritaman_Baral ,

1. Try OmniPage OCR, or another OCR engine that recognises the word properly.
2. Check whether the word lies fully inside the bounding box that you drew while creating the template.


3. Validate your extracted data using the Present Validation Station activity, and correct the value there if needed.
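
On the bounding box point: a missing first letter is what you would expect if the left edge of the template box cuts through the word. The check itself is plain rectangle containment, sketched below in Python. The `Box` class, the `is_fully_inside` helper, and the coordinates are hypothetical and only illustrate the geometry; in practice you would adjust the box visually in the template rather than in code.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned rectangle in page coordinates (hypothetical units)."""
    left: float
    top: float
    width: float
    height: float

    @property
    def right(self) -> float:
        return self.left + self.width

    @property
    def bottom(self) -> float:
        return self.top + self.height


def is_fully_inside(word: Box, field: Box, tolerance: float = 0.0) -> bool:
    """True if the word box lies within the field box, allowing a small margin."""
    return (
        word.left >= field.left - tolerance
        and word.top >= field.top - tolerance
        and word.right <= field.right + tolerance
        and word.bottom <= field.bottom + tolerance
    )


# Made-up numbers: the word starts 5 units to the left of the field box,
# so its first character would fall outside the template area.
field_box = Box(left=100, top=50, width=200, height=20)
word_box = Box(left=95, top=52, width=60, height=16)
print(is_fully_inside(word_box, field_box))
```

If the word sticks out like this, widening the box slightly on that side in the template is the likely fix.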

system · August 31, 2023, 7:48pm

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
