UiPath Document OCR extracting incorrect value with high confidence

Nirjara_Jain_IND · February 2, 2023, 7:56pm

Hi everyone, I have a query related to OCR.
We are using UiPath Document OCR in the Document Understanding framework to extract an Amount field. The extracted value is incorrect, yet the confidence score is 99%. This data has to be entered into a system, so an error like this can have consequences. How can we handle this kind of scenario, given that we don't have any rule to validate the value?

Actual value = 25-
Extracted value = 25
Confidence score = 99%

Extraction method - ML extractor
Data type in AI Center - text

oraganti931 · February 3, 2023, 4:13am

Have you tried using another extractor, such as the Form Based Extractor, for that particular field?

supermanPunch · February 3, 2023, 5:12am

Hi @Nirjara_Jain_IND ,

Could you let us know whether the data type is the same in the field defined in the Taxonomy?

Also, try enabling the field as Multi-Line in the Data Labelling Session/Taxonomy field and check whether the "-" gets extracted.

Nirjara_Jain_IND · February 3, 2023, 5:47am

We have semi-structured documents, and the documents have a lot of variation.

Nirjara_Jain_IND · February 3, 2023, 5:48am

In the Taxonomy, the data type is also text. Where can I find this multi-line option in AI Center and the Taxonomy? I think the multi-value option is only available for simple fields. I am trying to extract data from a table.

oraganti931 · February 3, 2023, 5:54am

Use a different extractor for that one field only, not for the entire document.

Lahiru.Fernando · February 3, 2023, 7:20am

Hi,

Can you try changing the taxonomy data type to Number for this field? This way, it automatically removes special characters from the output.

You can check the output in Action Center (click on the field and it shows the “Value” box, which has the actual captured and cleansed value).

In addition, the output Excel file also has the formatted value under “line items - formatted”.

Let me know if this works…
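
Since the original question asks how to catch a wrong value when the confidence score is high, here is a minimal sketch of a sign-aware check that could run after extraction. It is written in Python purely for illustration and is not part of any UiPath package. The names `parse_signed_amount` and `needs_review` are hypothetical, and it assumes the raw OCR text of the cell (for example "25-") is available alongside the extracted value. The idea is to treat a trailing "-" as a negative sign and send the field to human validation whenever the raw text and the extracted value disagree, regardless of the reported confidence.

```python
import re
from decimal import Decimal
from typing import Optional

# Hypothetical pattern: optional leading or trailing sign around a plain amount.
_AMOUNT = re.compile(
    r"^\s*(?P<lead>[-+]?)\s*(?P<num>\d[\d,]*(?:\.\d+)?)\s*(?P<trail>[-+]?)\s*$"
)


def parse_signed_amount(raw: str) -> Optional[Decimal]:
    """Parse text such as '25', '-25' or '25-' into a Decimal.

    Returns None when the text is not a recognizable amount or carries
    a sign on both sides.
    """
    match = _AMOUNT.match(raw)
    if match is None:
        return None
    lead, trail = match.group("lead"), match.group("trail")
    if lead and trail:
        return None  # ambiguous: signs on both sides
    value = Decimal(match.group("num").replace(",", ""))
    # A '-' on either side makes the amount negative.
    return -value if "-" in (lead, trail) else value


def needs_review(raw_text: str, extracted_value: str) -> bool:
    """Return True when the field should go to human validation.

    The confidence score is deliberately ignored: the check only compares
    the raw OCR text with what the extractor returned.
    """
    raw = parse_signed_amount(raw_text)
    extracted = parse_signed_amount(extracted_value)
    return raw is None or extracted is None or raw != extracted
```

With the values from this thread, `needs_review("25-", "25")` returns `True`, so the record would be routed to a reviewer even though the extractor reported 99% confidence.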
