OCR Confidence Level -1 (Microsoft OCR engine)

cherose · May 11, 2019, 1:01pm

I have been testing the OCR engines using IntelligentOCR’s Data Extraction Scope, which returns ExtractionResults. As part of that object, I can see each field’s confidence level by looping through extractedData.ResultsDocument.Fields; item.Values(0).OcrConfidence then returns the OCR engine’s confidence that the value was extracted correctly. This works perfectly fine with GoogleOCR, but when I started testing MicrosoftOCR, it finds the field yet keeps returning -1 as the OcrConfidence.
Using GoogleOCR

Using MicrosoftOCR

Can someone please explain why this happens and how I can work around it, in case it is an open issue? Thanks.

loginerror · May 14, 2019, 2:00pm

Hi @cherose

This is interesting. Any chance you could provide a zip of the project that reproduced the issue?

cherose · May 14, 2019, 2:31pm

Sorry, but I can’t. I can describe the flow, though.

A document (.pdf) is processed in “Digitize Document” with Microsoft OCR (Properties: Language = “da”; Profile = “Scan”; Scale = 2). This returns the DOM, which already has an ocrConfidence of -1.

For further processing, “Load Taxonomy” is followed by “Data Extraction Scope” with the Simple Document Data Extraction activity, which returns the ExtractionResult. Its confidence is still -1, of course, since it is derived from the DOM.

I hope these details are enough for you to help me understand it. It is pretty much the same flow that I used for Google OCR, which did return reasonable OCR confidence levels.

Thank you.

tudor.serban · May 15, 2019, 10:32am

Hi @cherose,

The Microsoft OCR engine that we are using under the hood of our activity does not return any confidence information for us to pass on, as opposed to the Google OCR. Consequently, we set the confidence to -1, meaning “Unknown”.
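Since -1 is a sentinel for “Unknown” rather than a very low score, one way to handle it downstream is to keep it out of any threshold comparison. The following is a minimal, engine-agnostic sketch in Python, not UiPath code: `FieldValue`, `normalize_confidence`, `needs_review`, the sample field names, and the 0.8 threshold are all hypothetical, and it assumes real confidences are reported on a 0 to 1 scale.

```python
from dataclasses import dataclass
from typing import Optional

# Sentinel described in the reply above: -1 means "Unknown", not "very low".
UNKNOWN_CONFIDENCE = -1


@dataclass
class FieldValue:
    """Hypothetical stand-in for one extracted value and its OCR confidence."""
    text: str
    ocr_confidence: float


def normalize_confidence(raw: float) -> Optional[float]:
    """Map the -1 sentinel to None so it is never compared with a threshold."""
    if raw == UNKNOWN_CONFIDENCE:
        return None
    return raw


def needs_review(value: FieldValue, threshold: float = 0.8) -> bool:
    """Flag a value for manual review when confidence is unknown or too low.

    The threshold and the 0-to-1 scale are assumptions for illustration.
    """
    confidence = normalize_confidence(value.ocr_confidence)
    if confidence is None:
        # The engine gave no confidence, so we cannot auto-accept the value.
        return True
    return confidence < threshold


if __name__ == "__main__":
    # Hypothetical extraction results keyed by field name.
    fields = {
        "InvoiceNumber": FieldValue("INV-001", -1),    # engine reports no confidence
        "TotalAmount": FieldValue("1250.00", 0.95),    # engine reports a confidence
    }
    for name, value in fields.items():
        status = "review" if needs_review(value) else "accept"
        print(f"{name}: {status}")
```

Routing unknown-confidence values to review is only one possible policy; accepting them or validating them by other means (for example, format checks) may fit some workflows better.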

cherose · May 15, 2019, 4:57pm

Thank you for your reply.
That is all I needed: a clarification of why Microsoft OCR behaves differently from the others despite the same processing.

system · May 18, 2019, 4:57pm

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
