Please help me understand the following in DU
- What is the difference between confidence and OCR confidence?
- I am using the Form Extractor. The city field is getting extracted from a different PDF, even though only the position is the same as in the template I set up in the Form Extractor. Why?
- What is the meaning of min overlap percentage? Please explain in layman's terms.
@Ritaman_Baral
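Confidence: How sure the extractor is that the value it returned is the right value for that field, based on layout, context, and training. A higher score means the extracted data is more reliable.
OCR Confidence: How sure the OCR engine is that it read the characters correctly. A higher score means the text recognition is more accurate. The two are independent: a value can be read perfectly (high OCR confidence) and still be the wrong value for the field (low confidence).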
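Form Extractor Issue: The Form Extractor is position-based. It returns whatever text sits in the area you marked for the field, without checking whether that text is actually a city. If a different PDF has text in the same position, that text is extracted as the city. So the position-based extraction is working as designed; only the content varies.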
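Min Overlap Percentage: In layman's terms, it is how much of a word has to fall inside the box you drew for a field before that word counts as part of the field's value. With a high value, a word that only partly touches the box is ignored; with a low value, words that just clip the edge of the box are picked up as well.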
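
To make the min overlap idea concrete, here is a minimal Python sketch. It is not UiPath code: the `Box` type, the function names, and the coordinates are made up for illustration, and it assumes the overlap is measured as the share of a word's box that lies inside the field's box. Please check the Form Extractor documentation for the exact rule.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical rectangle in page coordinates."""
    left: float
    top: float
    width: float
    height: float


def overlap_percentage(word: Box, field: Box) -> float:
    """Percentage of the word box's area that lies inside the field box."""
    x_overlap = min(word.left + word.width, field.left + field.width) - max(word.left, field.left)
    y_overlap = min(word.top + word.height, field.top + field.height) - max(word.top, field.top)
    word_area = word.width * word.height
    if x_overlap <= 0 or y_overlap <= 0 or word_area == 0:
        return 0.0  # the boxes do not intersect (or the word box is empty)
    return 100.0 * (x_overlap * y_overlap) / word_area


def words_in_field(words, field: Box, min_overlap_percentage: float):
    """Keep only the words whose overlap with the field meets the threshold."""
    return [text for text, box in words
            if overlap_percentage(box, field) >= min_overlap_percentage]


# Illustrative "city" field area: x from 100 to 300, y from 200 to 220.
city_field = Box(left=100, top=200, width=200, height=20)

words = [
    ("Pune", Box(110, 202, 40, 16)),         # fully inside the field
    ("Maharashtra", Box(280, 202, 80, 16)),  # only a quarter of the word is inside
    ("India", Box(400, 202, 40, 16)),        # completely outside
]

print(words_in_field(words, city_field, 50))  # ['Pune']
print(words_in_field(words, city_field, 20))  # ['Pune', 'Maharashtra']
```

With the threshold at 50, only the word that sits fully inside the box is kept. Lowering it to 20 also pulls in the neighbouring word that clips the edge of the box, which is the kind of stray text a higher threshold is meant to keep out.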