Why does an ML Skill return different results when it is used for data extraction in the ML Extractor activity than when the same ML Skill is used for pre-labelling on the same document?
When the same ML Skill is run on the same document through the ML Extractor activity and through pre-labelling, the results can differ: the fields identified and the text extracted in Data Manager may not match the output of the ML Extractor.
The difference comes from OCR. Data Manager applies OCR to all documents, whereas the Digitize Document activity applies OCR only if the document contains images (unless ForceApplyOCR is set to True). If the ML extraction results are not satisfactory and the results need to be consistent, force OCR to be applied to documents in the Document Understanding (DU) process. This can be done in either of the following ways:
- Enable the ForceApplyOCR property in the Digitize Document activity
- Enable the UseServerSide property on the ML Extractor activity
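The OCR rule described above can be sketched as a small Python model. This is only an illustration of the decision logic, not UiPath code: the `Document` class, the `contains_images` flag, and all function names are hypothetical, and only the ForceApplyOCR behaviour is taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical stand-in for a document entering the DU process."""
    name: str
    contains_images: bool  # assumption: True for scans or PDFs with embedded images


def digitize_applies_ocr(doc: Document, force_apply_ocr: bool = False) -> bool:
    """Model of Digitize Document: OCR runs only for documents that contain
    images, unless ForceApplyOCR is set to True."""
    return force_apply_ocr or doc.contains_images


def data_manager_applies_ocr(doc: Document) -> bool:
    """Model of Data Manager: OCR is applied to all documents."""
    return True


def same_text_source(doc: Document, force_apply_ocr: bool = False) -> bool:
    """True when both paths read the document the same way (both via OCR)."""
    return digitize_applies_ocr(doc, force_apply_ocr) == data_manager_applies_ocr(doc)


native_pdf = Document("invoice_native.pdf", contains_images=False)
scanned_pdf = Document("invoice_scan.pdf", contains_images=True)

for doc in (native_pdf, scanned_pdf):
    print(doc.name, same_text_source(doc), same_text_source(doc, force_apply_ocr=True))
```

In this model, only the document without images is read differently by the two paths, and forcing OCR removes that difference, which is why the settings above make the results consistent.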
Read more on