Digitize Document: The extension '.xml' does not have a known content type defined

Hi Community

I am trying to use the Intelligent OCR activities to extract values from invoices.

However, when I use ‘Digitize Document’ on an .xml or .htm file, I receive this error:

‘Digitize Document: The extension ‘.xml’ does not have a known content type defined’

Is it simply not possible to digitize these formats as of now?

I have tried both the Microsoft and ABBYY OCR engines. Can the engine affect the outcome?

Hi @ChrisPals

I have used Intelligent OCR, but only for invoices stored as PDF or image files, not XML or HTM. Those worked fine. I don’t think the OCR engine has anything to do with this.
But have you tried extracting the values through ABBYY FlexiCapture directly into an Excel file to see whether that supports it?

If that works, then I think it shouldn’t be a problem to get that through UiPath.
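
In the meantime, one way to avoid the error is to check the file extension before the file reaches ‘Digitize Document’. Below is a minimal sketch of that check in Python (not a UiPath workflow). The extension list is an assumption based on the PDF and image files that worked for me, and the function name is made up for illustration, so please verify the list against the documentation for your package version.

```python
from pathlib import Path

# ASSUMPTION: extensions that Digitize Document accepts (PDF and common
# image formats). This list is illustrative, not taken from this thread.
ASSUMED_SUPPORTED = {
    ".pdf", ".png", ".jpg", ".jpeg", ".jpe", ".tif", ".tiff", ".bmp", ".gif",
}


def split_by_support(paths):
    """Split file paths into (supported, unsupported) by extension.

    The comparison is case-insensitive, so 'invoice.PDF' counts as '.pdf'.
    """
    supported, unsupported = [], []
    for p in paths:
        ext = Path(p).suffix.lower()
        if ext in ASSUMED_SUPPORTED:
            supported.append(p)
        else:
            unsupported.append(p)
    return supported, unsupported


if __name__ == "__main__":
    ok, skipped = split_by_support(["inv1.PDF", "inv2.xml", "inv3.htm", "scan.png"])
    print("digitize:", ok)
    print("handle separately:", skipped)
```

Files in the second list would need another route, for example parsing the XML directly, since it is already structured text and does not need OCR.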


When I use ‘Digitize Document’ on a .pdf and a .docx file, I receive this error:

‘Digitize Document: The extension ‘…docx’ does not have a known content type defined’