Digitize Document: The extension '.xml' does not have a known content type defined

ChrisPals · January 6, 2020, 9:05am

Hi Community

I am trying to use the Intelligent OCR activities to extract values from invoices.

However, when I use ‘Digitize Document’ on an xml or htm file, I receive this error:

‘Digitize Document: The extension ‘.xml’ does not have a known content type defined’

Is it simply not possible to digitize these formats as of now?

I have used the Microsoft and ABBYY OCR engines. Can the engine affect the outcome?

Lahiru.Fernando · January 8, 2020, 2:06pm

Hi @ChrisPals

I have used Intelligent OCR, but only for invoices stored as PDF or image files, not XML or HTM. Those worked fine. I don’t think the OCR engine has anything to do with this.
But have you tried extracting the values through ABBYY FlexiCapture directly into an Excel file to see whether it supports these formats?

If that works, then I think it shouldn’t be a problem to get them through UiPath.

kshenam · September 16, 2020, 8:22pm

When I use ‘Digitize Document’ on a .pdf and a .docx file, I receive this error:

‘Digitize Document: The extension ‘…docx’ does not have a known content type defined’

ydhanabalan · August 24, 2023, 10:27am

@kshenam
Have you found an answer to this? I have the same error. Can you please help me with this?
