Which file types are supported in the Digitize Document activity?

The documentation here includes .bmp as a valid file type to submit for Digitization:
Document Understanding - Digitization (uipath.com)

The documentation here does not include .bmp:
Activities - Digitize Document (uipath.com)

Which is correct? Should .bmp be included?

Hi Joseph Matuch,

  1. If you are using the Tesseract OCR engine, you need to convert your image (.png, .jpeg, .bmp, and so on) into an Image variable.
  2. Use the Load Image activity to load your image (.bmp) into the Image variable.
    Activities - Load Image

Cheers!

Thanks, but I'm just looking for an answer of A or B.

Which of these is the complete set of file extensions supported by the OCR activities in the Digitize stage of the framework?

A) .pdf, .png, .gif, .jpe, .jpg, .jpeg, .tiff, .tif, .bmp
OR
B) .pdf, .png, .gif, .jpe, .jpg, .jpeg, .tiff, .tif

Anyone able to say with confidence whether .bmp is accepted by the Digitize Document activity?

If need be, I can find one, run the process, and see whether I get an error, but I'm hoping someone has experience with this and can just answer yes or no.

Hi,

It seems .bmp is not supported by the Digitize Document activity, judging by the error I got when testing it.

As a workaround, we can convert the file to .png and then digitize it with the Digitize Document activity, as in the following sample.

Sample
Sample20240213-3.zip (41.3 KB)
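
For anyone who wants to try the convert-then-digitize idea outside of Studio, here is a rough Python sketch. It is only an illustration and not the contents of the attached sample: the names `bmp_to_png` and `convert_file` are made up for this post, and it assumes an uncompressed 24-bit .bmp so that it can stay within the standard library. In practice an imaging library would be the usual choice.

```python
import struct
import zlib
from pathlib import Path


def _png_chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def bmp_to_png(bmp: bytes) -> bytes:
    """Convert an uncompressed 24-bit BMP (as bytes) into PNG bytes."""
    if bmp[:2] != b"BM":
        raise ValueError("not a BMP file")
    pixel_offset = struct.unpack_from("<I", bmp, 10)[0]
    width, height = struct.unpack_from("<ii", bmp, 18)
    bpp, compression = struct.unpack_from("<HI", bmp, 28)
    if bpp != 24 or compression != 0:
        raise ValueError("only uncompressed 24-bit BMPs are handled here")

    top_down = height < 0              # negative height means rows run top to bottom
    height = abs(height)
    stride = (width * 3 + 3) & ~3      # each BMP row is padded to a multiple of 4 bytes

    rows = []
    for y in range(height):
        start = pixel_offset + y * stride
        row = bmp[start:start + width * 3]
        rgb = bytearray(row)
        rgb[0::3], rgb[2::3] = row[2::3], row[0::3]   # BMP stores BGR, PNG wants RGB
        rows.append(b"\x00" + bytes(rgb))             # PNG filter type 0 (none)
    if not top_down:
        rows.reverse()                 # BMP rows are normally stored bottom-up

    # IHDR: 8 bits per channel, colour type 2 (truecolour), no interlacing
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n"
            + _png_chunk(b"IHDR", ihdr)
            + _png_chunk(b"IDAT", zlib.compress(b"".join(rows)))
            + _png_chunk(b"IEND", b""))


def convert_file(src: str) -> str:
    """Write a .png next to the given .bmp and return the new path."""
    src_path = Path(src)
    dst_path = src_path.with_suffix(".png")
    dst_path.write_bytes(bmp_to_png(src_path.read_bytes()))
    return str(dst_path)
```

Calling `convert_file("scan.bmp")` (a hypothetical file name) would write `scan.png` alongside the original, and that .png path is what would then be passed on for digitization.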

Regards,

Thank you for testing that out for me, @Yoichi . Much appreciated.

EDIT: Such a simple solution, just having a string with .bmp at the end. I should have thought of that! Nice.

EDIT 2:
I tried this out myself today (2/23/24), and I actually did not see that error with a .bmp file (or .gif file, which I was also curious about).
I tried again with a completely different filetype I was sure would not be accepted (.HEIC), and it said “Digitize Document - Digitize: The extension ‘.HEIC’ does not have a known content type defined.”
I tried with .txt, and got the error message @Yoichi received earlier: “Digitize Document - Digitize: Unsupported content type: text/plain.” So I think with UiPath.IntelligentOCR.Activities 6.5.0, .bmp and .gif are accepted, even though this says they are not: Activities - Digitize Document (uipath.com)
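
The three results above (accepted, unknown extension, unsupported content type) look consistent with a two-step check: first map the extension to a content type, then check that content type against an accepted list. Here is a small Python sketch of that idea. It is a guess at the behavior and not UiPath's actual implementation: the tables and the `precheck` function are hypothetical, and the accepted list simply reflects what I observed with 6.5.0.

```python
import os

# Hypothetical extension-to-content-type table (an assumption for illustration).
CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".gif": "image/gif",
    ".jpe": "image/jpeg",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".tiff": "image/tiff",
    ".tif": "image/tiff",
    ".bmp": "image/bmp",
    ".txt": "text/plain",
}

# Content types treated as accepted, based on the tests described in this thread.
ACCEPTED_TYPES = {
    "application/pdf",
    "image/png",
    "image/gif",
    "image/jpeg",
    "image/tiff",
    "image/bmp",
}


def precheck(path: str) -> str:
    """Return the content type for path, or raise if it would likely be rejected."""
    ext = os.path.splitext(path)[1]
    content_type = CONTENT_TYPES.get(ext.lower())
    if content_type is None:
        # Step 1 failed: the extension is not in the lookup table at all.
        raise ValueError(
            f"The extension '{ext}' does not have a known content type defined."
        )
    if content_type not in ACCEPTED_TYPES:
        # Step 2 failed: the type is known but not one the activity accepts.
        raise ValueError(f"Unsupported content type: {content_type}.")
    return content_type
```

Running something like this before the Digitize Document activity would let a workflow route unknown or unsupported files to a conversion step instead of failing mid-process.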
