Image to Text Conversion

rizvana.mohammed · December 23, 2025, 2:10pm

I have a requirement where multiple PDF files containing images need to be converted into text. I attempted to achieve this using Azure OpenAI services with Python and invoking python from studio, but the extracted text is not consistently accurate.

nishi_jain · December 23, 2025, 2:18pm

Hi @rizvana.mohammed

Do NOT use Azure OpenAI alone for OCR.
Use Azure AI Document Intelligence (Read/Layout model) for text extraction.
Then optionally use OpenAI for summarizing or validating the text.
Hope it resolve your issue

Thanks & Happy Automations

prashant1603765 · December 23, 2025, 2:18pm

Hi @rizvana.mohammed

Try to use Read PDF with OCR or Document Understanding activities and apply an OCR engine such as Tesseract, Google, or Microsoft OCR or diff. diff check one by one -which are extract correctly… text, then send the extracted text to OpenAI for further processing if needed.

If helpful, mark as solution. Happy automation with UiPath

sayali.p · December 23, 2025, 2:20pm

Hi @rizvana.mohammed ,

Try this:

UiPath Document Understanding Approach OR
For Each PDF
→ Read PDF with OCR
→ OCR Engine: Microsoft Read OCR [Or any OCR]
→ Get Text
→ Post-process text

arjun.shiroya · December 23, 2025, 2:22pm

Hi, @rizvana.mohammed

Use UiPath’s Read PDF Text activity with OCR instead.

Drop Read PDF Text in your loop (for multiple PDFs).

Check Apply OCR = True.

Engine: Microsoft OCR

Tested this on scanned invoices 90%+ accuracy

MohammedShabbir · December 23, 2025, 3:53pm

You can simply use Document Understanding from UiPath, no need for python.
UiPath OCR in digitization stage does extraction accurately..

Alternately, use Read pdf with OCR.

rizvana.mohammed · December 23, 2025, 4:13pm

How is the pricing for Document Understanding

rizvana.mohammed · December 23, 2025, 4:20pm

Is Microsoft OCR is Microsoft Azure Computer Vision OCR.

If yes, it is asking for Endpoint and API Key . Which Azure service we need add for this

arjun.shiroya · December 23, 2025, 4:41pm

@rizvana.mohammed

Refer tto this video:

rizvana.mohammed · December 23, 2025, 4:59pm

Microsoft OCR is not visible for me . do you know what could be the reason

Maheep_Tiwari · December 23, 2025, 6:00pm

Hi @rizvana.mohammed ,

First convert the PDF images to text using a proper OCR engine (Azure Document Intelligence / Read PDF with OCR / Google Vision / Tesseract depending on quality).
Then pass the extracted text to Azure OpenAI only for cleanup, summarization, or validation.

If you are using UiPath, the recommended path is Document Understanding with a strong OCR engine (Azure OCR or OmniPage) for extraction, and use GenAI only after OCR. This separation is the only way to get stable and accurate results.

Maheep_Tiwari · December 23, 2025, 6:01pm

Microsoft OCR is not visible because the required package or prerequisites are missing. You need to install UiPath.MicrosoftAzure.Activities (or the newer Azure Computer Vision OCR package) and also be on a supported Studio version. In newer UiPath versions, Microsoft OCR does not show unless the Azure OCR activities are installed and configured, and sometimes it’s replaced by Document Understanding OCR options instead.

nishi_jain · December 24, 2025, 7:57am

Hi @rizvana.mohammed

If solution works for you please mark as solved so thread will be closed
Thanks

Monali_Vekariya · December 24, 2025, 9:59am

Hi @rizvana.mohammed
I faced a similar issue and found that Azure OpenAI is not meant for OCR, especially for image-based PDFs, so the text extraction will not be consistent.

The better approach is to use a proper OCR engine first (UiPath Document Understanding, Azure Form Recognizer, or Azure OCR) to convert the PDF images into text. Once you have clean text, you can then use Azure OpenAI for summarization, validation, or further processing.

In short, OpenAI should be used after OCR, not as a replacement for it.

Topic		Replies	Views
Some Best OCR API for converting Readable text PDF Studio studio , question , tools	1	707	October 9, 2022
How can we use Google cloud vision OCR & Microsoft Azure Vision OCR? UiPath Document Understanding Activities activities , question , document_understanding	2	1329	March 23, 2022
Extract information from PDF using Azure Open AI AI Center question , ai_center	3	322	September 25, 2024
PDF + Azure Form recognizer + azure OpenAI Activities pdf , activities , studio , question , azure , microsoft-azure-form-recognizer , openai , azure-openai	3	792	April 8, 2024
Send PDF as input to OpenAI Activities pdf , activities , question	1	122	September 23, 2024

Image to Text Conversion

Related topics