I have a requirement where multiple PDF files containing images need to be converted into text. I attempted to achieve this using Azure OpenAI services with Python and invoking python from studio, but the extracted text is not consistently accurate.
Do NOT use Azure OpenAI alone for OCR.
Use Azure AI Document Intelligence (Read/Layout model) for text extraction.
Then optionally use OpenAI for summarizing or validating the text.
Hope it resolve your issue
Thanks & Happy Automations
Try to use Read PDF with OCR or Document Understanding activities and apply an OCR engine such as Tesseract, Google, or Microsoft OCR or diff. diff check one by one -which are extract correctly… text, then send the extracted text to OpenAI for further processing if needed.
If helpful, mark as solution. Happy automation with UiPath
Hi @rizvana.mohammed ,
Try this:
- UiPath Document Understanding Approach OR
- For Each PDF
→ Read PDF with OCR
→ OCR Engine: Microsoft Read OCR [Or any OCR]
→ Get Text
→ Post-process text
Use UiPath’s Read PDF Text activity with OCR instead.
Drop Read PDF Text in your loop (for multiple PDFs).
Check Apply OCR = True.
Engine: Microsoft OCR
Tested this on scanned invoices 90%+ accuracy
You can simply use Document Understanding from UiPath, no need for python.
UiPath OCR in digitization stage does extraction accurately..
Alternately, use Read pdf with OCR.
How is the pricing for Document Understanding
Is Microsoft OCR is Microsoft Azure Computer Vision OCR.
If yes, it is asking for Endpoint and API Key . Which Azure service we need add for this
Hi @rizvana.mohammed ,
-
First convert the PDF images to text using a proper OCR engine (Azure Document Intelligence / Read PDF with OCR / Google Vision / Tesseract depending on quality).
-
Then pass the extracted text to Azure OpenAI only for cleanup, summarization, or validation.
If you are using UiPath, the recommended path is Document Understanding with a strong OCR engine (Azure OCR or OmniPage) for extraction, and use GenAI only after OCR. This separation is the only way to get stable and accurate results.
Microsoft OCR is not visible because the required package or prerequisites are missing. You need to install UiPath.MicrosoftAzure.Activities (or the newer Azure Computer Vision OCR package) and also be on a supported Studio version. In newer UiPath versions, Microsoft OCR does not show unless the Azure OCR activities are installed and configured, and sometimes it’s replaced by Document Understanding OCR options instead.
If solution works for you please mark as solved so thread will be closed
Thanks
Hi @rizvana.mohammed
I faced a similar issue and found that Azure OpenAI is not meant for OCR, especially for image-based PDFs, so the text extraction will not be consistent.
The better approach is to use a proper OCR engine first (UiPath Document Understanding, Azure Form Recognizer, or Azure OCR) to convert the PDF images into text. Once you have clean text, you can then use Azure OpenAI for summarization, validation, or further processing.
In short, OpenAI should be used after OCR, not as a replacement for it.
