Digitize Document - An unexpected error has occurred

nbjerke · January 18, 2019, 1:55pm

Scenario:

I am running a process that checks invoices in XML and PDF format. If the process does not find the right information in the XML, it checks the PDF. I am using the new “Digitize Document” activity with Google OCR. During development it worked like a charm. However, now that I am deploying the process to the test environment, the “Digitize Document” activity fails constantly with the message “Digitize Document: An unexpected error has occurred”.
The error message is not very descriptive and I am having trouble finding where the problem lies. Please help.

Steps to reproduce:

Read PDF documents with the Digitize Document activity and Google OCR

Current Behavior:

Throws the error "An unexpected error has occurred"

Expected Behavior:

Reading the pdf with google ocr and outputting the text that was found

Studio/Robot/Orchestrator Version: 2018.4.1

Update: It seems that the UiPath.Vision executable keeps crashing, which causes this activity to fail.
Any help on this? @ovi , @loginerror

This is the error that I get:

OCR Failed with error System.Exception at source UiPath.IntelligentOCR.Activities with message An unexpected error has occurred

venkat4u · February 11, 2019, 4:51am

Can you share your XAML file?

nbjerke · February 13, 2019, 9:04am

Hi, @venkat4u
Unfortunately I can't share the workflow for security reasons. I will attach screenshots of the activities and properties I am using. I have tried both Google OCR and Microsoft OCR.

With Google, the activity sometimes runs really slowly and can take around 15 minutes or more…

The first log message is when the pdf is downloaded and the second is when the activity is finished processing. When it takes this amount of time, the system times out and logs out resulting in a failed transaction.

With Microsoft OCR and Digitize Document, the OCR just keeps crashing: while processing 180 invoices, the activity failed 47 times. I have attached filtered logs from Orchestrator: Orchestrator Log.xlsx (11.8 KB)

This is the activity and the properties I am using:

Let me know if you need any other information

snoopy · March 26, 2019, 8:02am

I have the same problem: “Digitize Document: An unexpected error has occurred”.
I use the Google Cloud OCR engine, and the log says the request payload is too large, above 10 MB.
I think 10 MB is the limit for the OCR API.
But this is weird, because the PDF that fails is small and many larger PDFs work fine.

My workaround is to wrap it in a TryCatch and fall back to UiPath’s ‘Read PDF with OCR’ if it fails.
Digitize Document seems to give a better reading, but it is also clearly bugged.
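
The fallback described above can be sketched outside of Studio as a generic "try the better reader first, then the next one" pattern. This is only an illustration in Python: `digitize_document` and `read_pdf_with_ocr` are hypothetical stand-ins for the two activities, not real UiPath APIs, and the first one is hard-coded to fail so that the fallback path is visible.

```python
from typing import Callable, Sequence


def read_with_fallback(path: str, readers: Sequence[Callable[[str], str]]) -> str:
    """Try each reader in order and return the first successful result."""
    errors = []
    for reader in readers:
        try:
            return reader(path)
        except Exception as exc:  # broad on purpose: the failure is "unexpected"
            errors.append(f"{reader.__name__}: {exc}")
    # Every reader failed: report all collected messages together.
    raise RuntimeError(f"All readers failed for {path}: " + "; ".join(errors))


# Hypothetical stand-in for Digitize Document; always fails in this sketch.
def digitize_document(path: str) -> str:
    raise RuntimeError("An unexpected error has occurred")


# Hypothetical stand-in for Read PDF with OCR; always succeeds in this sketch.
def read_pdf_with_ocr(path: str) -> str:
    return f"text extracted from {path}"


text = read_with_fallback("test.pdf", [digitize_document, read_pdf_with_ocr])
print(text)
```

Collecting the error of each failed reader keeps the original message available for the logs even when the fallback succeeds or everything fails.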

Error:
UiPath.SmartData.Digitization.Tokenization.TokenizationException: test.pdf ----> System.Exception: Error performing OCR: GoogleCloudErrorInvalidResponse Request payload size exceeds the limit: 10485760 bytes.
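
The 10485760 bytes in the message is exactly 10 × 1024 × 1024. One possible explanation for a small PDF failing, not confirmed anywhere in this thread, is that the limit applies to the encoded request rather than to the file on disk. The sketch below assumes the content is sent base64-encoded (which inflates it by about a third) and computes how large the raw content can be under that assumption. How Digitize Document actually builds its request is not known here; if it sends rendered page images instead of the PDF itself, the file size on disk would not be the relevant number at all.

```python
LIMIT_BYTES = 10_485_760  # limit quoted in the error message (10 * 1024 * 1024)


def base64_size(raw_bytes: int) -> int:
    """Size in bytes after standard base64 encoding with padding."""
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((raw_bytes + 2) // 3)


def fits_limit(raw_bytes: int, limit: int = LIMIT_BYTES) -> bool:
    """True if the base64-encoded content alone stays within the limit."""
    return base64_size(raw_bytes) <= limit


# Largest raw size whose base64 form still fits (ignores any other request fields).
MAX_RAW_BYTES = LIMIT_BYTES // 4 * 3

print(MAX_RAW_BYTES, fits_limit(MAX_RAW_BYTES), fits_limit(MAX_RAW_BYTES + 1))
```

Under these assumptions the effective ceiling for the raw content is 7.5 MiB, not 10 MiB.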

cherose · May 8, 2019, 6:11am

This may be a really late reply to your question, but I just hit the same issue and started experimenting. Luckily, I solved it by changing the Scale property of the OCR engine. I think it has something to do with the image quality of your document. The scale value enlarges the document for better recognition, but keep in mind that the right value may vary by document type: if it is too large, it can also lead to misreading of similar-looking characters such as 0 and O.

Cheers!

Danancha_Perera · September 2, 2019, 4:00am

Hi friends, I found an easy solution for the above error: just update the UiPath.IntelligentOCR.Activities
and UiPath.MachineLearningExtractor.Activities packages, then take the Templateless Receipt/Invoice Extractor API key from platform.uipath.com and paste it into the ML Extractor.

wdag · August 21, 2020, 8:13am

This is really late, but I think that the Digitize Document activity doesn’t allow for scaling in the OCR engine.

I had set scale to 3, and got the same error message. When I removed that and used the default scale for the OCR engine, the error message went away.

Praveen_Venugopal · September 17, 2020, 12:23pm

Well, in my case digitizing works like a charm for PDF documents, whereas it shows this ‘Digitize Document : An unexpected error occurred’ message when I try it on JPG documents.

yrobert · February 2, 2021, 1:50pm

It seems that Digitize Document doesn’t support a Scale set to a high value. It can work with a value of 2, but generates the error with 3.
Start with the default value and see if the results are satisfactory.
Now, with Intelligent OCR, you can also use the OCR from the server, which is more precise.
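
A rough way to see why a Scale of 3 behaves so differently from 2: if Scale is a linear enlargement of both image dimensions (an assumption, not something confirmed in this thread), the pixel count grows with the square of the value. The page size below is a hypothetical A4 page rendered at 300 DPI, chosen only for illustration.

```python
def scaled_pixel_count(width: int, height: int, scale: float) -> int:
    """Pixel count after enlarging both dimensions by `scale`."""
    return round(width * scale) * round(height * scale)


# Hypothetical page: A4 rendered at 300 DPI.
WIDTH, HEIGHT = 2480, 3508
BASE_PIXELS = WIDTH * HEIGHT

for scale in (1, 2, 3):
    pixels = scaled_pixel_count(WIDTH, HEIGHT, scale)
    print(f"scale {scale}: {pixels / 1e6:.1f} megapixels, "
          f"{pixels // BASE_PIXELS}x the original")
```

Under that assumption, going from 2 to 3 more than doubles the image data again (4x versus 9x the original), which may be why higher values are more likely to hit a size limit like the one quoted earlier in the thread.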

Ferdinand · April 13, 2021, 3:54pm

Duude. So much testing, but updating the packages did the trick!

SaiSree_SUVERA · March 30, 2023, 6:29am

I am getting this error. Can anyone help?

Ishan_Shelke · March 30, 2023, 6:41am

Are you trying to read a PDF here?

If yes, then install the OMNIpageOCR activities package; this is the best I have used so far for extracting data from PDFs. After installing it, replace the UiPath Document OCR with the OmniPage OCR activity.

Hope this helps.

SaiSree_SUVERA · March 30, 2023, 6:53am

ok
In the Community Edition, can we extract a 5-page PDF?
