I am running a process which checks invoices in XML and PDF format. If the process does not find the right info in the XML, it checks the PDF. I am using the new “Digitize Document” activity with Google OCR. Throughout development it has been working like a charm. However, now that I am deploying the process to the test environment, the “Digitize Document” activity fails constantly with the message “Digitize Document: An unexpected error has occurred”.
The error message is not very descriptive and I am having trouble finding where the problem lies. Please help.
Steps to reproduce:
Read PDF documents with the Digitize Document activity and Google OCR
Current Behavior:
Throws the error "An unexpected error has occurred"
Expected Behavior:
Reads the PDF with Google OCR and outputs the text that was found
Studio/Robot/Orchestrator Version: 2018.4.1
Update: It seems that the UiPath.Vision executable keeps crashing, which causes this activity to fail.
Any help on this? @ovi, @loginerror
This is the error that I get:
OCR Failed with error System.Exception at source UiPath.IntelligentOCR.Activities with message An unexpected error has occurred
Hi, @venkat4u
Unfortunately I can't share the workflow due to security reasons. I will attach screenshots of the activities and properties I am using. I have tried both Google OCR and Microsoft OCR.
With Google, the activity sometimes runs really slowly and can take 15 minutes or more…
The first log message is when the PDF is downloaded and the second is when the activity has finished processing. When it takes this amount of time, the system times out and logs out, resulting in a failed transaction.
With Microsoft OCR and Digitize Document, the OCR just keeps crashing: while processing 180 invoices, the activity failed 47 times. I have attached filtered logs from Orchestrator: Orchestrator Log.xlsx (11.8 KB)
I have the same problem. “Digitize Document: An unexpected error has occurred”.
I use the Google Cloud OCR engine, and the log says the size of the payload is too large, above 10 MB.
I think 10 MB is the limit for the OCR API.
But this is weird, because the PDF that fails is small and many larger PDFs work fine.
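One possible explanation, which is an assumption and not something confirmed in this thread, is that the limit applies to the encoded request rather than to the file on disk. Base64 encoding grows data by about a third, and a rendered page image can be much larger than the PDF it came from. Below is a minimal Python sketch of the size arithmetic only. `LIMIT_BYTES` is the 10 MB figure from the log message, and the function names are hypothetical.

```python
import base64

# Assumed limit, taken from the "above 10 MB" log message in this thread.
# The exact value enforced by the OCR API is not confirmed here.
LIMIT_BYTES = 10 * 1024 * 1024


def base64_size(raw_size: int) -> int:
    """Length in bytes of the padded base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def fits_in_request(raw_size: int, limit: int = LIMIT_BYTES) -> bool:
    """True if the base64-encoded payload stays within the limit."""
    return base64_size(raw_size) <= limit


if __name__ == "__main__":
    # Sanity check of the formula against the standard library.
    sample = b"x" * 1000
    assert base64_size(len(sample)) == len(base64.b64encode(sample))

    for mb in (5, 7, 8, 9):
        raw = mb * 1024 * 1024
        print(f"{mb} MB raw -> {base64_size(raw)} bytes encoded, "
              f"fits: {fits_in_request(raw)}")
```

If this is what happens, an image of around 8 MB would already be over the limit once encoded, even though the file itself looks smaller than 10 MB.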
My workaround is to wrap the activity in a Try Catch and use UiPath’s ‘Read PDF with OCR’ if it fails.
Digitize Document seems to give a better reading, but it is also clearly bugged.
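For anyone who wants to see the fallback idea outside of Studio, here is a rough Python sketch of the same pattern: try the preferred reader first and move on to the next one only when it throws. The two reader functions are hypothetical stand-ins, not UiPath APIs; in a real workflow they would be the Digitize Document and Read PDF with OCR activities inside a Try Catch.

```python
from typing import Callable, Sequence, Tuple


def read_with_fallback(
    path: str,
    engines: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, reader) in order; return (name, text) of the first success."""
    errors = []
    for name, reader in engines:
        try:
            return name, reader(path)
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all OCR engines failed: " + "; ".join(errors))


# Hypothetical stand-ins for the two activities, for illustration only.
def digitize_document(path: str) -> str:
    raise RuntimeError("An unexpected error has occurred")


def read_pdf_with_ocr(path: str) -> str:
    return f"text extracted from {path}"


if __name__ == "__main__":
    engine, text = read_with_fallback(
        "invoice.pdf",
        [("Digitize Document", digitize_document),
         ("Read PDF with OCR", read_pdf_with_ocr)],
    )
    print(engine, "->", text)
```

Recording which engine produced the text is useful, because the fallback reading may be of lower quality and worth flagging for review.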
This may be a really late reply to your question, but I just ran into the same issue and started experimenting. Luckily, I solved it by changing the Scale property of the OCR engine. I think it has something to do with the image quality of your document. The scale value enlarges the document for better recognition, but keep in mind that the right value may vary by document type: if it is too large, it can also lead to misreading of otherwise obvious characters, like 0 and O.
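To get a feel for what the scale does to the amount of data, here is a small Python sketch. It assumes the scale multiplies both the width and the height of the page image, which is my reading of "enlarges the document" and is not confirmed for this activity. Under that assumption the pixel count grows with the square of the scale.

```python
from typing import Tuple


def scaled_dimensions(width: int, height: int, scale: float) -> Tuple[int, int]:
    """Pixel dimensions after enlarging both sides by the same factor."""
    return round(width * scale), round(height * scale)


def pixel_count(width: int, height: int, scale: float = 1.0) -> int:
    """Total number of pixels in the scaled image."""
    w, h = scaled_dimensions(width, height, scale)
    return w * h


if __name__ == "__main__":
    # An A4 page rendered at 300 DPI is roughly 2480 x 3508 pixels.
    base_w, base_h = 2480, 3508
    for scale in (1, 2, 3):
        w, h = scaled_dimensions(base_w, base_h, scale)
        print(f"scale {scale}: {w} x {h} = {pixel_count(base_w, base_h, scale):,} pixels")
```

So going from scale 1 to scale 3 means roughly nine times as many pixels to process, which may explain both slower runs and failures at higher values.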
Hi friends, I found an easy solution for the above error: just update UiPath.IntelligentOCR.Activities
and UiPath.MachineLearningExtractor.Activities, then take the Templateless Receipt/Invoice Extractor API key from platform.uipath.com and paste it into the ML Extractor.
Well, in my case the digitizing works like a charm for PDF documents, whereas it shows this ‘Digitize Document : An unexpected error occurred’ message when I try it on JPG documents.
It seems that Digitize Document doesn’t support a scale set to a high value. It works with a value of 2, but generates the error with 3.
Start with the default value and see if the results are satisfying.
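The "start low and only go up if needed" approach can be sketched in Python as below. Everything here is hypothetical: `run_ocr` stands for whatever performs the OCR at a given scale, `is_satisfying` is your own quality check, and a default scale of 1 is an assumption. The fake engine simply mimics the behaviour described above, failing at 3.

```python
from typing import Callable, Iterable, Optional, Tuple


def first_satisfying_scale(
    run_ocr: Callable[[float], str],
    is_satisfying: Callable[[str], bool],
    scales: Iterable[float] = (1, 2, 3),
) -> Optional[Tuple[float, str]]:
    """Return (scale, text) for the lowest scale whose result passes the check."""
    for scale in scales:
        try:
            text = run_ocr(scale)
        except Exception:
            # The engine failed at this scale; larger values are unlikely
            # to fare better, so stop raising it.
            break
        if is_satisfying(text):
            return scale, text
    return None


# Hypothetical engine: misreads at scale 1, reads well at 2, crashes at 3.
def fake_ocr(scale: float) -> str:
    if scale >= 3:
        raise RuntimeError("An unexpected error has occurred")
    return "Invoice total" if scale >= 2 else "Inv0ice t0tal"


if __name__ == "__main__":
    print(first_satisfying_scale(fake_ocr, lambda t: "Invoice total" in t))
```

Stopping at the first acceptable result keeps the image as small as possible, which should also help with processing time.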
With Intelligent OCR, you can now also use the OCR from the server, which is more precise.
If yes, then install the OmniPage OCR activities package. It is the best I have used so far for extracting data from PDFs. After installing it, replace the UiPath Document OCR with the OmniPage OCR activity.