Read PDF with OCR (Microsoft Cloud OCR) and Try Catch

Hello,

I am facing the following issue:

I am using the Read PDF with OCR (Microsoft Cloud OCR) activity in a while loop, on each page of a scanned PDF. Everything works fine until it throws an exception. That alone is not a big problem, because I don’t need information from every page. The problem is that after I catch the exception, the OCR engine stops working properly. As you can see in the screenshot below, it works fine until it hits an exception. After that, the following pages (6-21) are ‘read’ in about a second each, and the result is wrong.

The text the OCR extracts from pages 6 to 21 is the same as the text from page 4, the last good extraction before the failure. Also, when I try to read the failing page on its own, it throws an error as well.

Why are the Try Catch and Read PDF with OCR activities not working properly? Am I doing something wrong?

Thanks,
Tibi

Hey @Tiberiu_Niculescu

Just to confirm: what exactly is not working? The Try Catch?
If so, just change the exception type in your Catch block to System.Exception.

If it is something else, let me know.

Regards…!!
Aksh

The problem is not the type of exception. The Try Catch activity seems to be working properly. The problem is that after the Read PDF with OCR activity throws an error that I catch, the OCR engine stops reading the file. It just ‘flies’ over the rest of the pages.

->there’s an exception on page 5, I catch it

->after that, the output of pages 6-21 is wrong
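
One possible explanation for the repeated page-4 text, offered as an assumption and not a confirmed diagnosis: if the output variable is only assigned when the OCR call succeeds, a caught exception leaves the previous page's text in it. The sketch below shows the same loop shape in Python, resetting the output at the start of every iteration. `ocr_page` and `fake_ocr` are hypothetical stand-ins for the Read PDF with OCR activity, not UiPath APIs.

```python
from typing import Callable, Dict, Optional


def read_pages(ocr_page: Callable[[int], str], page_count: int) -> Dict[int, Optional[str]]:
    """Run OCR page by page, recording None for pages that fail."""
    results: Dict[int, Optional[str]] = {}
    for page in range(1, page_count + 1):
        # Reset the output so a failed page cannot reuse the previous page's text.
        text: Optional[str] = None
        try:
            text = ocr_page(page)
        except Exception as exc:
            # Log and skip: not every page is needed.
            print(f"page {page}: OCR failed ({exc}); skipping")
        results[page] = text
    return results


def fake_ocr(page: int) -> str:
    # Hypothetical stand-in that fails on page 5, as in the case described above.
    if page == 5:
        raise RuntimeError("simulated OCR failure")
    return f"text of page {page}"
```

In a workflow, the equivalent would be assigning an empty value to the output variable before the OCR activity inside the loop. This does not explain why the later pages finish in a second, so it may only hide a deeper engine problem.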

OK. As you mentioned, you are also getting a problem when reading that page individually?

There are two things. First, I would really like to see the error message or log you are getting.

Second, can you share the PDF if it is not confidential, or even just the single page where you are getting the error?

And most important: have you tried Google OCR and the local Microsoft OCR?

Regards…!!
Aksh

Server stack trace:
at UiPath.Vision.VisionClient.ScrapeUsingHostService(OCRInput input, OCROptions options, CancellationToken cancelToken)
at UiPath.Vision.VisionClient.ScrapeImage(OCRInput input, OCROptions options, CancellationToken cancelToken, Boolean useHostProcess)
at UiPath.Vision.UiImage.ScrapeOCR(OCROptions options, CancellationToken cancellationToken)
at UiPath.Core.Activities.OCREngineActivity.<>c__DisplayClass36_0.b__0()
at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object args, Object server, Object& outArgs)
at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink)

Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase)
at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData& msgData)
at System.Func1.EndInvoke(IAsyncResult result)
at UiPath.Core.Activities.OCREngineActivity.EndExecute(AsyncCodeActivityContext context, IAsyncResult result)
at System.Activities.AsyncCodeActivity1.System.Activities.IAsyncCodeActivity.FinishExecution(AsyncCodeActivityContext context, IAsyncResult result)
at System.Activities.AsyncCodeActivity.CompleteAsyncCodeActivityData.CompleteAsyncCodeActivityWorkItem.Execute(ActivityExecutor executor, BookmarkManager bookmarkManager)

The page looks like this. I can read it with google OCR.

Tiberiu, can you send me the workflow and the PDF sample, please? You can private message me.

I would, but it contains very sensitive information :frowning:

Is there any way you can edit it and send just a sample that reproduces the issue? I would like to help, but I need that sample to investigate.