OCR Error (traineddata v3.04)


#1

Scenario: Build a simple OCR activity using Google OCR to read Chinese text.

Steps to reproduce: Get OCR Text -> Google OCR -> output

Current Behavior: Throws an AggregateException.

Expected Behavior: The text is returned.

Studio/Robot/Orchestrator Version: 2017.1.6435

Last stable behavior:
Last stable version:
OS Version: Windows 8.1
Others if Relevant (workflow, logs, .NET version, service pack, etc.):
It works with chi_sim.traineddata (v3.04)

Main has thrown an exception

Source: Google OCR

Message: One or more errors occurred.

Exception Type: AggregateException

System.AggregateException: One or more errors occurred. ---> System.Exception: Error scraping using host process. Service is not available. ---> System.ServiceModel.CommunicationException: There was an error reading from the pipe: Unrecognized error 109 (0x6d). ---> System.IO.PipeException: There was an error reading from the pipe: Unrecognized error 109 (0x6d).
at System.ServiceModel.Channels.PipeConnection.FinishSyncRead(Boolean traceExceptionsAsErrors)
at System.ServiceModel.Channels.PipeConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
--- End of inner exception stack trace ---

Server stack trace:
at System.ServiceModel.Channels.PipeConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
at System.ServiceModel.Channels.DelegatingConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
at System.ServiceModel.Channels.SessionConnectionReader.Receive(TimeSpan timeout)
at System.ServiceModel.Channels.SynchronizedMessageSource.Receive(TimeSpan timeout)
at System.ServiceModel.Channels.TransportDuplexSessionChannel.Receive(TimeSpan timeout)
at System.ServiceModel.Channels.TransportDuplexSessionChannel.TryReceive(TimeSpan timeout, Message& message)
at System.ServiceModel.Dispatcher.DuplexChannelBinder.Request(Message message, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at UiPath.Vision.Hosting.Service.IHostService.Scrape(OCRInput input, OCROptions options, OCROutput& output)
at UiPath.Vision.Hosting.HostServiceProxy.Scrape(OCRInput input, OCROptions options, OCROutput& output)
--- End of inner exception stack trace ---
at UiPath.Vision.Hosting.HostServiceProxy.Scrape(OCRInput input, OCROptions options, OCROutput& output)
at UiPath.Vision.VisionClient.<>c__DisplayClass12_0.b__0()
at System.Threading.Tasks.Task`1.InnerInvoke()
at System.Threading.Tasks.Task.Execute()
--- End of inner exception stack trace ---

Server stack trace:
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at UiPath.Vision.VisionClient.ScrapeUsingHostService(OCRInput input, OCROptions options, CancellationToken cancelToken)
at UiPath.Vision.VisionClient.ScrapeImage(OCRInput input, OCROptions options, CancellationToken cancelToken, Boolean useHostProcess)
at UiPath.Vision.UiImage.ScrapeOCR(OCROptions options, CancellationToken cancellationToken)
at UiPath.Core.Activities.OCREngineActivity.<>c__DisplayClass36_0.b__0()
at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Object[]& outArgs)
at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink)

Exception rethrown at [0]:
at UiPath.Core.Activities.GetOCRText.OnScrapeFaulted(NativeActivityFaultContext faultContext, Exception propagatedException, ActivityInstance propagatedFrom)
at System.Activities.Runtime.FaultCallbackWrapper.Invoke(NativeActivityFaultContext faultContext, Exception propagatedException, ActivityInstance propagatedFrom)
at System.Activities.Runtime.FaultCallbackWrapper.FaultWorkItem.Execute(ActivityExecutor executor, BookmarkManager bookmarkManager)
---> (Inner Exception #0) System.Exception: Error scraping using host process. Service is not available. ---> System.ServiceModel.CommunicationException: There was an error reading from the pipe: Unrecognized error 109 (0x6d). ---> System.IO.PipeException: There was an error reading from the pipe: Unrecognized error 109 (0x6d).
at System.ServiceModel.Channels.PipeConnection.FinishSyncRead(Boolean traceExceptionsAsErrors)
at System.ServiceModel.Channels.PipeConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
--- End of inner exception stack trace ---

Server stack trace:
at System.ServiceModel.Channels.PipeConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
at System.ServiceModel.Channels.DelegatingConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
at System.ServiceModel.Channels.SessionConnectionReader.Receive(TimeSpan timeout)
at System.ServiceModel.Channels.SynchronizedMessageSource.Receive(TimeSpan timeout)
at System.ServiceModel.Channels.TransportDuplexSessionChannel.Receive(TimeSpan timeout)
at System.ServiceModel.Channels.TransportDuplexSessionChannel.TryReceive(TimeSpan timeout, Message& message)
at System.ServiceModel.Dispatcher.DuplexChannelBinder.Request(Message message, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at UiPath.Vision.Hosting.Service.IHostService.Scrape(OCRInput input, OCROptions options, OCROutput& output)
at UiPath.Vision.Hosting.HostServiceProxy.Scrape(OCRInput input, OCROptions options, OCROutput& output)
--- End of inner exception stack trace ---
at UiPath.Vision.Hosting.HostServiceProxy.Scrape(OCRInput input, OCROptions options, OCROutput& output)
at UiPath.Vision.VisionClient.<>c__DisplayClass12_0.b__0()
at System.Threading.Tasks.Task`1.InnerInvoke()
at System.Threading.Tasks.Task.Execute()<---

[Attached images: workflow, google_ocr, tessdata]


#2

May I ask how you installed the language pack?


#3

Hi,
Please check


#4

I have the same problem. I followed the instructions in the referenced thread and restarted UiPath.
When I scrape text from the following image:
[image]

The action just failed:

Does the version matter?
The current version is 3.05.01. Are the language files compatible across different Tesseract versions?
If so, how can I upgrade the Tesseract that ships with UiPath?


#5

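I downloaded the Chinese language pack and created a new folder under the installation directory, then copied the downloaded files into it. But the Chinese language option still does not appear. Why is that?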


#6

For some languages, such as Arabic and Hindi, you also need to download the cube files.
You can access these files from here.

Download the file, place it in the tessdata folder under the UiPath installation directory (e.g. C:\Program Files (x86)\UiPath Platform\tessdata), and restart UiPath Studio.
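
For anyone who prefers to script the copy step, here is a minimal Python sketch. The function name install_traineddata is made up for illustration, and the folder is just the example path from this post; your installation directory may differ. It only copies the file, so you still need to restart Studio afterwards.

```python
import shutil
from pathlib import Path


def install_traineddata(source_file, tessdata_dir):
    """Copy one downloaded language file into the tessdata folder.

    Hypothetical helper: it returns the path of the copied file and
    creates the tessdata folder if it does not exist yet.
    """
    source = Path(source_file)
    target_dir = Path(tessdata_dir)
    if not source.is_file():
        raise FileNotFoundError(f"Downloaded file not found: {source}")
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 keeps the file's timestamps and returns the destination path
    return Path(shutil.copy2(source, target_dir / source.name))


# Example call (paths are assumptions, adjust them to your machine):
# install_traineddata(
#     r"C:\Users\me\Downloads\chi_sim.traineddata",
#     r"C:\Program Files (x86)\UiPath Platform\tessdata",
# )
```

Writing under Program Files normally requires an elevated (administrator) prompt.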


#7

There are two things you can try:

  1. Restart UiPath
  2. Switch to an older version of chi_sim.traineddata (e.g. 3.0.2; download it from https://github.com/tesseract-ocr/tesseract/wiki/Data-Files)
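
If you want to try the second option without losing the file you already have, a small Python sketch along these lines could do the swap. The names swap_traineddata and installed_languages are hypothetical helpers, not part of UiPath or Tesseract, and the script does not check which data-file version you downloaded.

```python
import shutil
from pathlib import Path


def swap_traineddata(older_file, tessdata_dir):
    """Replace a language file in tessdata with an older download.

    Hypothetical helper: the existing file (if any) is first copied to
    <name>.bak in the same folder. Returns the backup path, or None if
    there was nothing to back up.
    """
    older = Path(older_file)
    target = Path(tessdata_dir) / older.name
    backup = None
    if target.exists():
        backup = target.with_name(target.name + ".bak")
        shutil.copy2(target, backup)
    shutil.copy2(older, target)
    return backup


def installed_languages(tessdata_dir):
    """List the language codes that have a .traineddata file in the folder."""
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))
```

Calling installed_languages on the tessdata folder is also a quick way to confirm that chi_sim is actually there before restarting UiPath.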