Installing another language for Google OCR

ocr
studio
language

#1

I downloaded the trained data language file (chi_tra) from https://github.com/tesseract-ocr/tessdata/tree/3.04.00 and saved it to the tessdata folder under the UiPath installation directory, e.g. C:\Users\yuans\AppData\Local\UiPath\app-17.1.6522\tessdata (I use the Community version), but it didn’t work with the Google OCR engine. I set the language to chi_tra. The error message is below. Any ideas?

Main has thrown an exception

Source: Google OCR

Message: One or more errors occurred.

Exception Type: AggregateException

System.AggregateException: One or more errors occurred. ---> System.Exception: Error scraping using host process. Service is not available. ---> System.ServiceModel.CommunicationException: There was an error reading from the pipe: Unrecognized error 109 (0x6d). ---> System.IO.PipeException: There was an error reading from the pipe: Unrecognized error 109 (0x6d).
at System.ServiceModel.Channels.PipeConnection.FinishSyncRead(Boolean traceExceptionsAsErrors)
at System.ServiceModel.Channels.PipeConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
--- End of inner exception stack trace ---

Server stack trace:
at System.ServiceModel.Channels.PipeConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
at System.ServiceModel.Channels.DelegatingConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
at System.ServiceModel.Channels.SessionConnectionReader.Receive(TimeSpan timeout)
at System.ServiceModel.Channels.SynchronizedMessageSource.Receive(TimeSpan timeout)
at System.ServiceModel.Channels.TransportDuplexSessionChannel.Receive(TimeSpan timeout)
at System.ServiceModel.Channels.TransportDuplexSessionChannel.TryReceive(TimeSpan timeout, Message& message)
at System.ServiceModel.Dispatcher.DuplexChannelBinder.Request(Message message, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at UiPath.Vision.Hosting.Service.IHostService.Scrape(OCRInput input, OCROptions options, OCROutput& output)
at UiPath.Vision.Hosting.HostServiceProxy.Scrape(OCRInput input, OCROptions options, OCROutput& output)
--- End of inner exception stack trace ---
at UiPath.Vision.Hosting.HostServiceProxy.Scrape(OCRInput input, OCROptions options, OCROutput& output)
at UiPath.Vision.VisionClient.<>c__DisplayClass12_0.b__0()
at System.Threading.Tasks.Task`1.InnerInvoke()
at System.Threading.Tasks.Task.Execute()
--- End of inner exception stack trace ---

Server stack trace:
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at UiPath.Vision.VisionClient.ScrapeUsingHostService(OCRInput input, OCROptions options, CancellationToken cancelToken)
at UiPath.Vision.VisionClient.ScrapeImage(OCRInput input, OCROptions options, CancellationToken cancelToken, Boolean useHostProcess)
at UiPath.Vision.VisionClient.Scrape(OCRInput input, OCROptions options, CancellationToken cancelToken, Boolean useHostProcess)
at UiPath.Vision.UiImage.ScrapeOCR(OCROptions options, CancellationToken cancellationToken)
at UiPath.Core.Activities.OCREngineActivity.<>c__DisplayClass36_0.b__0()
at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Object[]& outArgs)
at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink)

Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase)
at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData& msgData)
at System.Func`1.EndInvoke(IAsyncResult result)
at UiPath.Core.Activities.OCREngineActivity.EndExecute(AsyncCodeActivityContext context, IAsyncResult result)
at System.Activities.AsyncCodeActivity`1.System.Activities.IAsyncCodeActivity.FinishExecution(AsyncCodeActivityContext context, IAsyncResult result)
at System.Activities.AsyncCodeActivity.CompleteAsyncCodeActivityData.CompleteAsyncCodeActivityWorkItem.Execute(ActivityExecutor executor, BookmarkManager bookmarkManager)
---> (Inner Exception #0) System.Exception: Error scraping using host process. Service is not available. ---> System.ServiceModel.CommunicationException: There was an error reading from the pipe: Unrecognized error 109 (0x6d). ---> System.IO.PipeException: There was an error reading from the pipe: Unrecognized error 109 (0x6d).
at System.ServiceModel.Channels.PipeConnection.FinishSyncRead(Boolean traceExceptionsAsErrors)
at System.ServiceModel.Channels.PipeConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
--- End of inner exception stack trace ---

Server stack trace:
at System.ServiceModel.Channels.PipeConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
at System.ServiceModel.Channels.DelegatingConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
at System.ServiceModel.Channels.SessionConnectionReader.Receive(TimeSpan timeout)
at System.ServiceModel.Channels.SynchronizedMessageSource.Receive(TimeSpan timeout)
at System.ServiceModel.Channels.TransportDuplexSessionChannel.Receive(TimeSpan timeout)
at System.ServiceModel.Channels.TransportDuplexSessionChannel.TryReceive(TimeSpan timeout, Message& message)
at System.ServiceModel.Dispatcher.DuplexChannelBinder.Request(Message message, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at UiPath.Vision.Hosting.Service.IHostService.Scrape(OCRInput input, OCROptions options, OCROutput& output)
at UiPath.Vision.Hosting.HostServiceProxy.Scrape(OCRInput input, OCROptions options, OCROutput& output)
--- End of inner exception stack trace ---
at UiPath.Vision.Hosting.HostServiceProxy.Scrape(OCRInput input, OCROptions options, OCROutput& output)
at UiPath.Vision.VisionClient.<>c__DisplayClass12_0.b__0()
at System.Threading.Tasks.Task`1.InnerInvoke()
at System.Threading.Tasks.Task.Execute()<---


#2

Hi,
Try these:

  • Could you install an older version of the tessdata files and try again? (Make sure to restart Studio or the machine afterwards.)

  • For some languages you also need to download the cube files. You can access these files from here


#3

Actually, I installed an older version and then restarted the computer, and it still didn’t work. There is no cube file for Chinese. Any other ideas? Thanks.


#4

I can see a file for Chinese:
https://github.com/tesseract-ocr/tessdata/blob/3.04.00/chi_tra.traineddata


#5

That is the file I downloaded, but it didn’t work. Thanks.


#6

I downloaded all the files under https://github.com/tesseract-ocr/tessdata/blob/3.04.00/. Some other languages, such as tha, seem to work, but chi_tra and chi_sim do not. Thanks.


#7

Try from this link.
Looping in @Gabriel_Tatu in case that doesn’t work.


#8

Thanks for the link. I downloaded the latest version (4.00) of chi_tra, but it still didn’t work. Which version should I choose for UiPath? I’m currently using UiPath Community version 2017.1.6522. Any ideas?


#9

Please try with 33; I’ll add the link as soon as I can.


#10

@Gabriel_Tatu, on GitHub there are only 3.04 and 4.00; there is no link for 3.3. Could you please show me the link? Thanks a lot.


#12

I tested the 3.02 version, and it should work for Traditional Chinese. Thanks for everyone’s support.
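
For anyone repeating these steps, here is a minimal sketch of a sanity check that the language file really is in the tessdata folder before running the workflow. It is plain Python, not part of UiPath; the function names are made up for illustration, and the path is simply the example from post #1. It only checks that a non-empty <lang>.traineddata file is present. It cannot tell whether the file's version is compatible with the OCR engine, which turned out to be the actual problem in this thread.

```python
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return the language codes that have a .traineddata file in tessdata_dir."""
    folder = Path(tessdata_dir)
    if not folder.is_dir():
        return []
    # "chi_tra.traineddata" -> "chi_tra"
    return sorted(p.stem for p in folder.glob("*.traineddata"))


def has_language(tessdata_dir, lang):
    """True if <lang>.traineddata exists in tessdata_dir and is not empty."""
    target = Path(tessdata_dir) / f"{lang}.traineddata"
    return target.is_file() and target.stat().st_size > 0


if __name__ == "__main__":
    # Example path taken from post #1; adjust it to your own installation.
    tessdata = r"C:\Users\yuans\AppData\Local\UiPath\app-17.1.6522\tessdata"
    print("Installed languages:", installed_languages(tessdata))
    print("chi_tra present:", has_language(tessdata, "chi_tra"))
```

If chi_tra shows up here but Google OCR still fails with the pipe error, the file version is the next thing to check.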