Hi all,
I am currently trying to read a PDF using Intelligent OCR. The problem is that the document contains both Chinese and English. Does anybody have a way to read the whole document in one go? Thanks!
You would need to read the file in two passes, one per language, to get the data. There is no activity that handles both languages simultaneously.
Hi @kahoyim
First, check your taxonomy file, located at DocumentProcessing\taxonomy.json, to see whether both languages are listed among the supported languages.
After that, I would recommend telling your OCR engine that you are trying to recognize two languages. You can set that in the Language field of the Properties panel.
Here is an example of how to add more supported languages to the Tesseract OCR engine.
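For reference, outside of UiPath the Tesseract engine itself accepts several language codes joined with `+` (for example `chi_sim+eng`), provided the matching traineddata files are installed. Below is a minimal Python sketch using the `pytesseract` wrapper. The helper names `build_lang_spec` and `ocr_mixed_language` are made up for illustration, and whether the Language field of the UiPath activity accepts the same `+` syntax is an assumption you would need to verify in your own version.

```python
from typing import Iterable


def build_lang_spec(languages: Iterable[str]) -> str:
    """Join Tesseract language codes with '+', the separator Tesseract expects."""
    codes = [code.strip() for code in languages if code.strip()]
    if not codes:
        raise ValueError("at least one language code is required")
    return "+".join(codes)


def ocr_mixed_language(image_path: str, languages: Iterable[str]) -> str:
    """Run one OCR pass over an image using all given languages together."""
    # Imported lazily so build_lang_spec works without the OCR dependencies.
    # Requires the pytesseract and Pillow packages plus a local Tesseract
    # install with the traineddata file for every requested language.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image, lang=build_lang_spec(languages))
```

A call such as `ocr_mixed_language("page_1.png", ["chi_sim", "eng"])` would then read a page image in a single pass, where `page_1.png` is a hypothetical file exported from the PDF. Use `chi_tra` instead of `chi_sim` for Traditional Chinese.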
Hi Andres,
Thanks for your reply. How can I add two languages in the Properties panel? It keeps giving me an error.
Thanks
Did you manage to resolve the issue of handling two languages?