Hi,
I need to extract data from an image-type PDF. The OCR activities are not working and I am getting errors. Please help me with that.
Thanks
Use Read PDF With OCR, change the Scale and the image DPI, and try different OCR engines such as Tesseract or OmniPage.
Hi @abivanth.r
Use the Read PDF With OCR activity. By default the Scale is 2, and you can enter a value from 0 to 5. You can change the Profile too. For scanned PDFs, keeping Profile as None and Scale at 2 usually works, but not in every case. So start the Scale at 0 and increase it in steps of 0.5, with Profile set to None or Scan, and check which combination extracts the data most accurately.
If the Tesseract OCR engine doesn't work, you can switch to the OmniPage OCR engine. For this you need to install the UiPath.OmniPage.Activities dependency. This engine also has the Profile setting, including Scan; set it accordingly and check. Larger Scale values tend to fail with the OmniPage OCR engine.
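The sweep described above can be written down as a plain list of settings to try one by one. This is only a minimal Python sketch, not UiPath code: the helper name `candidate_settings` and the profile labels are illustrative assumptions, and the range 0 to 5 in steps of 0.5 is taken from the advice above.

```python
from itertools import product


def candidate_settings(start=0.0, stop=5.0, step=0.5, profiles=("None", "Scan")):
    """Return every (profile, scale) pair to try, in order.

    Hypothetical helper: the values would be entered by hand in the
    Read PDF With OCR properties; nothing here calls UiPath.
    """
    count = int(round((stop - start) / step)) + 1
    scales = [round(start + i * step, 2) for i in range(count)]
    return list(product(profiles, scales))


settings = candidate_settings()
for profile, scale in settings:
    print(f"Profile={profile}, Scale={scale}")
```

Working through the list in order makes it easier to note which combination was tested and what it returned.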
Hope you understand!!
What do the Profile property and the Scale property do? I can read the PDF now, but even though it is in portrait orientation, the extracted data is wrong.
Scale values start from 1. You can increase them in small steps (1.2, 1.3, …) until you get the data properly. Also try changing the image DPI, for example to 96, 150 or 256.
This improves the accuracy of the data extraction.
Hi,
The PDF contains Chinese or a related language, but I am only getting some English data. How can I fix that?
Use OmniPage OCR. It extracts the PDF data more accurately than the other OCR engines, and it can extract both Chinese and English data if you slightly adjust the image DPI and Scale.
Install the UiPath.OmniPage.Activities package from Manage Packages.
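To check whether a given run actually returned Chinese text rather than English only, one option is to measure the share of Chinese characters in the extracted string. This is a rough Python sketch under stated assumptions: the OCR output has been copied or saved as a plain string, the function name `cjk_ratio` is made up for illustration, and only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) is counted.

```python
def cjk_ratio(text):
    """Fraction of non-whitespace characters in the basic CJK ideograph block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    cjk = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return cjk / len(chars)


# Made-up sample string, for illustration only.
print(cjk_ratio("Invoice 发票 2023"))
```

A ratio near zero on a page that visibly contains Chinese suggests the engine or its settings are skipping those characters.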
Thanks man, it worked. But the data is still inaccurate.
Try different Scale values until you get the data properly. But remember that OCR activities do not extract data with 100% accuracy, so keep track of which Scale gives you the best extraction and stick with it.
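One way to keep track of which Scale works best is to type the text of one sample page by hand and compare each OCR result against it. The sketch below is in Python and is only an illustration: `best_scale` is a hypothetical helper, the sample strings are invented, and the similarity measure is `difflib.SequenceMatcher` from the Python standard library, not anything UiPath provides.

```python
from difflib import SequenceMatcher


def best_scale(results, reference):
    """Return (scale, similarity) for the OCR output closest to the reference.

    results:   dict mapping a Scale value to the text the OCR returned for it
    reference: the same text typed by hand from one sample page
    """
    scored = {
        scale: SequenceMatcher(None, reference, text).ratio()
        for scale, text in results.items()
    }
    winner = max(scored, key=scored.get)
    return winner, scored[winner]


# Made-up sample outputs, for illustration only.
sample_results = {
    1.0: "lnvoice N0 1234",
    1.5: "Invoice No 1234",
    2.0: "Invoice No 12Z4",
}
scale, score = best_scale(sample_results, "Invoice No 1234")
print(scale, round(score, 3))
```

A similarity of 1.0 means the OCR text matches the hand-typed reference exactly; lower values point to misread characters.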
I hope this helps.