When I try to fetch the name of the person (see image) using OCR, I get extra characters mixed in with it, like gK gE gN gY gO gN gg WgE gI gS gS gN gA gT ggggggg instead of KENYON WEISSNAT. How can I fix this issue?
Try changing the OCR engine, for example Tesseract OCR, Microsoft OCR, etc.
Try other OCR engines.
Hi,
First, try the OCR activities and adjust the scale: start at 1 and increase it until the data is read correctly. Otherwise, try different OCR engines.
Thank you
I tried all the engines and get the same result. As far as I can tell, each text box is being read as ‘g’.
The accuracy seems fine, I guess, but each text box comes through as ‘g’ alongside the real value. Is there any property to skip those text boxes?
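If no engine setting removes the boxes, one workaround is to clean the string after OCR. Below is a minimal Python sketch, not a UiPath property. It assumes the noise is always a lowercase ‘g’ and that the real name is all uppercase, as in the sample above; a name containing a genuine lowercase ‘g’ would be damaged. The function name `strip_box_noise` is made up for illustration.

```python
def strip_box_noise(raw: str, noise: str = "g") -> str:
    """Remove the stray box character from OCR output and rebuild the words.

    Assumption: every occurrence of `noise` is a misread text box, and a
    token made up only of noise marks a gap between two words.
    """
    words = []
    current = []
    for token in raw.split():
        cleaned = token.replace(noise, "")
        if cleaned:
            # Token still holds real characters: keep them in the current word.
            current.append(cleaned)
        elif current:
            # Token was pure noise: treat it as a word boundary.
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return " ".join(words)


sample = "gK gE gN gY gO gN gg WgE gI gS gS gN gA gT ggggggg"
print(strip_box_noise(sample))  # KENYON WEISSNAT
```

The same idea can probably be expressed in a UiPath Assign with string or regex functions; the Python version is only meant to show the logic.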
Try the OmniPage OCR engine.
Use Read PDF With OCR with an OCR engine such as Tesseract OCR or OmniPage OCR to get more accurate results. For OmniPage OCR, install the UiPath.OmniPage.Activities package.
You can try changing the scaling too.
Hope it helps!!
Regards,
Use Tesseract OCR and try changing the scale to get better results.
I’d suggest trying different OCR engine parameters.
For example:
- Language: You can specify the language(s) for OCR recognition.
- OCR Engine Version: Depending on the UiPath Studio version and OCR activities used, you might have the option to choose between different Tesseract OCR engine versions.
- Page Segmentation Mode: This parameter helps in determining how Tesseract should interpret the layout and structure of the text on the page. Options may include Auto, Single Column, Single Block, Sparse Text, and more.
- Image Preprocessing Options: UiPath may offer some built-in image preprocessing options to improve OCR accuracy. These could include resizing, deskewing, noise reduction, and contrast adjustment.
- Character Recognition Options: You might be able to configure Tesseract’s character recognition settings, such as enabling or disabling specific character types (digits, punctuation, symbols, etc.).
Remember that OCR accuracy can be affected by the image quality, font type, and layout of the table, so it may require some trial and error to find the best approach.
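To make the parameters above concrete, here is a small Python sketch that builds a command line for the standalone Tesseract CLI with a language, a page segmentation mode, and a character whitelist. This is an assumption-heavy illustration: I don't know whether the UiPath Tesseract activity exposes the same options, the file name `name_field.png` and the helper `build_tesseract_args` are hypothetical, and depending on the Tesseract version the whitelist may be ignored by the LSTM engine.

```python
import string


def build_tesseract_args(image_path, lang="eng", psm=7,
                         whitelist=string.ascii_uppercase):
    """Build the argument list for the `tesseract` command-line tool.

    psm=7 asks Tesseract to treat the image as a single line of text.
    The whitelist restricts output to the given characters, so a lowercase
    'g' cannot be emitted (the boxes may still be misread as something else).
    """
    args = ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]
    if whitelist:
        args += ["-c", f"tessedit_char_whitelist={whitelist}"]
    return args


# Hypothetical cropped image containing only the name field.
print(build_tesseract_args("name_field.png"))
```

On a machine with Tesseract installed, the list can be passed to `subprocess.run(args, capture_output=True, text=True)` to get the recognized text.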
If the configuration doesn’t help, try the ‘tabular’ Python library via the ‘invoke python code’ activity (the idea is to extract the name as a table).
Good Luck.
Cheers.