Extracting data from an OMR sheet in JPG image format

Hi, I want to extract data from an OMR sheet that contains both native and handwritten text. I am attaching the image below. I have tried several OCR engines, including Tesseract, OmniPage, Google, and UiPath OCR, without success. Can you suggest something?

@Arvind_Malik I tried something similar, and the standard OCR engines in UiPath can’t detect the information in images like this.

I think you should test Document Understanding / Clipboard / AI Forms. The standard Studio capabilities won’t help here.

Hello @Arvind_Malik

  1. Preprocess Image:
  • Enhance image quality by adjusting brightness, contrast, or applying filters.
  2. Use OCR Engines:
  • Try OCR engines such as Tesseract (with LSTM) or Google Cloud Vision OCR for both native and handwritten text.
  3. OCR in Sections:
  • If possible, OCR distinct sections separately.
  4. Data Validation:
  • Validate and correct OCR results, especially for handwritten text.
  5. Consider Handwriting Recognition Tools:
  • Explore specialized handwriting recognition tools or APIs.
  6. Custom OCR Models:
  • Train custom OCR models if needed, using tools like Tesseract with LSTM training.
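
As a rough illustration of the preprocessing and section-by-section steps above, here is a minimal Python sketch. It is only a sketch under assumptions: the image is represented as a nested list of grayscale values (0 to 255) so that it runs without any dependencies, whereas on a real JPG you would load and crop the pixels with an imaging library. The function names, the box layout `(top, left, height, width)`, and the thresholds are all hypothetical. It also assumes that the filled bubbles on an OMR sheet can be read by measuring how dark each region is rather than by OCR; the cropped text sections could instead be passed to whichever OCR engine you are testing.

```python
def stretch_contrast(gray):
    """Linearly rescale pixels so the darkest becomes 0 and the brightest 255."""
    flat = [p for row in gray for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image, nothing to stretch
        return [row[:] for row in gray]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in gray]


def binarize(gray, threshold=128):
    """Return 1 for dark (ink) pixels and 0 for light (paper) pixels."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def crop(img, top, left, height, width):
    """Cut one rectangular section out of the image."""
    return [row[left:left + width] for row in img[top:top + height]]


def fill_ratio(binary):
    """Fraction of pixels in a binarized region that are ink."""
    total = sum(len(row) for row in binary)
    return sum(map(sum, binary)) / total if total else 0.0


def read_bubbles(gray, boxes, min_fill=0.5):
    """Preprocess once, then decide per section whether it is marked.

    `boxes` maps a hypothetical field name to (top, left, height, width).
    """
    binary = binarize(stretch_contrast(gray))
    return {name: fill_ratio(crop(binary, *box)) >= min_fill
            for name, box in boxes.items()}


# Synthetic low-contrast "scan": left half pencil-grey, right half paper.
sheet = [[140] * 4 + [200] * 4 for _ in range(4)]
boxes = {"A": (0, 0, 4, 4), "B": (0, 4, 4, 4)}
print(read_bubbles(sheet, boxes))  # {'A': True, 'B': False}
```

In this toy example the contrast stretch matters: with the fixed threshold of 128 and no stretch, the grey value 140 would count as paper and bubble A would be missed.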

Thanks & Cheers!!!

Hi PeCour,
I already tried the Intelligent DU capabilities, and they did not work either.
