Scanned image PDF

Can we read a PDF document that contains scanned images? The PDF will have a scanned image attached as the last page, and the other pages will be native text.

Hi @Sachin_001

Use the Read PDF With OCR activity; it can read scanned PDFs.

Regards

Hi @Sachin_001 ,

You can use the PDF activities to extract native text. For non-native text, use them with the Apply OCR option checked.

If you just want to extract the images from the PDF, use the activity below.

@Sachin_001

  1. Use “Read PDF with OCR”:

    • PDFPath: “path/to/your/pdf/file.pdf”
    • OCR Engine: Tesseract OCR
  2. Use “Read PDF Text”:

    • PDFPath: “path/to/your/pdf/file.pdf”
    • Page Range: “1” (or the range of pages containing native text)
  3. Use “Read PDF With OCR” for the last page:

    • PDFPath: “path/to/your/pdf/file.pdf”
    • Page Range: “Last” (or the page number of the last page)
    • OCR Engine: Tesseract OCR
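
To make the split in steps 2 and 3 concrete, here is a minimal Python sketch of how the two page ranges could be worked out once the page count is known. It is only an illustration: `split_page_ranges` is a hypothetical helper, not a UiPath activity, and it assumes the scanned image is always the last page, as described in the question.

```python
def split_page_ranges(page_count: int) -> tuple[str, str]:
    """Return (native_range, ocr_range) as page-range strings.

    Assumption: the scanned image is always the last page of the file.
    """
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count == 1:
        # Only the scanned page exists, so there is nothing to read natively.
        return "", "1"
    # Every page before the last one is treated as native text.
    native = "1" if page_count == 2 else f"1-{page_count - 1}"
    return native, str(page_count)


native_range, ocr_range = split_page_ranges(5)
print(native_range)  # 1-4
print(ocr_range)     # 5
```

An empty native range here means "skip the native-text step"; that convention belongs to this sketch only.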

@Sachin_001

Yes, you can read a PDF document that contains scanned images in UiPath. However, extracting text from scanned images (images that are not text-selectable) typically involves OCR (Optical Character Recognition) technology. UiPath provides activities to work with OCR engines for extracting text from images.
You have to use the Read PDF With OCR activity with the Tesseract OCR engine. Tesseract is free and, in my view, the best OCR engine.
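
If you do not know in advance which pages are scanned, one common approach is to try native extraction first and fall back to OCR only when a page returns no text. The Python sketch below shows that logic. The two reader functions are stand-ins supplied by the caller (hypothetical, not UiPath or Tesseract APIs), so treat it as an outline rather than a finished implementation.

```python
from typing import Callable, List


def read_pages(
    page_count: int,
    read_native: Callable[[int], str],
    read_ocr: Callable[[int], str],
) -> List[str]:
    """Read each page natively and fall back to OCR when no text comes back."""
    pages = []
    for page in range(1, page_count + 1):
        text = read_native(page)
        if not text.strip():
            # No selectable text on this page, so treat it as a scanned image.
            text = read_ocr(page)
        pages.append(text)
    return pages


# Stand-in readers for a 3-page file whose last page is a scan.
native_pages = {1: "Invoice 1001", 2: "Total: 250.00", 3: ""}
ocr_pages = {3: "Signed delivery note"}

result = read_pages(3, lambda p: native_pages[p], lambda p: ocr_pages[p])
print(result)
```

With this shape, OCR runs only on pages that need it, which matters because OCR is usually much slower than native text extraction.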

If you have any questions about OCR, I will be happy to answer them.

Thank you

Can we measure whether the quality of a scanned document is GOOD or BAD?

We receive small-sized scanned logistics documents, which we use for extraction, reporting, and data entry.
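
One rough way to approach this, sketched in Python: if the OCR engine exposes a per-word confidence score (Tesseract reports one from 0 to 100), the average can serve as a GOOD/BAD signal. The `scan_quality` helper and the threshold of 60 are assumptions made for illustration and would need tuning against your own documents.

```python
from statistics import mean
from typing import Sequence


def scan_quality(confidences: Sequence[float], threshold: float = 60.0) -> str:
    """Label a scan GOOD or BAD from per-word OCR confidences (0-100)."""
    # Tesseract reports -1 for entries that are not recognized words; skip them.
    words = [c for c in confidences if c >= 0]
    if not words:
        # Nothing was recognized at all, so treat the scan as unusable.
        return "BAD"
    return "GOOD" if mean(words) >= threshold else "BAD"


print(scan_quality([96, 91, 88, -1, 94]))  # GOOD
print(scan_quality([35, 42, -1, 51]))      # BAD
```

A low average does not say why a scan is poor (low resolution, skew, noise), only that the OCR engine was unsure about what it read.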