Read PDF Text not working

I want to extract data from a PDF into a text file using Read PDF Text, but I have tried all of my PDF files and none of them returns any data.

The image shows a workflow with two steps: reading text from a PDF file named "Fujiprint_charge.pdf" and writing the extracted text to a file named "output.txt." (Captioned by AI)

Hi @flashdrive07

If the PDF is a scanned file, use the Read PDF With OCR activity.

Regards,


@flashdrive07,

Check whether your PDF file is a digital PDF or a scanned one. In a digital PDF you should be able to select, copy, and paste the text. A scanned PDF contains only images of the pages.

Use the Read PDF Text activity for a digital PDF.
Use the Read PDF With OCR activity for a scanned PDF.
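
If you would rather check this outside UiPath than by copy-pasting, here is a rough Python sketch of the same test. It is only an illustration and not part of any UiPath activity. The function names and the 20-character threshold are assumptions made up for this example, and the optional helper relies on the third-party pypdf package.

```python
from typing import Iterable, List, Optional


def classify_pages(page_texts: Iterable[Optional[str]], min_chars: int = 20) -> str:
    """Return "digital" if any page holds at least min_chars non-whitespace
    characters of extractable text, otherwise "scanned".

    min_chars is an arbitrary threshold chosen for this sketch.
    """
    for text in page_texts:
        # Drop all whitespace so blank or near-blank pages do not count as text.
        stripped = "".join((text or "").split())
        if len(stripped) >= min_chars:
            return "digital"
    return "scanned"


def extract_page_texts(pdf_path: str) -> List[str]:
    """Read the text layer of each page. Assumes pypdf is installed
    (pip install pypdf); it is imported here so classify_pages works without it."""
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


if __name__ == "__main__":
    import sys

    # Usage: python check_pdf.py path/to/file.pdf
    if len(sys.argv) > 1:
        print(classify_pages(extract_page_texts(sys.argv[1])))
```

If it prints "scanned", the file most likely has no text layer, so Read PDF Text has nothing to return and Read PDF With OCR is the one to try.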

Thanks,
Ashok :slight_smile:


@ashokkarale

Is PDF OCR available in StudioX without using Microsoft OCR?

@flashdrive07,

Yes, it is bundled in UiPath.PDF.Activities.

Thanks,
Ashok :slight_smile:

@flashdrive07

The image shows an interface for reading a PDF with OCR using Tesseract OCR, indicating required fields with red "!" icons. (Captioned by AI)

We can use Tesseract OCR, UiPath Document OCR, or OmniPage OCR.

Can you show me some basic properties of Tesseract OCR, or how to use it?

@flashdrive07,

Refer to this documentation:

https://docs.uipath.com/activities/other/latest/ui-automation/google-ocr

Thanks,
Ashok :slightly_smiling_face:

Thank you, @ashokkarale.

By the way, can you identify the problem here? I am running Tesseract OCR.

The image is an error dialog box indicating a runtime execution error in a "Single Excel Process Scope" with a message that states an OCR error due to a missing Emgu.CV assembly file or dependency. (Captioned by AI)


Hi Pete!

Please share the workflow and a sample document with me, and we will try to reproduce the issue you are facing. E-mail me at paul.boca@uipath.com

Thank you!