PDFs that are not encrypted unable to be read

chris.bartkewicz · April 7, 2025, 9:11pm

I have a PDF that opens a form with a large table. I need to extract one number from the PDF. The PDF is not encrypted, but when I run a test with the “Read PDF Text” activity and print the output to a message box, the strings come out looking like encrypted characters and shapes.

Am I doing something wrong, or is this just a result of the formatting or font in the PDF? It shouldn't matter if I am reading the PDF only as text, right? It should still yield all the strings in order, regardless of formatting, right?

Yoichi · April 7, 2025, 11:28pm

Hi,

To isolate the cause, can you try copying text from the PDF file using a PDF viewer such as Adobe Reader or the Chrome browser, then pasting it into Notepad? If the pasted text is not correct, the PDF may have some kind of font-based copy protection applied.
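If you want to run the same check on text you have already extracted, here is a rough Python sketch of the idea. It is only a heuristic under my own assumptions: the function names and the 0.3 threshold are made up for illustration, and it only catches private-use, unassigned, control, and replacement characters. Text that is garbled into ordinary-looking but wrong letters or symbols would still pass, so the manual copy and paste test is the more reliable one.

```python
import unicodedata


def suspicious_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that look like font-mapping debris."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    bad = 0
    for c in chars:
        category = unicodedata.category(c)
        # Co = private use, Cn = unassigned, Cc = control characters.
        # U+FFFD is the Unicode replacement character.
        if category in ("Co", "Cn", "Cc") or c == "\ufffd":
            bad += 1
    return bad / len(chars)


def looks_garbled(text: str, threshold: float = 0.3) -> bool:
    """Hypothetical cutoff: flag text when more than 30% of it is suspicious."""
    return suspicious_ratio(text) > threshold
```

If this flags the output of a plain text read, the PDF's font probably does not map its glyphs to real Unicode characters, and text extraction alone will not recover the content.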

Regards,

prashant1603765 · April 8, 2025, 3:01am

Hi @chris.bartkewicz

It’s likely due to non-standard fonts or scanned content. In such cases, the “Read PDF with OCR” activity in UiPath is more effective than “Read PDF Text.” Try different OCR engines, such as Tesseract or Microsoft OCR, to improve results.
If the PDF has a complex layout or table, consider using Document Understanding and the “Digitize Document” activity, followed by regex or form extractors to pull the specific number you need.
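For the regex step, a minimal sketch in Python of pulling one number that follows a known label in the OCR output. The label "Total Due", the helper name, and the sample text are assumptions for illustration only; the real label depends on your form. The same pattern idea should carry over to a regex activity in a workflow, but I have not tested it there.

```python
import re
from typing import Optional


def extract_number_after_label(text: str, label: str) -> Optional[str]:
    """Return the first non-negative number that follows `label`, or None."""
    # Label, optional colon, then digits with optional thousands
    # separators and an optional decimal part.
    pattern = re.escape(label) + r"\s*:?\s*(\d[\d,]*(?:\.\d+)?)"
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return None
    # Drop thousands separators so the result can be parsed as a number.
    return match.group(1).replace(",", "")


# Hypothetical OCR output from the form.
sample = "Item A 10\nTotal Due: 1,234.56\nPage 1"
value = extract_number_after_label(sample, "Total Due")
```

OCR can misread characters (for example "O" for "0"), so it is worth validating the extracted value before using it.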

If you found this helpful, please mark it as a solution. Thanks.
Happy Automation with UiPath
