Unable to read or copy & paste text from a PDF generated by a PDF printer


#1

Hi,

I have a requirement to split or merge PDFs and have PDFCreator installed on my machine.

However, I found that whenever I split a PDF or merge several PDFs, the output PDF looks fine visually, but when you try to read the text with UiPath or simply copy & paste it, the characters come out unreadable.

E.g. after merging a few PDF reports into a bigger file and then trying to read the text “Handling AE”, the value becomes “, ea-”.

I thought it was an encoding issue, but I tried converting between different encoding combinations and all of them failed; an online encoding detection tool also failed.

Btw, using the same printer to print to PDF from any other source file (web, doc, ppt, xls) works fine.

Any idea?

Regards,
Ben Wong


#2

@ben.wong What is the UiPath version you are using?
You could try changing the PDF preferences. Press Ctrl+K in the PDF window. In the Preferences window, try setting “Use Overprint Preview” and “Show reference XObject targets” to “Always”.

Also check: Unable to Identify Elements in Machine Readable PDF


#3

@Madhavi thanks for your suggestion but it doesn’t work.

I tried 3 different PDF printers (PDFCreator, Microsoft Print to PDF and PrimoPDF) all with the same issue.

I have attached sample files. The first one was generated by printing an Excel file to PDF; the second one was generated by printing the first PDF again to another PDF (the real-life case would be splitting or merging PDFs).
First.pdf (78.2 KB)

Second.pdf (79.3 KB)

You can see they look exactly the same, but when you try to scrape or manually copy any character from Second.pdf, you will see that the values are corrupted characters.
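
The comparison described above can also be done programmatically once the text of both files has been extracted. Below is a minimal sketch, assuming the text of First.pdf and Second.pdf is already available as plain strings (for example from a read/scrape step); the function names `text_similarity` and `looks_corrupted` and the 0.8 threshold are made up for illustration and are not part of UiPath or any PDF tool.

```python
from difflib import SequenceMatcher


def text_similarity(reference: str, candidate: str) -> float:
    """Return a 0..1 similarity ratio between two extracted text strings."""
    # Collapse runs of whitespace so layout differences do not count as changes.
    ref = " ".join(reference.split())
    cand = " ".join(candidate.split())
    if not ref and not cand:
        return 1.0
    return SequenceMatcher(None, ref, cand).ratio()


def looks_corrupted(reference: str, candidate: str, threshold: float = 0.8) -> bool:
    """Flag the candidate when it differs too much from the known-good text.

    The threshold is an arbitrary assumption; tune it for real documents.
    """
    return text_similarity(reference, candidate) < threshold


if __name__ == "__main__":
    good = "Handling AE"  # text as read from the original PDF (post #1)
    bad = ", ea-"         # text as read from the re-printed PDF (post #1)
    print(looks_corrupted(good, good))
    print(looks_corrupted(good, bad))
```

This only detects that the re-printed file no longer yields the same text; it does not repair it.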

This is probably not a UiPath problem, but I would still like to know how to handle it if anyone has ideas.
Thanks.


#4

@ben.wong

Below are the read results for the PDFs you sent.
results.zip (1.2 KB)

I get the same results whether I scrape the text or copy it manually.


#5

@ben.wong

Please refer to the issue logged on the Adobe forum: https://forums.adobe.com/thread/427945

For them, the issue was resolved by using the CutePDF printer.


#6

@Madhavi

Thanks for the Adobe reference. I will try the CutePDF approach. I can’t install an additional virtual PDF printer on my client or company machine, but I will try it on my own laptop and see whether I can get approval from the client to install it in their environment.