PDF encoding issues following DU process

Hi there,

I’ve been having issues at work where a small number of PDFs either lose additional text fields or come out garbled following document validation. What’s curious is that the document validation itself presents the data correctly and it goes to a performer just fine; only the archived PDF file is affected.

When I inspect the fonts in the PDFs, I can see very clearly they’ve been converted from their prior version (Type 1 or Type 3) to TrueType, and their file sizes also change, which makes me believe UiPath is saving the PDF again at some point. I cannot see where that would be in the flow, so I wanted to ask if the UiPath Document OCR might be rebuilding the PDF or if there’s any other part of the process that could be overwriting the original file, and if so, how to troubleshoot.

Thanks for the help!

Eóin

Hello @eoin.dooley

The UiPath OCR engine does not modify the original PDF, but Validation Station or downstream activities can.

Your issue most likely comes from either:

  1. The “Save Validated Document” setting in Validation Station, or
  2. Some hidden PDF manipulation activity in your workflow.

To troubleshoot:

  a. Compare file hashes and timestamps to pinpoint where the file is being rebuilt.
  b. If you only need to archive the original, avoid letting UiPath open the PDF for editing; just copy it as a binary file.
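
If it helps, here is a rough Python sketch of the hash/timestamp comparison and the binary copy. It is not UiPath code, just a standalone script you could run against the file between stages; the function names and the paths in the usage note are made up for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in binary chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(path):
    """Record hash, size and modified time so two stages can be compared."""
    info = Path(path).stat()
    return {
        "sha256": sha256_of(path),
        "size": info.st_size,
        "mtime": info.st_mtime,
    }


def archive_unmodified(source, destination):
    """Copy the file byte for byte and verify that the copy is identical."""
    before = sha256_of(source)
    shutil.copyfile(source, destination)  # raw byte copy, no PDF parsing
    if sha256_of(destination) != before:
        raise RuntimeError("archived copy differs from the original")
    return before
```

Take a `snapshot("input/invoice.pdf")` (hypothetical path) before the document enters the workflow, then again after each stage, for example after digitization, after validation, and after archiving. The first stage where `sha256` or `size` changes is where the PDF is being rewritten. If the archive only needs the original, `archive_unmodified` copies it without anything opening it as a PDF.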

@eoin.dooley

Welcome to the community

Ideally nothing should change it. Are you uploading to Action Center and retrieving the file from there?

Can you elaborate or show your flow?

When you say the font type changes, where do you see that change?

Cheers