Non-searchable PDF to searchable PDF without the use of 3rd party app

Hi everyone,

One of the steps in a new business process we are about to automate is the conversion of a non-searchable PDF to a searchable one. Is there a way to do this without using any third-party application (like Adobe Acrobat Reader DC)? My first guess was to use the OCR activity, but this returns a string, which I cannot export to a PDF. We already experimented with Acrobat DC, but it is not the best application to use in combination with UiPath (the same issues have already been described on this forum).

Thanks a lot!

This custom activity package can help you convert a non-searchable PDF to a searchable one:
https://connect.uipath.com/community/project/pua-virtual-acrobat-dc-pdf-activities

The prerequisite is that you need to have Acrobat Pro DC installed on your Robot machine.

Thank you, Serena. How can I install this in UiPath? I don’t see the package.

You can download the package from here:
https://github.com/s3r3n3/PDF_Activties

Hi @MTS,

To make your PDF searchable using UiPath, please follow the steps below:

  1. Read the PDF with OCR.
  2. Save the text extracted by this activity in a variable (pdfText in the code below).
  3. Use an Invoke Code activity.
  4. Write the C# code below to place the text extracted from the scanned PDF into the PDF’s “Keywords” metadata field. Once done, the PDF can be found by searching for the keywords stored in that field. Note that this only adds document metadata; it does not add a text layer to the pages.

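// Body of the Invoke Code activity. pdfText and insertedWordCount are
// assumed to be declared as In and Out arguments of the activity.
string path = "";                               // folder that contains the PDF
PdfReader reader = new PdfReader(path + "");    // append the input file name
// Append the output file name; it should differ from the input file,
// because the reader may still be reading the input while the stamper writes.
PdfStamper stamper = new PdfStamper(reader, new FileStream(path + "", FileMode.Create));
var info = reader.Info;
info["Keywords"] = pdfText;                     // pdfText holds the text extracted in step 1
stamper.MoreInfo = info;
stamper.FormFlattening = true;
stamper.Close();                                // writes and closes the output file
reader.Close();
insertedWordCount = info["Keywords"].Length;    // number of characters stored, not words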

Also, you will need to import the namespaces iTextSharp.text.pdf and iTextSharp.text.xml.xmp (plus System.IO for FileStream and FileMode, if it is not already imported).
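
To check that the keywords were actually written, you can read the metadata back from the output file. This is only a minimal sketch, assuming iTextSharp 5.x (where reader.Info is a Dictionary<string, string>); the class name KeywordCheck and the command-line argument are hypothetical and just make the example self-contained:

```csharp
using System;
using iTextSharp.text.pdf;

// Hypothetical helper class, not part of iTextSharp or UiPath.
class KeywordCheck
{
    static void Main(string[] args)
    {
        // args[0] is assumed to be the full path of the stamped output PDF.
        PdfReader reader = new PdfReader(args[0]);
        try
        {
            string keywords;
            // Info holds the document information dictionary (Title, Author, Keywords, ...).
            if (reader.Info.TryGetValue("Keywords", out keywords))
            {
                Console.WriteLine("Keywords length: " + keywords.Length);
            }
            else
            {
                Console.WriteLine("No Keywords entry found.");
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Inside an Invoke Code activity you would only need the body of Main, with the file path passed in as an argument instead of args[0].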

Hope this helps.

Regards
Sonali