Hello friends, I need some help. I am running OCR (Microsoft OCR) on a PDF, but I can't export the result as a searchable PDF. I don't want to extract the data; I want the same PDF to become searchable.
If you have an Adobe license, these steps should help:
- Use a START PROCESS activity and pass the file path of the PDF as input to the FileName property.
- This opens the file in the foreground.
- Then use a GET ACTIVE WINDOW activity and, inside it, a SEND HOT KEY activity with only a key set and no element chosen for the selector.
- First send a hotkey with Alt as the key.
- Then another with F as the key.
- Then one more with T as the key (the shortcut for Export; I am not sure about this one, so please verify it yourself).
- Then send W as a hotkey, followed by another hotkey that chooses Word document.
- Finally, use a TYPE INTO activity with the file path, and a SEND HOT KEY activity with Enter as the key.
Hello, I do not have an Adobe license. I will most likely go with what you are recommending, but how could I implement it? Convert the PDF to Microsoft Word with OCR, and then Microsoft Word back to PDF?
For PDF to Word:
- Use a START PROCESS activity and pass the file path of the PDF as input to the FileName property.
- This opens the file in the foreground.
- Then use a GET ACTIVE WINDOW activity and, inside it, a SEND HOT KEY activity with only a key set and no element chosen for the selector.
- First send a hotkey with Alt as the key.
- Then another with F as the key.
- Then one more with T as the key (the shortcut for Export; I am not sure about this one, so please verify it yourself).
- Then send W as a hotkey, followed by another hotkey that chooses Word document.
- Finally, use a TYPE INTO activity with the file path, and a SEND HOT KEY activity with Enter as the key.
And for Word to PDF:
In Studio → Design tab → Manage Packages → Official tab → search for UiPath.Word.Activities.
That package includes the Export to PDF activity.
Hi, could you help me with some code to get a searchable PDF after running the OCR? I have only been using UiPath for a couple of days. I have attached the project, Read PDF files - Example.zip (132.4 KB), in case you can help me complete it. Thank you.
To make your PDF searchable using UiPath, follow the steps below.
1. Use the Read PDF With OCR activity.
2. Save the text extracted by this activity in a variable.
3. Use an Invoke Code activity.
4. Write the C# code below to place the text extracted from the scanned PDF into the PDF's "Keywords" metadata field. Once done, the PDF can be found by searching on the keywords stored in that field. Note that this stores the text as document metadata; it does not add a text layer to the pages themselves.
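// Needs the namespaces System.IO and iTextSharp.text.pdf.
string path = "";  // folder path (left blank in the original post)
PdfReader reader = new PdfReader(path + "");  // source PDF (file name left blank in the original post)
// Output PDF; writing to a different file than the one being read avoids file-lock errors.
PdfStamper stamper = new PdfStamper(reader, new FileStream(path + "", FileMode.Create));
var info = reader.Info;  // existing document metadata
info["Keywords"] = pdfText;  // pdfText is the variable that holds the text extracted in step 1
stamper.MoreInfo = info;  // attach the updated metadata to the output
stamper.FormFlattening = true;
stamper.Close();  // writes the output file
reader.Close();
insertedWordCount = info["Keywords"].Length;  // Length is the number of characters stored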
You will also need to import the namespaces iTextSharp.text.pdf and iTextSharp.text.xml.xmp, plus System.IO for FileStream.
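One detail about the last line of the code: `Length` on a string returns the number of characters, so `insertedWordCount` ends up holding a character count rather than a word count. If a word count is what you want, here is a minimal sketch that splits the extracted text on whitespace. The `KeywordStats` class and `CountWords` method are hypothetical names used only for illustration; they are not part of UiPath or iTextSharp.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class KeywordStats
{
    // Counts whitespace-separated words in the extracted OCR text.
    public static int CountWords(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
        {
            return 0;
        }
        // A null separator array makes Split treat any whitespace as a delimiter;
        // RemoveEmptyEntries drops the gaps left by repeated spaces or newlines.
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}
```

Inside an Invoke Code activity you cannot declare a class, so there you would use just the body, for example assigning `pdfText.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length` to `insertedWordCount`.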