Extracting data from a PDF using OCR and storing it in a PDF in UiPath

Hello friends, please help me. I'm trying to run OCR (Microsoft) on a PDF, but I can't export the result as a searchable PDF. I don't want to extract the data; I want the same PDF to become searchable.

Read PDF files - Example.zip (132.4 KB)

Thanks.
Enmanuel

Hi, welcome to the community!
If you want to reconstruct the whole document in text format, that will not be possible…

you want this:

Hi @Enmanuel_D_Talla_Neg,

Welcome to the UiPath community! :smile:

The best tool we have for extracting data from PDF files is IntelligentOCR.

Take a look at it; it uses AI, machine learning and RPA to get information from PDFs.

And is there no other way to turn the OCR'd PDF into a searchable PDF?

Hi,
Welcome to the UiPath community.

If we have an Adobe license, these steps should help you resolve this:
—use the START PROCESS activity and pass the file path of the PDF file as input to the Filename property
—this will open the file in the foreground
—then use the GET ACTIVE WINDOW activity and, inside it, use the SEND HOT KEY activity with just the key and no element chosen for the selector
—first use Send Hot Key with alt as the key
—then another with f as the key
—then one more with t as the key (the shortcut for Export, but I'm not sure; kindly check this one on your own)
—then use a hot key with w as the key, followed by another hot key that chooses Word document
—then use the TYPE INTO activity with the input file path, and Send Hot Key with enter as the key

Or, if we don't have an Adobe license:

Did you try this component?
https://go.uipath.com/component/pdf-conversion-to-microsoft-word

Cheers @Enmanuel_D_Talla_Neg

Without using UiPath? … just Adobe

Does CTRL + F work?

Hello, I do not have an Adobe license, so I will most likely use what you are recommending. How could I implement it? Convert the OCR'd PDF to Microsoft Word, then Microsoft Word to PDF?

For PDF to Word:
—use the START PROCESS activity and pass the file path of the PDF file as input to the Filename property
—this will open the file in the foreground
—then use the GET ACTIVE WINDOW activity and, inside it, use the SEND HOT KEY activity with just the key and no element chosen for the selector
—first use Send Hot Key with alt as the key
—then another with f as the key
—then one more with t as the key (the shortcut for Export, but I'm not sure; check this one on your own)
—then use a hot key with w as the key, followed by another hot key that chooses Word document
—then use the TYPE INTO activity with the input file path, and Send Hot Key with enter as the key

And for Word to PDF:
In Studio → Design tab → Manage Packages → Official tab → search for UiPath.Word.Activities
There we have the Export to PDF activity.

https://docs.uipath.com/activities/docs/word-export-to-pdf

Cheers @Enmanuel_D_Talla_Neg

Read PDF files - Example.zip (132.4 KB)

Hi, could you help me with some code to get a searchable PDF after running the OCR? I have only been using UiPath for a couple of days. I have attached the project, in case you could help me complete it. Thank you.

Hi,

Did you find a solution?

How do you make a PDF searchable?

Did you try reading the PDF with OCR and updating its metadata?
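For context, the metadata in question is the PDF's document information dictionary (Title, Author, Keywords and so on). The following is only a minimal sketch for inspecting it, assuming iTextSharp 5.x is referenced; the class name `PdfInfoDump` and the default file name `scanned.pdf` are hypothetical placeholders, not something from this thread.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;

// Hypothetical helper: prints the document information entries of a PDF.
public static class PdfInfoDump
{
    public static void Main(string[] args)
    {
        // "scanned.pdf" is a placeholder; pass the real path as the first argument.
        string inputPath = args.Length > 0 ? args[0] : "scanned.pdf";

        PdfReader reader = new PdfReader(inputPath);
        try
        {
            // In iTextSharp 5.x, Info exposes the metadata as string key/value pairs.
            foreach (KeyValuePair<string, string> entry in reader.Info)
            {
                Console.WriteLine(entry.Key + ": " + entry.Value);
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Running this before and after a metadata update is one way to check whether a "Keywords" entry was actually written.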

Hi @Enmanuel_D_Talla_Neg,

To make your PDF searchable using UiPath, follow the steps below.

  1. Read the PDF with the Read PDF With OCR activity.
  2. Save the text extracted by this activity in a variable.
  3. Use the Invoke Code activity (with its Language property set to CSharp).
  4. Write the C# code below to place the text extracted from the scanned PDF into the PDF's "Keywords" section. Once done, the PDF becomes searchable by the keywords stored in its "Keywords" metadata field.

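// pdfText (input) and insertedWordCount (output) must be defined outside this snippet, e.g. as Invoke Code arguments.
string path = "";  // folder that contains the PDF
PdfReader reader = new PdfReader(path + "");  // path + input file name
// Write to a different file than the one being read.
PdfStamper stamper = new PdfStamper(reader, new FileStream(path + "", FileMode.Create));  // path + output file name
var info = reader.Info;
info["Keywords"] = pdfText;  // pdfText is the variable that holds the text extracted in step 1
stamper.MoreInfo = info;
stamper.FormFlattening = true;
stamper.Close();
reader.Close();
insertedWordCount = info["Keywords"].Length;  // length, in characters, of the stored text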

Also, you will need to import the namespaces iTextSharp.text.pdf and iTextSharp.text.xml.xmp (and System.IO for FileStream, if it is not already imported).
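One detail worth noting: `info["Keywords"].Length` is the number of characters in the stored string, not the number of words, despite the variable name. If a word count is what is wanted, a minimal sketch could look like the following; the names `KeywordStats` and `CountWords` are hypothetical and not part of iTextSharp or UiPath.

```csharp
using System;

// Hypothetical helper, not part of iTextSharp or UiPath.
public static class KeywordStats
{
    // Counts whitespace-separated words in the OCR text.
    public static int CountWords(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
        {
            return 0;
        }
        // A null separator array makes Split treat any whitespace as a delimiter.
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    public static void Main()
    {
        // Five words separated by spaces and a line break.
        Console.WriteLine(CountWords("Invoice  No 12345\nTotal 99.50")); // prints 5
    }
}
```

Inside Invoke Code, the same `Split` expression could presumably be applied to `pdfText` directly instead of calling a helper.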

Hope this helps.

Regards
Sonali

Hi

Do you have an example zip file please?

Regards
Davendra

Hi,

My apologies for reviving a thread from over a year ago.

I tried to run your code but ran into some errors. May I ask whether you created variables in UiPath before running this code?