Looping through PDF files to create a txt file per pdf with their names

Hello - I am new to this and I am hoping someone can kindly help me. I am trying to loop through a folder of PDF files to extract all the text. However, I am not sure how to extract the text and store it in one txt file per PDF file.

@Sthefanie_Urquilla - In the Output → Text property of the Read PDF activity, store the output in a String variable.

Then use the “Write Text File” activity to write that variable to a text file.

Hope this helps…
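
For readers who think in code rather than activities, the same idea (read each PDF into a string, then write that string to a txt file named after the PDF) can be sketched in Python. This is only an illustration, not the UiPath workflow itself: `write_text_per_pdf` and its `extract_text` parameter are hypothetical names, and `extract_text` stands in for whatever reads the PDF (the Read PDF Text activity in UiPath).

```python
from pathlib import Path
from typing import Callable, List


def write_text_per_pdf(
    source_dir: Path,
    output_dir: Path,
    extract_text: Callable[[Path], str],
) -> List[Path]:
    """Write one .txt file per PDF in source_dir, named after the PDF."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # Note: this pattern only matches a lowercase ".pdf" extension on
    # case-sensitive file systems.
    for pdf in sorted(source_dir.glob("*.pdf")):
        # Assumption: extract_text is supplied by the caller and returns
        # the text of one PDF (the "store the output in a String" step).
        text = extract_text(pdf)
        # Same base name as the PDF, with a .txt extension.
        target = output_dir / (pdf.stem + ".txt")
        # The "Write Text File" step.
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written
```

The key point is that the loop variable (`pdf` here) is used both to read the file and to build the output name, so each PDF gets its own txt file.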

Thank you very much for your response. I created this one. It runs, but it does not produce the output txt file, and I do not get any error.

@Sthefanie_Urquilla - Your For Each item variable is named “pdf”, so you have to use that variable as the file name in the Read PDF activity.

Thank you for the quick response. I corrected that, but for some reason it still does not create the txt file.

Try printing the full path where you’re saving the txt file; it might be saved somewhere else.

Here’s a sample workflow:
pdf with ocr.zip (162.7 KB)
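
To illustrate the "print the full path" suggestion outside UiPath: a relative file name is resolved against the current working directory, which is often not the folder you expect. A minimal Python sketch, where `resolve_output_path` is a hypothetical helper name:

```python
from pathlib import Path


def resolve_output_path(file_name: str) -> Path:
    """Return the absolute location a (possibly relative) file name points to."""
    # resolve() anchors a relative path to the current working directory.
    return Path(file_name).resolve()


if __name__ == "__main__":
    target = resolve_output_path("contract.txt")
    print("The txt file will be written to:", target)
```

Logging this path right before the write step shows immediately whether the file is landing in a different folder.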

Thank you for your help. Unfortunately, I keep having the same issue.

Did the sample workflow I sent not work either?
Would you be able to share yours?

Thank you for your help. This is my project.
Contracts.zip (3.1 MB)

Your pdfFilesource is using the getfolderout variable. It should use getfolder instead.

Essentially, pdfFilesource is trying to find pdfs in the wrong location, and since it returns zero files, your For Each loop doesn’t have anything to perform.
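
This kind of silent failure (an empty file list, so the loop body never runs and no error appears) is easier to catch if the file list is checked before looping. A rough Python sketch of that check, with `list_pdfs` as a hypothetical helper name:

```python
from pathlib import Path
from typing import List


def list_pdfs(folder: Path) -> List[Path]:
    """Return the PDFs in folder, failing loudly if there are none."""
    if not folder.is_dir():
        raise FileNotFoundError(f"Folder does not exist: {folder}")
    # Compare the extension case-insensitively so ".PDF" is also matched.
    pdfs = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    if not pdfs:
        # Without this check, a loop over an empty list does nothing
        # and reports no error.
        raise ValueError(f"No PDF files found in {folder}")
    return pdfs
```

In a workflow, the equivalent would be logging the number of files found before the For Each starts.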

Also, remove the quotes around the item variable, so that it is treated as a variable rather than as a literal string.

Change the TypeArgument of the For Each loop to String.

Finally, I noticed that you have digital PDFs. Therefore, you probably don’t need to do OCR. Just use the Read PDF activity.

This helped me so much! I was able to run it. Thank you so much.
