How can I read all PDFs from a folder into TXT files?

Hello!

I would like to OCR all PDFs in a folder and write them all to separate txt files.

Can you help me?

@fekete.petra

Welcome to the community

  1. Use the For Each File in Folder activity and give it the folder path.
  2. Inside the loop, use Read PDF With OCR and store the output in a String variable (CurrentFile.FullName gives the file path).
  3. Use the Write Text File activity to write the string variable to a file (use CurrentFile.FullName.Replace(".pdf", ".txt") as the output path).
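
For anyone who wants to see the logic outside Studio, the three steps above can be sketched in Python. This is only a rough illustration, not the UiPath workflow itself: `ocr_folder_to_txt` is a made-up name, and `ocr_pdf` is a hypothetical function you would supply yourself (it stands in for the Read PDF With OCR step and is assumed to take a PDF path and return the recognized text).

```python
from pathlib import Path
from typing import Callable, List


def ocr_folder_to_txt(folder: str, ocr_pdf: Callable[[Path], str]) -> List[Path]:
    """Run ocr_pdf on every PDF in folder and write the text next to it."""
    written = []
    for pdf_path in sorted(Path(folder).iterdir()):
        # Step 1: loop over the folder, keeping only PDF files.
        # The extension is compared case-insensitively so "SCAN.PDF" is included.
        if not pdf_path.is_file() or pdf_path.suffix.lower() != ".pdf":
            continue
        # Step 2: OCR the current file (ocr_pdf is a hypothetical callable
        # supplied by the caller, not a real library function).
        text = ocr_pdf(pdf_path)
        # Step 3: write the text to a file with the same name and a .txt extension.
        txt_path = pdf_path.with_suffix(".txt")
        txt_path.write_text(text, encoding="utf-8")
        written.append(txt_path)
    return written
```

Note that the sketch swaps the extension instead of doing a plain text replace of ".pdf". A plain replace is case-sensitive, so it would leave a name like "SCAN.PDF" unchanged.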

Hope this helps

Cheers

Hi @fekete.petra, check the workflow below.

SampleProcess.zip (31.0 KB)

Thank you!
I get the following error: “Cannot assign from type ‘System.String[]’ to type ‘System.String’ in Assign activity ‘Assign’.”

OCR_auto.zip (50.0 KB)

@fekete.petra

Did you happen to try the method above?

This error says that you are trying to assign an array of strings to a variable of type String. Please change the data type of the variable accordingly.

Cheers

I have fixed the issue. Check the updated workflow below

For the variable named Files, you declared the data type as String, but it should be Array Of Strings (String[]).

Since you are reading multiple files from the folder, the value assigned to Files is a list of file paths, not a single path, so the variable type has to be Array Of Strings.
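
To illustrate the same idea outside UiPath, here is a small Python sketch. The helper name `list_pdf_paths` is made up for this example. The point is that listing a folder gives back a collection of paths, so the variable holding it has to be a list (an array of strings in UiPath terms) that you then loop over. Python will not raise the same assignment error as Studio does, so treat this only as an analogy.

```python
from pathlib import Path
from typing import List


def list_pdf_paths(folder: str) -> List[str]:
    """Return the full path of every PDF in folder as a list of strings."""
    return sorted(
        str(p)
        for p in Path(folder).iterdir()
        # Keep regular files whose extension is .pdf, ignoring case.
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

The result is always a list, even when the folder contains a single PDF, so each path is handled inside a loop such as `for path in list_pdf_paths(folder): ...`.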

OCR_auto.zip (100.2 KB)