Folder - looping files

loop
ocr

#1

Hello,

I have the following problem: I would like to OCR all the PDFs in folder “A” using the Microsoft OCR engine, write each result to a text file in folder B, and then move the PDF files from folder A to folder C so that the ones I have already OCRed are kept apart from the rest.

Since it makes little difference whether each .pdf file is copied right after it is OCRed or all of them are copied at the end, I chose the second option, which seemed easier to me. I first tried copying without OCRing and it worked. I also ran the process on a single file and that went fine too, so I must have messed something up in the loop.

I started by assigning a new variable (as explained in a similar topic).

Then I used a ‘For Each’ loop and had the PDFs OCRed.

Finally, I decided to assign a new variable for the text file (since each file needs a different name). I wanted the .txt file to be named exactly like the .pdf one, so I created a new variable name = “Path”+item.ToString+".txt".

There I ran into a problem:

Could you please tell me what went wrong, and what to enter in the ‘File name’ and ‘Text must be quoted’ fields of ‘Write Text File’?

Thank you :slight_smile:


#2

Can you check the value of item? I think it will contain the entire path + file name + extension.

To get the file name you can use: `filename = Path.GetFileNameWithoutExtension(item)`

To get the path you can use: `Path = Path.GetDirectoryName(item)`

“Path”+filename.ToString+".txt" should go in the ‘File name’ field.

The Microsoft OCR output (a string variable) should go into ‘Text must be quoted’.
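
To make the naming logic concrete, here is a rough sketch of the whole loop. It is written in Python rather than as UiPath activities, purely to illustrate the idea, and it rests on a few assumptions: `ocr` is a hypothetical stand-in for whatever OCR step you use, and `build_txt_path` and `process_folder` are names made up for this example. The key point is that each loop item is a full path, so the file name has to be extracted before appending ".txt".

```python
import os
import shutil
import tempfile


def build_txt_path(pdf_path, out_dir):
    """Return out_dir/<pdf name without extension>.txt.

    Same idea as Path.GetFileNameWithoutExtension(item) in .NET:
    drop the directory and the extension, keep only the bare name.
    """
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    return os.path.join(out_dir, stem + ".txt")


def process_folder(src_dir, txt_dir, done_dir, ocr):
    """OCR every PDF in src_dir, write the text to txt_dir, move the PDF to done_dir.

    `ocr` is a hypothetical callable (pdf path -> text); it stands in for the OCR step.
    """
    written = []
    for name in sorted(os.listdir(src_dir)):
        if not name.lower().endswith(".pdf"):
            continue  # skip anything that is not a PDF
        pdf_path = os.path.join(src_dir, name)
        txt_path = build_txt_path(pdf_path, txt_dir)
        with open(txt_path, "w", encoding="utf-8") as fh:
            fh.write(ocr(pdf_path))
        # Move only after the text file has been written.
        shutil.move(pdf_path, os.path.join(done_dir, name))
        written.append(txt_path)
    return written
```

If the text files come out empty with a sketch like this, the loop and the file names are fine and the problem is in whatever `ocr` returns.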


#3

I tried that and, while the process runs, I see no OCRed documents in the given folder.
Do you know why that is?


#4

What do you mean by this? The output is empty?


#5

Yes, the .txt file is there, but when I open it there is nothing inside.


#6

Try it with a single file outside your code and see whether the result is the same.


#7

Same thing. There must be something wrong with the OCR engine settings…
That’s weird, because it worked when I tried it before.