How to extract text from PDF files placed in a folder


#1

Hi,

  1. I get the files using Directory.GetFiles() and tried to use Read PDF with OCR inside a “For Each” loop. Since the file names are dynamic (they come from the folder), how can I specify the file name in “Read PDF”?
  2. Is there an activity equivalent to Scrape Relative for fetching the value of a corresponding field? I couldn’t find a Scrape Relative option for extracting data from a PDF.

#2

Hi,
1. After getting the files from the folder with Directory.GetFiles(), pass the resulting array to a For Each loop, then pass item as the PDF file path in the Read PDF activity. The loop will then iterate through all the dynamic files.
2. Either open the PDF and scrape the specific field using the relative scrape option, or read the PDF and then do the following:
To extract a specific value, find its start index and end index in the text that was read, then pass those indexes to Substring to get the value. Note that in .NET, Substring takes a start index and a length, so the length is the end index minus the start index.
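
In case it helps, here is a rough sketch of the start index / end index idea. It is written in Python only to show the logic; it is not UiPath code, and the function name extract_between, the sample text, and the labels "Invoice No:" and "Total:" are made up for illustration. In a workflow, the text would be whatever Read PDF returned, and the same steps would be done with IndexOf and Substring.

```python
def extract_between(text, start_marker, end_marker):
    """Return the text between start_marker and end_marker, or None if either is missing."""
    # Position of the label that precedes the value
    start = text.find(start_marker)
    if start == -1:
        return None
    # The value starts right after the label
    start += len(start_marker)
    # Look for the end marker only after the start position
    end = text.find(end_marker, start)
    if end == -1:
        return None
    # Slice from start index to end index and trim surrounding spaces
    return text[start:end].strip()


# Made-up sample standing in for the text read from one PDF
sample = "Invoice No: INV-1001\nDate: 2020-01-15\nTotal: 250.00\n"

invoice_no = extract_between(sample, "Invoice No:", "\n")
total = extract_between(sample, "Total:", "\n")
print(invoice_no, total)
```

Returning None when a marker is missing avoids an out-of-range error on files where the field is absent, which is likely to happen when looping over many PDFs.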

Here is the solution file.