I am trying to find which pages a certain string of text appears on across a group of PDFs. So far my process is:
- Get PDF files from directory
- Loop through each PDF with For Each and use Read PDF Text to write the PDF text to a variable
- Check whether a given string or text format occurs in the PDF (I think this can be done via RegEx or other string functions)
- Identify which pages of the PDF match the text criteria
So far, I have no trouble reading the PDF to text, and I have found a method to get the total number of pages in a PDF, but I cannot figure out how to identify which pages the text occurs on.
Can anyone help me identify where in the PDF my text occurs? The PDFs have varying page counts, and the matching text can appear on different pages in each one.
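To illustrate the result I am after, here is a minimal Python sketch of the page-matching step. It assumes the text of each page is available separately (for example, by reading the PDF one page at a time instead of all at once), which is exactly the part I have not worked out. The function names `pages_matching` and `pages_by_file`, the sample file names, and the sample text are all made up for illustration.

```python
import re

def pages_matching(page_texts, pattern, flags=0):
    """Return the 1-based page numbers whose text matches the regex pattern.

    page_texts is assumed to be a list of strings, one per page, in page order.
    """
    regex = re.compile(pattern, flags)
    return [number for number, text in enumerate(page_texts, start=1)
            if regex.search(text)]

def pages_by_file(texts_by_file, pattern, flags=0):
    """Map each file name to the list of pages that match the pattern."""
    return {name: pages_matching(pages, pattern, flags)
            for name, pages in texts_by_file.items()}

# Made-up sample data standing in for per-page text read from two PDFs.
sample = {
    "a.pdf": ["Summary", "Invoice 1001 due", "Appendix"],
    "b.pdf": ["invoice 2002", "Notes", "Notes", "INVOICE 2003"],
}

result = pages_by_file(sample, r"invoice\s+\d+", re.IGNORECASE)
print(result)  # {'a.pdf': [2], 'b.pdf': [1, 4]}
```

Because each page is checked on its own, the differing page counts between PDFs should not matter; the open question is how to get the per-page text in the first place.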