How to split a PDF file based on text name

Hi @upendra_koneru,

This use case is tricky. The example depends on how well you craft the RegEx pattern. Since I do not have your PDF file, you will have to provide the anchors and make sure the pattern identifies the correct page boundaries, not just the keywords.

The idea:

  1. Read the PDF page by page with the Read PDF Text activity.
  2. Search the page text with the IsMatch activity.
  3. If a match is found (IsMatch returns a Boolean), add a datarow containing the search text and the starting page number.
  4. Increment the page number.
  5. Repeat from step 1 for the next page.
  6. When a subsequent match is found, update the previous datarow's ending page number.
  7. When the last page is read, update the last datarow's ending page number.
  8. Finally, use Extract PDF Page Range to extract the pages for each datarow.
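
For readers who prefer code to a workflow diagram, the page-range bookkeeping in the steps above can be sketched in Python. This is only an illustration, not the contents of the attached workflow: the function name `find_sections` is hypothetical, the page texts are assumed to have been read already (one string per page), and it assumes a section ends on the page just before the next match.

```python
import re

def find_sections(page_texts, pattern):
    """Return (matched_text, start_page, end_page) tuples with 1-based pages.

    page_texts: list of strings, one per PDF page (assumed already extracted).
    pattern: regular expression that marks the first page of a section.
    """
    regex = re.compile(pattern)
    sections = []  # each entry: [matched_text, start_page, end_page]
    for page_number, text in enumerate(page_texts, start=1):
        match = regex.search(text)
        if match:
            if sections:
                # Assumption: the previous section ends on the page before this match.
                sections[-1][2] = page_number - 1
            sections.append([match.group(0), page_number, None])
    if sections:
        # The last section runs to the final page of the document.
        sections[-1][2] = len(page_texts)
    return [tuple(section) for section in sections]

pages = ["Invoice 001 header", "line items", "Invoice 002 header", "line items", "totals"]
ranges = find_sections(pages, r"Invoice\s\d+")
```

Each resulting tuple corresponds to one datarow, and its start and end pages are what you would pass to Extract PDF Page Range. Pages before the first match are ignored in this sketch.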

Note: the Assign Regex Pattern activity replaces each space in the search text with \s so that the regular expression matches correctly. You will need to change it to suit the text you are searching for.
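
As a rough illustration of that replacement, here is a small Python sketch. The helper name `to_whitespace_pattern` is hypothetical, and it differs from the note in two ways that are my own choices: it uses `\s+` rather than `\s` so that repeated spaces or a line break between words still match, and it escapes the words so characters such as `.` are treated literally.

```python
import re

def to_whitespace_pattern(search_text):
    """Build a regex from plain search text, tolerant of any whitespace between words."""
    words = search_text.split()
    # re.escape keeps punctuation in the search text from acting as regex syntax.
    return r"\s+".join(re.escape(word) for word in words)

pattern = to_whitespace_pattern("Purchase Order")
found = re.search(pattern, "Purchase\nOrder 12345") is not None
```

PDF text extraction often inserts line breaks or extra spaces where the page shows a single space, which is why matching on whitespace rather than a literal space tends to be more reliable.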

The attached example includes a sample PDF that you can use to test the workflow and verify how it works…
PDFExtract.zip (102.5 KB)