Split PDF based on words

Hello everyone!

Has anyone here tried splitting a PDF based on words? For example, I have a 10-page PDF, and if the word “Sheet” appears on a page, let's say on pages 3 and 8, the PDF should be split into 3 PDFs (the 1st contains pages 1-2, the 2nd pages 3-7 and the 3rd pages 8-10). Can anyone give me a sample xaml?

@cldt - Yes, this is doable. I will look for a sample xaml and share it with you.

@cldt - Here you go…

In this sample workflow, I have a PDF with 22 pages. I am looking for the word “Future”, which appears on pages 2, 6 and 15, so the split ranges are 1-1, 2-5, 6-14 and 15-22. I build a string from the matched pages and pass it to the “PDF Splitter” activity.
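For anyone who wants to see the range logic outside the workflow, here is a minimal Python sketch of it. The function names `split_ranges` and `to_range_string` are made up for illustration, and the comma-separated `start-end` format is only an assumption about what a splitter might accept; check what the “PDF Splitter” activity actually expects.

```python
def split_ranges(match_pages, total_pages):
    """Return inclusive (start, end) page ranges, opening a new range at each matched page."""
    # Page 1 always opens the first range; the set drops duplicates and out-of-range pages.
    starts = sorted({1, *(p for p in match_pages if 1 <= p <= total_pages)})
    # Each range ends just before the next start; the last range ends at the final page.
    ends = [s - 1 for s in starts[1:]] + [total_pages]
    return list(zip(starts, ends))


def to_range_string(ranges):
    """Join ranges as 'start-end' pairs (hypothetical format, adjust to your splitter)."""
    return ",".join(f"{start}-{end}" for start, end in ranges)


# The 22-page example: the word is found on pages 2, 6 and 15.
print(to_range_string(split_ranges([2, 6, 15], 22)))  # 1-1,2-5,6-14,15-22
```

With the numbers from the original question (matches on pages 3 and 8 of a 10-page PDF), the same functions give 1-2, 3-7 and 8-10.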

[Image: output folder]

Workflow: Split_PDFs.zip (1.5 MB)

Note: I have used the BalaReva PDF activity package to split the pages. If your organization does not allow third-party packages, you can use the “Join PDF” activity instead.

Hope this helps…

If you want to do it with a loop:

  1. Read the PDF and save the text in a string variable.
  2. Split the string on newlines into an array.
  3. Loop through the array until an element contains the word you are looking for.
  4. When it is found, do whatever handling you need, e.g. store the matched words in another variable.
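The steps above can be sketched roughly like this in Python. It assumes the text has already been read page by page into a list of strings (step 1 is not shown), and `pages_containing` is a hypothetical helper name. The match is a plain case-sensitive substring check, so adapt it if you need whole-word or case-insensitive matching.

```python
def pages_containing(page_texts, keyword):
    """Return the 1-based numbers of the pages whose text contains the keyword."""
    matches = []
    for page_number, text in enumerate(page_texts, start=1):
        # Split the page text into lines; any() stops at the first line with the keyword.
        if any(keyword in line for line in text.splitlines()):
            # Handling step: here we simply remember the page number.
            matches.append(page_number)
    return matches


# Made-up sample text standing in for the text read from a 4-page PDF.
sample_pages = ["Cover\nIntro", "Sheet A\nfirst drawing", "notes", "Sheet B"]
print(pages_containing(sample_pages, "Sheet"))  # [2, 4]
```

The resulting page numbers are what you would then turn into split ranges.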

Thanks! This is perfect!
