Splitting a PDF document

I have a PDF document of, for example, 10 pages.
Every page contains a certain text string, and typically 2 pages share the same string.
The task at hand: split the PDF into several PDF documents, where the 2 pages with the same string need to be in the same file.
As an extra: the strings to look for are in an Excel sheet (one per row), and pages whose string is not in that Excel sheet don’t need to be saved as a separate document.
I found an example (Need to read a word in PDF file and if that word exists should remove that page and save the other pages - #7 by prasath17) and installed BalaReva.Pdf.Activities. But I don’t see the “For Each” loop activity to add (I have StudioX, not Studio).
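Outside of StudioX, the grouping described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the page texts are taken as already extracted from the PDF, the allowed strings as already read from the Excel column, and the function name `group_pages_by_key` is hypothetical.

```python
def group_pages_by_key(page_texts, allowed_keys):
    """Map each allowed key to the 1-based page numbers whose text contains it."""
    groups = {}
    for page_number, text in enumerate(page_texts, start=1):
        for key in allowed_keys:
            if key in text:
                groups.setdefault(key, []).append(page_number)
                break  # assumption: each page carries at most one key
    # Pages that contain none of the allowed keys are left out on purpose.
    return groups


# Made-up page texts standing in for the extracted PDF text.
pages = ["Order ABC123", "ABC123 details", "Order DEF456", "DEF456 details", "Order XYZ999"]
print(group_pages_by_key(pages, ["ABC123", "DEF456"]))
```

Each entry of the result would then become one output document; the page with XYZ999 is skipped because that string is not in the allowed list.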

Hi @sraar.jans-beken - Recently (a few weeks back) I helped a member with a similar request, where the text to look for in the PDF was, say, “invoice”. If it was found on pages 3, 5 and 7, then I split the PDF into 4 parts:

Pages 1-2, 3-4, 5-6 and 7-10. I built a string with this value (1-2, 3-4, 5-6, 7-10) and then finally passed it to the PDF splitter (BalaReva)…
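To make the range-building step concrete, here is a minimal Python sketch, assuming that every page where the text is found starts a new part and that page 1 always starts the first part. The function name `build_page_ranges` is made up for this illustration, and the exact range syntax the splitter activity expects is not confirmed here.

```python
def build_page_ranges(hit_pages, total_pages):
    """Return a range string such as '1-2, 3-4' where each hit page starts a new part."""
    # Every hit page starts a new part; page 1 always starts the first part.
    starts = sorted({1, *(p for p in hit_pages if 1 <= p <= total_pages)})
    # Each part ends right before the next start; the last part ends at the final page.
    ends = [s - 1 for s in starts[1:]] + [total_pages]
    return ", ".join(f"{s}-{e}" for s, e in zip(starts, ends))


print(build_page_ranges([3, 5, 7], 10))  # 1-2, 3-4, 5-6, 7-10
```

A single-page part comes out as, for example, 4-4 in this sketch; adjust that if the splitter wants a bare page number instead.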

But I am confused about your case. Could you please explain it briefly with an example and, if possible, share a screenshot of the Excel file?

I am not sure how to do this in StudioX, but we can try…

First of all, thanks for your answer.

For now, please ignore the Excel part of my question. I added an example PDF. As you will notice, pages 1 & 2 both have the same text string (ABC123). The same goes for pages 3 & 4 (DEF456), and so on.

Task at hand: go through the PDF pages and create a new PDF from the pages with the text string DEF456 (being pages 3 & 4).

When done, the original document can be deleted, and a 2-page document should be saved.

Example.pdf (57.5 KB)

@sraar.jans-beken - Please check this workflow: Split_PDF.zip (361.9 KB)

You can delete the files from the extracted and Merged folders and then try running the workflow. You will see the PDF pages with DEF456 split out first and then merged.
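For comparison, the same "find the pages with DEF456 and save them together" idea can be sketched in Python with the pypdf library. This is not the attached workflow, just a rough equivalent under assumptions: pypdf is installed, the PDF contains extractable text (not scanned images), and both function names are hypothetical.

```python
def matching_page_indexes(page_texts, target):
    """Return 0-based indexes of the pages whose text contains the target string."""
    return [i for i, text in enumerate(page_texts) if target in text]


def extract_pages_with_text(source_path, output_path, target):
    """Write every page of source_path that contains target into one new PDF."""
    # Imported here so the pure helper above can be used without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(source_path)
    # extract_text() can come back empty for scanned or image-only pages.
    page_texts = [page.extract_text() or "" for page in reader.pages]
    indexes = matching_page_indexes(page_texts, target)

    writer = PdfWriter()
    for index in indexes:
        writer.add_page(reader.pages[index])
    with open(output_path, "wb") as handle:
        writer.write(handle)
    return len(indexes)  # number of pages written
```

Calling `extract_pages_with_text("Example.pdf", "DEF456.pdf", "DEF456")` would be the intended use, where the output file name is only an example. The matching pages go straight into one writer here, so there is no separate merge step.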

Note: The only downside of this approach is the size. You will notice that the merged PDF is larger than the original PDF.

Hope this helps…