Split PDF on string matching

Hi All,

I need your help.

I have to split a PDF on a matching string. Let's say there is a sentence that is common to every PDF file I am processing. I want to split the PDF at the page where that sentence matches and create a new PDF that keeps only the pages up to that point.

The number of pages in the PDF is not fixed, so I cannot hard-code the page number.

Thank you

Hi Indrajit,

If you want to split the PDF text on a specific string,
then you can try

thank you

Thank you @Debakanta_Mahanta. Here I want to split the PDF itself based on string matching.

E.g., let's say I have a PDF with 10 pages and the string I am trying to match, ‘Abcd’, is on page 4. What I want to do is generate a new PDF keeping pages 1-4 and dropping the rest.

Hi @indrajit.shah ,

Using the highlighted activities below from the PDF Activities package, you should be able to achieve your requirement.

  1. Use the Get PDF Page Count activity.
  • Retrieve the page count of the PDF and store the output in an integer variable, say pageCount.
  2. Using a For Each activity, loop through each page of the PDF and extract its contents.
  • Using a Read PDF Text activity, extract the text data.
  • Using an If activity, check whether the page data contains the matching string.
  • If it does, save the page number to another variable, PageNo, and break out of the loop.
  • If it does not, set PageNo to -1.
  3. Next, outside the For Each loop, use an If activity to check that PageNo is not equal to -1, and then extract the pages of the PDF using the Extract PDF Page Range activity.

This should give you the PDF file up to the page containing the matching string.
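
The steps above are built from UiPath activities rather than code, but the control flow can be sketched in Python to make the logic easier to follow. This is only an illustration under stated assumptions: the page texts are plain strings standing in for what Read PDF Text would return per page, and `find_first_matching_page` and `build_page_range` are hypothetical helper names, not part of any UiPath package.

```python
from typing import Optional, Sequence


def find_first_matching_page(page_texts: Sequence[str], needle: str) -> int:
    """Return the 1-based number of the first page containing needle, or -1."""
    for page_no, text in enumerate(page_texts, start=1):
        if needle in text:
            return page_no  # equivalent to saving PageNo and breaking out
    return -1  # no page matched


def build_page_range(page_no: int) -> Optional[str]:
    """Return a range string such as "1-4", or None when nothing matched."""
    if page_no == -1:
        return None
    return f"1-{page_no}"


# Hypothetical 10-page document with the marker 'Abcd' on page 4.
pages = [f"text of page {i}" for i in range(1, 11)]
pages[3] += " Abcd"

page_no = find_first_matching_page(pages, "Abcd")
print(page_no, build_page_range(page_no))
```

One point the sketch makes explicit is that the range is assumed to start at page 1 and end at the matching page (for example "1-4"). If only the matching page number is passed as the range, the result would presumably contain just that single page.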

Let us know if you are facing any difficulties.

Hi @indrajit.shah

Did you get an exact solution for this issue?

Please let me know if you have a solution for this.

Hi @supermanPunch

Thanks for providing this helpful solution, but I am facing an issue while executing the flow: it extracts only a single page.

If you have a video of this flow, please share it with us.