Split PDF on string matching

Hi All,

I need your help.

Scenario:
I have to split a PDF on a matching string. Say there is a sentence that is common to every PDF file I am processing. I want to split the PDF at the page where that sentence matches and create a new PDF that keeps only the pages up to that point.

As the number of pages in the PDF is not fixed, I cannot hard-code the page number.

Thank you

Hi Indrajit,

If you want to split the PDF text on a specific string,
then you can try
Split(pdfText, splitString)

Thank you
Debakanta

Thank you @Debakanta_Mahanta. Here I want to split the PDF file itself based on string matching, not its text.

E.g., let's say I have a PDF with 10 pages and the string I am trying to match, ‘Abcd’, is on page 4. What I want to do is generate a new PDF keeping pages 1-4 and drop the rest.

Hi @indrajit.shah ,

Using the following activities from the PDF Activities package, we should be able to achieve your requirement.

  1. Use the Get PDF Page Count activity.
  • Retrieve the page count of the PDF and store the output in an integer variable, say pageCount.
  2. Using a For Each activity, loop through each page of the PDF and extract its contents.
  • Using a Read PDF Text activity, extract the text of the current page.
  • Using an If activity, check whether the page text contains the matching string.
  • If it does, save the page number to another variable, PageNo, and break out of the loop.
  • If it does not, set PageNo to -1.
  3. Next, outside the For Each loop, use an If activity to check that PageNo is not equal to -1, and then extract the pages of the PDF using the Extract PDF Page Range activity.

This should provide you with a PDF file containing the pages up to and including the one with the matching string.
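
For anyone who wants to see the logic of the steps above outside Studio, here is a minimal Python sketch. It is not UiPath code, only an illustration of the loop. It assumes the text of each page has already been extracted into a list of strings (the job Read PDF Text does inside the loop), and the function names find_first_matching_page and page_range_to_keep are hypothetical names made up for this example. The result is a page range string such as "1-4", the kind of value you would then hand to the page extraction step.

```python
from typing import Optional, Sequence


def find_first_matching_page(page_texts: Sequence[str], needle: str) -> int:
    """Return the 1-based number of the first page containing needle, or -1."""
    for page_no, text in enumerate(page_texts, start=1):
        if needle in text:
            # Same as saving PageNo and breaking out of the For Each loop.
            return page_no
    # No page matched: mirrors PageNo = -1 in the workflow.
    return -1


def page_range_to_keep(page_texts: Sequence[str], needle: str) -> Optional[str]:
    """Build a range like "1-4" covering page 1 up to the matching page.

    Returns None when the string is not found, so the caller can skip
    the extraction step (the final If check in the workflow).
    """
    page_no = find_first_matching_page(page_texts, needle)
    if page_no == -1:
        return None
    return f"1-{page_no}"


if __name__ == "__main__":
    # Assumed sample data: a 10-page document with 'Abcd' on page 4.
    pages = ["intro"] * 3 + ["some text Abcd more text"] + ["rest"] * 6
    print(page_range_to_keep(pages, "Abcd"))
```

With the sample data from the question (10 pages, match on page 4) this yields the range 1-4. If the string is not found, the function returns None and no extraction should happen.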

Let us know if you are facing any difficulties.