Splitting a pdf document

sraar.jans-beken · April 29, 2021, 4:56pm

I have a PDF document of, for example, 10 pages.
Every page contains a certain text string, and typically 2 pages share the same string.
The task at hand: split the PDF into several PDF documents, so that the 2 pages with the same string end up in the same file.
As an extra: the strings to look for are in an Excel sheet (one per row), and pages whose string is not in that Excel sheet don’t need to be saved as a separate document.
I found an example (Need to read a word in PDF file and if that word exists should remove that page and save the other pages - #7 by prasath17) and installed BalaReva.Pdf.Activities, but I don’t see the “For Each” loop activity to add (I have StudioX, not Studio).
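
To make the requirement concrete, here is a rough sketch of the grouping logic in Python, outside StudioX. It assumes the text of each page has already been extracted into a list of strings and that the Excel column has been read into a list; the names `group_pages_by_key`, `page_texts`, and `wanted_keys` are illustrative only and are not part of UiPath or BalaReva.Pdf.Activities.

```python
def group_pages_by_key(page_texts, wanted_keys):
    """Map each wanted key to the 1-based page numbers whose text contains it.

    page_texts:  list of strings, one per PDF page (assumed already extracted).
    wanted_keys: strings to look for, e.g. the values from the Excel sheet.
    Pages that contain none of the wanted keys are left out of the result.
    """
    groups = {}
    for page_number, text in enumerate(page_texts, start=1):
        for key in wanted_keys:
            if key in text:
                groups.setdefault(key, []).append(page_number)
                break  # assumption: each page carries at most one key
    return groups


if __name__ == "__main__":
    pages = ["Order ABC123", "ABC123 (cont.)", "Order DEF456", "DEF456 (cont.)", "Order XYZ999"]
    print(group_pages_by_key(pages, ["ABC123", "DEF456"]))
```

Each entry in the result would then become one output file, and pages such as the `XYZ999` one are skipped because their string is not in the list.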

prasath17 · April 29, 2021, 5:18pm

Hi @sraar.jans-beken - A few weeks back I helped a member with a similar request, where the text to look for in the PDF was, say, “invoice” … if it was found on pages 3, 5, 7, then I split the PDF into 4 parts.

Pages 1-2, 3-4, 5-6, 7-10, like this. I built a string with this value (1-2, 3-4, 5-6, 7-10) and then finally passed it to the PDF splitter (BalaReva) …
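
As an illustration only, the range string could be built along these lines in Python. This is a sketch under assumptions: `build_split_ranges` is a hypothetical helper, and the exact format the BalaReva splitter accepts is not confirmed here beyond the example value above.

```python
def build_split_ranges(hit_pages, total_pages):
    """Build a range string in which every hit page starts a new part.

    hit_pages:   1-based page numbers where the search text was found.
    total_pages: number of pages in the PDF.
    """
    if total_pages < 1:
        raise ValueError("total_pages must be at least 1")
    # Each hit page starts a new part; page 1 always starts the first part.
    starts = sorted({p for p in hit_pages if 1 <= p <= total_pages})
    if not starts or starts[0] != 1:
        starts.insert(0, 1)
    ranges = []
    for i, start in enumerate(starts):
        # A part ends just before the next start, or on the last page.
        end = starts[i + 1] - 1 if i + 1 < len(starts) else total_pages
        ranges.append(f"{start}-{end}")
    return ", ".join(ranges)


if __name__ == "__main__":
    print(build_split_ranges([3, 5, 7], 10))  # 1-2, 3-4, 5-6, 7-10
```

With hits on pages 3, 5, and 7 of a 10-page file, this reproduces the value from the example: `1-2, 3-4, 5-6, 7-10`.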

But I am confused about your case. Could you please explain it with an example and, if possible, share a screenshot of the Excel file?

I am not sure how to do this in StudioX… but we can try…

sraar.jans-beken · May 4, 2021, 6:42am

First of all thanks for your answer.

For now, please ignore the Excel part of my question. I added an example PDF. As you will notice, pages 1 & 2 both have the same text string (ABC123). The same goes for pages 3 & 4 (DEF456), and so on.

Task at hand: go through the PDF pages and create a new PDF from every page with the text string DEF456 (being pages 3 & 4).

When done, the original document can be deleted, and a 2-page document should be saved.

Example.pdf (57.5 KB)

prasath17 · May 4, 2021, 1:06pm

@sraar.jans-beken - Please check this workflow…Split_PDF.zip (361.9 KB)

You can delete the files from the Extracted and Merged folders and then try running the workflow. You will see the PDF pages with DEF456 split out first and then merged.

Note: The only downside of this approach is the size. If you look at the merged PDF, its size is greater than the original PDF’s size.

Hope this helps…

sraar.jans-beken · May 10, 2021, 4:35pm

The issue is that I don’t have a “For Each item” control in StudioX.

prasath17 · May 11, 2021, 11:32am

@sraar.jans-beken - Which version of UiPath are you using?

sraar.jans-beken · May 11, 2021, 1:41pm

Version:

StudioX 2020.10.4

Enterprise License

Windows Installer

prasath17 · May 11, 2021, 1:59pm

@sraar.jans-beken - See if you can update …

If an update is not possible, I will have to think about how to do this without For Each item…

I just tried downgrading the system activities to 20.10.4.

I could see the Repeat No of Items in the Common tab… please check.
