PDF automation for separating the usable pages

Iswarya_P1 · May 17, 2023, 7:14am

I have PDF documents with more than 100 pages. Each one starts with cover pages and ends with some unwanted pages, and I want to use only the pages in between. Can you please help me extract the usable pages from the PDF document? Please suggest the technical logic or the steps to follow to complete the task efficiently.

Thanks in advance.

LAKSHMI_NARAYANA_PEMMASAN · May 17, 2023, 7:20am

How do you manually identify which pages are required?

If you don't specify the basis on which you want to separate the pages from the total, it will be difficult to provide a logic.

It would be helpful if you provided a sample PDF with the expected output.

Regards

Anas-p-v · May 17, 2023, 7:22am

Get the PDF page count using the UiPath PDF activities. Then loop through the pages: read each page, use regex or string manipulation to check whether it is a valid page, and if it is, split that page out and save it to a folder. Once the iteration is complete, you will have the valid pages in a folder. Now read the folder (using Directory.GetFiles) and merge all the files using the PDF activities.

Iswarya_P1 · May 17, 2023, 7:23am

I have a key value for the required pages: "LIFE TECHNOLOGIES". This phrase is present at the top of every required page.

LAKSHMI_NARAYANA_PEMMASAN · May 17, 2023, 7:25am

Then follow the approach above, provided by @Anas-p-v

Regards

Anil_G · May 17, 2023, 9:29am

@Iswarya_P1

A small change to the approach mentioned by @Anas-p-v

Extract PDF Page Range supports multiple pages at the same time, so there is no need to extract each page and then join them again.

Instead, collect all the page numbers needed, concatenate them with commas, and pass the result to the Extract PDF Page Range activity

Hope this helps

Cheers

Iswarya_P1 · May 17, 2023, 9:31am

Can you give me step-by-step guidance on which activities to use to complete this?

Anil_G · May 17, 2023, 9:45am

@Iswarya_P1

Follow the steps

Initialize a variable str of type String with String.Empty

Get the PDF page count and store it in a variable
Use a For Each loop over Enumerable.Range(1,PageCount).ToArray and change the type argument to Int32
Inside the loop, use Read PDF Text and give the page number as currentitem
Add an If condition with outputofpdfread.Contains("TextTosearch")
On the Then side, assign str = If(str.Equals(string.Empty),"",",") + currentitem.ToString
After the loop, use Extract PDF Page Range and pass str as the range; it will extract all the required pages into one PDF
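For readers who want to check the logic outside UiPath, here is a minimal sketch of the same loop in Python. It is not the UiPath workflow itself: the function name `build_page_range` and the sample page texts are hypothetical, and the list of strings stands in for what Read PDF Text would return for each page. Note that the sketch carries the accumulated string forward each time it appends a page number.

```python
def build_page_range(page_texts, key):
    """Return a comma-separated list of 1-based page numbers whose text contains key."""
    page_range = ""  # corresponds to initializing str with String.Empty
    for page_number, text in enumerate(page_texts, start=1):
        if key in text:
            # Keep what was collected so far and add a comma, except on the first match
            page_range = (page_range + "," if page_range else "") + str(page_number)
    return page_range


# Hypothetical page texts standing in for the per-page Read PDF Text output
pages = [
    "Cover page",
    "LIFE TECHNOLOGIES\nInvoice 1",
    "LIFE TECHNOLOGIES\nInvoice 2",
    "Appendix",
]
print(build_page_range(pages, "LIFE TECHNOLOGIES"))  # prints 2,3
```

The resulting string ("2,3" here) is the kind of value that would be passed as the range to the extract step.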

Cheers

Iswarya_P1 · May 17, 2023, 12:02pm

I got an error in the Assign activity with str = If(str.Equals(string.Empty),"",",") + currentitem.ToString. Can you explain that line?

Anil_G · May 17, 2023, 12:10pm

@Iswarya_P1

This line appends the page numbers separated by commas. The first time, we should not add a comma, so check whether the string is empty: if it is, append only the page number; otherwise append a comma and the page number.

Small change: I forgot to include this

If(str.Equals(string.Empty),"",str + ",") + currentitem.ToString
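To see what the change does, here is a small Python sketch that mirrors both versions of the expression. The function names and the page numbers 3, 4, 5 are made up for illustration; only the two expressions in the comments come from this thread.

```python
def append_first_version(acc, page):
    # Mirrors If(str.Equals(string.Empty),"",",") + currentitem.ToString
    return ("" if acc == "" else ",") + str(page)


def append_corrected(acc, page):
    # Mirrors If(str.Equals(string.Empty),"",str + ",") + currentitem.ToString
    return ("" if acc == "" else acc + ",") + str(page)


first, corrected = "", ""
for page in [3, 4, 5]:  # hypothetical matching page numbers
    first = append_first_version(first, page)
    corrected = append_corrected(corrected, page)

print(first)      # prints ,5
print(corrected)  # prints 3,4,5
```

The first version overwrites the variable on every iteration because the old value is never part of the new one, so only the last page number survives. The corrected version keeps the existing value and appends to it.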

Cheers

Anil_G · May 17, 2023, 1:15pm

@Iswarya_P1

Please check this. Your workflow is working perfectly; the only step you missed was initializing str

Happy Automation

BlankProcess30 (2).zip (1.5 MB)

cheers

Iswarya_P1 · May 17, 2023, 4:41pm

Thank you for your support. @Anil_G

Anil_G · May 17, 2023, 4:54pm

@Iswarya_P1

Happy Automation

Cheers

Iswarya_P1 · May 18, 2023, 6:18am

For example, assume the PDF document has 100 pages and the last 30 pages are struck out (the total number of struck-out pages may vary for each document). In this case, how can I identify the struck-out pages in the PDF document and get the remaining pages for PDF automation? @Anil_G

Anil_G · May 18, 2023, 6:23am

@Iswarya_P1

Please open a separate topic for this, as it is not related to the current one.

This helps keep one issue per topic

You can close this if the current issue is resolved

It would be a good idea to attach a sample in the new thread you create

cheers

system · May 21, 2023, 6:23am

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
