I have PDF documents with more than 100 pages. Each one starts with cover pages and ends with some unwanted pages, and I want to use only the pages in between. Can you please help me extract the usable pages from the PDF document? Please suggest the technical logic or the steps to follow to complete the task efficiently.
Thanks in advance.
How do you manually identify which pages are required?
If you don’t specify the basis on which you want to separate the pages from the total, it will be difficult to suggest a logic.
It would be helpful if you provided a sample PDF with the expected output.
Get the PDF page count using the UiPath PDF activities. Then loop through the pages: read each page, use regex or string manipulation to check whether it is a valid page, and if it is, split that page out and save it to a folder. Once the iteration is complete, the folder contains all the valid pages. Now read the folder (using Directory.GetFiles) and merge all the files using the PDF activities.
I have a key value for the required pages: “LIFE TECHNOLOGIES”. This text is present at the top of every required page.
Then follow the approach above provided by @Anas-p-v.
A small change to the approach mentioned by @Anas-p-v:
The extract PDF activity supports multiple pages at the same time, so there is no need to extract each page and then join them again.
Instead, collect all the required page numbers, concatenate them with commas, and pass the result to the extract PDF activity.
Hope this helps
Can you give me step-by-step guidance on which activities to use to complete this?
Follow these steps:
- Initialize a variable str of type String with String.Empty
- Get the PDF page count and store it in a variable (PageCount)
- Use a For Each loop over
Enumerable.Range(1,PageCount).ToArray and change the type argument to Integer
- Inside the loop, use Read PDF Text and give currentitem as the page number
- Add an If condition with outputofpdfread.Contains("TextTosearch")
- On the Then side, use
str = If(str.Equals(string.Empty),"",",") + currentitem.ToString
- After the loop, use Extract PDF Page Range and pass str as the range; it extracts all the required pages into one PDF
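The selection logic in the steps above can be sketched outside UiPath. This is only an illustration in Python, not the workflow itself: `page_texts` is a hypothetical stand-in for what Read PDF Text returns for each page, and `build_page_range` is a name invented for this example. It collects the matching page numbers in a list and joins them at the end, rather than appending to a string inside the loop.

```python
def build_page_range(page_texts, key):
    """Return a comma-separated string of 1-based page numbers whose text contains key."""
    selected = []
    # Page numbers start at 1, like Enumerable.Range(1, PageCount).
    for page_number, text in enumerate(page_texts, start=1):
        if key in text:  # same role as outputofpdfread.Contains(...)
            selected.append(str(page_number))
    # Joining at the end avoids any special handling for the first item.
    return ",".join(selected)


# Hypothetical page texts: one cover page, two required pages, one trailing page.
pages = [
    "Cover page",
    "LIFE TECHNOLOGIES\nPage content A",
    "LIFE TECHNOLOGIES\nPage content B",
    "Appendix",
]
print(build_page_range(pages, "LIFE TECHNOLOGIES"))  # prints 2,3
```

The resulting string plays the role of str in the last step, where it is passed as the page range.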
I got an error in the Assign activity for str = If(str.Equals(string.Empty),"",",") + currentitem.ToString. Can you explain that line?
This line appends page numbers separated by commas. The first time, we should not add a comma, so it checks whether the string is empty: if it is, it appends just the page number; otherwise it appends a comma and the page number.
A small change, which I forgot to include:
If(str.Equals(string.Empty),"",str + ",") + currentitem.ToString
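To see why the change matters, here is a small trace of both expressions written as Python functions. It is a sketch only, assuming str starts as an empty string; the function names are invented for this illustration and the page numbers are made up.

```python
def append_original(acc, page):
    # Mirrors: If(str.Equals(String.Empty), "", ",") + currentitem.ToString
    # The previous value of the string is never carried over.
    return ("" if acc == "" else ",") + str(page)


def append_corrected(acc, page):
    # Mirrors: If(str.Equals(String.Empty), "", str + ",") + currentitem.ToString
    # The previous value is kept and a comma is added before the new page.
    return ("" if acc == "" else acc + ",") + str(page)


def accumulate(step, page_numbers):
    """Apply one Assign-style step per matching page, starting from an empty string."""
    acc = ""
    for page in page_numbers:
        acc = step(acc, page)
    return acc


print(accumulate(append_original, [2, 3, 5]))   # prints ,5
print(accumulate(append_corrected, [2, 3, 5]))  # prints 2,3,5
```

With the first expression only the last page number survives, with a leading comma; the corrected one builds the full list.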
Please check this. The rest of your workflow is working perfectly; the only step you missed is initializing str.
BlankProcess30 (2).zip (1.5 MB)
Thank you for your support, @Anil_G.
For example, let’s assume that a PDF document has 100 pages and the last 30 pages are struck out (the total number of struck-out pages may vary for each document). In this case, how can I identify the struck-out pages in the PDF document and get the remaining pages for PDF automation? @Anil_G
Please open a separate topic for this, as it is not related to the current one.
This helps keep one issue per topic.
You can close this one if the current issue is resolved.
It would also be a good idea to attach a sample in the new thread you create.