BLOG: My name is Anders and I’m allergic to manual work

AndersJensen · November 4, 2020, 10:09am

When we face a UiPath problem that we can’t solve directly, our go-to approach should be to search either the UiPath Forum or Stack Overflow (VB/Python) for a solution. That reduces work, since we don’t need to “invent” anything new.

Today I needed to split a PDF by dynamic page range and, since I couldn’t find a solution, I had to create it myself (bummer).

Do it yourself:
Even if you don’t need to split PDFs, I can recommend working through the case if you want a basic understanding of loops and working with files.

Case:
We have a PDF which consists of 3 invoices. Problem: One or more of the invoices will be a 2-page invoice and we don’t know which. Sample PDF: InvoicesXYZ.pdf (59.2 KB). What we know is that the pages of our PDFs are numbered, meaning that if an invoice is a one-pager, we can see a “Page 1” on the page, and if it’s a two-pager, we will have a “Page 1” on the first page and a “Page 2” on the second page.

Solution Step by Step:

We create an outer For Each, where we look in our project folder for merged invoices. In our case there is only one: InvoicesXYZ.pdf. Hint: Use the .NET method Directory.GetFiles(strProjectPath).
Get PDF Page Count. In order to know when to stop, we find the total page count of our PDF and store it as an integer, intTotalPageNumber.
While loop. This loop will iterate through each page of our PDF. At the end of the loop we place an Assign that adds one to our index variable, intCurrentPage (which starts at 1). The condition of the loop is then to run as long as intCurrentPage is less than or equal to intTotalPageNumber.
Read PDF Text. We read the current page into a string variable (strTextInvoice). The range should therefore be set to intCurrentPage.ToString.
Matches. We use the Matches activity with a simple pattern (“Page 2”) on our string variable. The result is stored in an IEnumerable of Match, ienMatches (you can see this as a collection). What we did here was a Regex search for “Page 2” (Steven, are you watching?). The IEnumerable will therefore contain either 0 or 1 element: zero if we are on a “Page 1” page, one if we are on a “Page 2” page.
If. We simply ask whether ienMatches contains more than 0 objects. If no, we extract the page “normally”, meaning our Range will be intCurrentPage.ToString. We use the output path strProjectPath + "\Result" + Path.GetFileNameWithoutExtension(item) + intCurrentPage.ToString + ".pdf", giving the file a unique name. If yes, we know that this is the second page of a two-pager, so we have to append this page to the previously extracted one. We do this by setting the range to go from the previous page to the current page ("" + (intCurrentPage-1).ToString + "-" + (intCurrentPage).ToString) and overwriting the previously extracted PDF. Again we use the output syntax strProjectPath + "\Result" + Path.GetFileNameWithoutExtension(item) + (intCurrentPage-1).ToString + ".pdf". Note that we use the previous page number in the name, so that we overwrite the file extracted in the previous iteration.
Do your invoices run to more than 2 pages? Easy: Create a nested While loop and solve it the same way.
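
For readers who prefer to see the loop logic as code, here is a minimal Python sketch of the steps above. It is not the UiPath workflow itself: the function name plan_extractions and the list page_texts (a stand-in for what Read PDF Text would return per page) are made up for illustration, and the function only works out which page range would be written to which file name instead of touching a real PDF.

```python
import re


def plan_extractions(page_texts, base_name):
    """Return {output file name: page range} for one merged PDF.

    page_texts is a hypothetical list with the text of each page,
    standing in for the per-page output of Read PDF Text.
    """
    outputs = {}
    current_page = 1                # plays the role of intCurrentPage
    total_pages = len(page_texts)   # plays the role of intTotalPageNumber

    while current_page <= total_pages:
        text = page_texts[current_page - 1]
        matches = re.findall(r"Page 2", text)  # plays the role of ienMatches

        if len(matches) > 0:
            # Second page of a two-pager: reuse the previous page's file
            # name so the earlier single-page extract is overwritten.
            name = f"{base_name}{current_page - 1}.pdf"
            outputs[name] = f"{current_page - 1}-{current_page}"
        else:
            # A "Page 1" page: extract it on its own.
            name = f"{base_name}{current_page}.pdf"
            outputs[name] = str(current_page)

        current_page += 1           # the Assign at the end of the loop

    return outputs


# Made-up page texts: the second invoice is the two-pager.
pages = [
    "Invoice A ... Page 1",
    "Invoice B ... Page 1",
    "Invoice B ... Page 2",
    "Invoice C ... Page 1",
]
print(plan_extractions(pages, "InvoicesXYZ"))
# {'InvoicesXYZ1.pdf': '1', 'InvoicesXYZ2.pdf': '2-3', 'InvoicesXYZ4.pdf': '4'}
```

Writing into a dictionary keyed by file name mimics the overwrite: the entry for page 2 is first set to "2" and then replaced by "2-3" once page 3 turns out to be a “Page 2”.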

Screenshot and file:

Main.xaml (8.6 KB)

Now we can post the solution to an unsolved topic on the UiPath Forum, so others won’t have to redo the work: Split Pdf into multiple ones
