Hi forum,
I have a requirement: I have many PDF files and want to check whether each one has multiple pages. If it does, I want to split it into separate files.
How do I do this in UiPath? Please help…
Follow this solution. Just change the page threshold as per your requirement.
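The attached workflow is not reproduced in this thread, so the following is only a rough sketch of the splitting logic, assuming "page threshold" means the maximum number of pages per output file. In a workflow, the page count would typically come from the Get PDF Page Count activity and each range string would be passed to the Range property of Extract PDF Page Range. BuildPageRanges and pagesPerFile are illustrative names, not part of UiPath.

```vb
Imports System
Imports System.Collections.Generic

Module PageRangePlanner
    ' Returns range strings such as "1-2", "3-4", "5" for a document with
    ' pageCount pages, allowing at most pagesPerFile pages per output file.
    ' (Hypothetical helper for illustration only.)
    Function BuildPageRanges(pageCount As Integer, pagesPerFile As Integer) As List(Of String)
        If pagesPerFile < 1 Then Throw New ArgumentOutOfRangeException(NameOf(pagesPerFile))
        Dim ranges As New List(Of String)
        Dim firstPage As Integer = 1
        While firstPage <= pageCount
            Dim lastPage As Integer = Math.Min(firstPage + pagesPerFile - 1, pageCount)
            ' A single page is written as "5", a span as "3-4"
            ranges.Add(If(firstPage = lastPage, firstPage.ToString(), $"{firstPage}-{lastPage}"))
            firstPage = lastPage + 1
        End While
        Return ranges
    End Function

    Sub Main()
        ' Assumed input: a 5-page PDF split into files of at most 2 pages
        For Each pageRange As String In BuildPageRanges(5, 2)
            Console.WriteLine(pageRange)
        Next
    End Sub
End Module
```

With pagesPerFile set to 1, every page gets its own range, which matches the original request of one file per page.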
@lrtetala
How can I split the PDFs based on a specific keyword?
You can follow this:
Read the PDF File: Use the Read PDF Text or Read PDF with OCR activity to extract the text from the PDF document.
Identify the Keyword: Determine the keyword based on which you want to split the PDF.
Process the Text: Once you have the full text extracted, you can analyze it to find the occurrences of the keyword. You can use string manipulation functions or regular expressions to identify the positions of the keyword.
Create Separate PDFs: For each section that contains the keyword, use the Split PDF activity to create a new PDF document.
Save the New PDFs: Finally, output each split PDF to your desired directory.
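The "Process the Text" step mentions regular expressions as an option. A minimal sketch of that variant, assuming you want whole-word, case-insensitive matches; FindKeywordPositions and the sample text are made up for illustration:

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module KeywordRegexDemo
    ' Returns the start index of every whole-word, case-insensitive match of keyword.
    ' (Hypothetical helper for illustration only.)
    Function FindKeywordPositions(text As String, keyword As String) As List(Of Integer)
        Dim positions As New List(Of Integer)
        ' Regex.Escape keeps characters such as "." or "(" in the keyword literal.
        ' The \b anchors assume the keyword starts and ends with a letter or digit.
        Dim pattern As String = "\b" & Regex.Escape(keyword) & "\b"
        For Each m As Match In Regex.Matches(text, pattern, RegexOptions.IgnoreCase)
            positions.Add(m.Index)
        Next
        Return positions
    End Function

    Sub Main()
        Dim sample As String = "Invoice 001 details. INVOICE 002 details. Invoices total."
        ' "Invoices" is skipped because of the trailing word boundary
        Console.WriteLine(String.Join(", ", FindKeywordPositions(sample, "Invoice")))
    End Sub
End Module
```

Compared with a plain IndexOf search, this ignores case and skips longer words that merely contain the keyword.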
Here’s a step-by-step implementation example with some basic UiPath activities:
Read PDF Text Activity: Use the Read PDF Text activity to read the PDF file and output the text to a String variable (e.g., pdfText).
Find Keyword Positions: Use the IndexOf method to find occurrences of your specific keyword in pdfText.
Dim keyword As String = "YourKeyword"
Dim positions As New List(Of Integer)
Dim index As Integer = pdfText.IndexOf(keyword)
' Collect the start index of every occurrence of the keyword
While index <> -1
    positions.Add(index)
    index = pdfText.IndexOf(keyword, index + keyword.Length)
End While
Split the Text into Sections: Use the keyword positions to split pdfText into sections.
Create PDF Documents: Create a PDF for each section with the Write PDF activity or by creating a new PDF with the PDF Activities.
Save the PDFs: Use the Write PDF activity or appropriate methods to save each new PDF to a desired location.
Here’s an example of simple logic using UiPath activities. You may need to adapt it depending on your specific requirements:
' Assuming the PDF text has already been read into a String variable called pdfText
Dim keyword As String = "YourKeyword"
Dim sections As New List(Of String)
' GetKeywordPositions is a helper you need to write yourself (e.g., the IndexOf loop shown earlier)
Dim positions As List(Of Integer) = GetKeywordPositions(pdfText, keyword)
For i As Integer = 0 To positions.Count - 1
    Dim startPos As Integer = positions(i)
    ' "End" is a reserved word in VB.NET, so the variable is named endPos
    Dim endPos As Integer = If(i + 1 < positions.Count, positions(i + 1), pdfText.Length)
    sections.Add(pdfText.Substring(startPos, endPos - startPos))
Next
' Now create a PDF for each section
For Each section In sections
    ' Use the Write PDF activity here with each section
Next
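The snippet above calls GetKeywordPositions without defining it. Here is one possible self-contained version of the text-splitting part, as a sketch rather than a finished workflow: SplitByKeyword, the module name, and the sample text are assumptions for illustration, and the step that writes each section to a PDF is left out.

```vb
Imports System
Imports System.Collections.Generic

Module KeywordSectionSplitter
    ' Returns the start index of every non-overlapping occurrence of keyword in text.
    Function GetKeywordPositions(text As String, keyword As String) As List(Of Integer)
        Dim positions As New List(Of Integer)
        If String.IsNullOrEmpty(keyword) Then Return positions
        Dim index As Integer = text.IndexOf(keyword, StringComparison.Ordinal)
        While index <> -1
            positions.Add(index)
            index = text.IndexOf(keyword, index + keyword.Length, StringComparison.Ordinal)
        End While
        Return positions
    End Function

    ' Cuts text into sections that each start at a keyword occurrence.
    ' Any text before the first occurrence is not included in a section.
    Function SplitByKeyword(text As String, keyword As String) As List(Of String)
        Dim sections As New List(Of String)
        Dim positions As List(Of Integer) = GetKeywordPositions(text, keyword)
        For i As Integer = 0 To positions.Count - 1
            Dim startPos As Integer = positions(i)
            Dim endPos As Integer = If(i + 1 < positions.Count, positions(i + 1), text.Length)
            sections.Add(text.Substring(startPos, endPos - startPos))
        Next
        Return sections
    End Function

    Sub Main()
        ' Stand-in for the text returned by Read PDF Text
        Dim pdfText As String = "Invoice A: first. Invoice B: second."
        For Each section As String In SplitByKeyword(pdfText, "Invoice")
            Console.WriteLine("[" & section & "]")
        Next
    End Sub
End Module
```

With the sample text this yields two sections, one starting at each "Invoice". The search is case-sensitive because of StringComparison.Ordinal; switch to StringComparison.OrdinalIgnoreCase if the keyword's casing varies across documents.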