I need to split a PDF into multiple PDFs. It has no page numbers, so I need to split it based on the text. Can anyone help? I have extracted the PDF data and am trying to split it with a regex.

Hi @MitheshBolla

Some further information would be useful. Do you need help with creating the regex?

For your problem, I would first extract all the text, perform the necessary manipulation, use UiPath.Word.Activities to write a Word file for each separate PDF, and use UiPath.PDF.Activities to export the Word files to PDF.
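To illustrate the "necessary manipulation" step, here is a minimal Python sketch (not UiPath code) that splits already-extracted text into one chunk per output document with a regex. The marker `Invoice No:` and the function name `split_on_marker` are assumptions for illustration only; the real pattern depends on whatever text starts each section in your PDF.

```python
import re

def split_on_marker(text, marker_pattern):
    """Split text at every line that starts with marker_pattern."""
    # The zero-width lookahead splits *before* the marker, so each chunk keeps it.
    parts = re.split(rf"(?m)^(?={marker_pattern})", text)
    # Drop empty pieces, such as the empty string produced when the text starts with a marker.
    return [part.strip() for part in parts if part.strip()]

# Hypothetical extracted text: each document starts with "Invoice No:".
extracted = "Invoice No: 1\nline a\nInvoice No: 2\nline b\n"
chunks = split_on_marker(extracted, r"Invoice No:")
```

Each element of `chunks` would then be written to its own Word file and exported to PDF, as described above.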

Kind Regards

Hey @MitheshBolla

Even though the pages are not numbered, I assume your PDF has multiple pages, which is enough to work with.

Use the Read PDF Text activity.

Ref - Read PDF Text

Also enable the PreserveFormatting option.

Write the text back to a Word document as required and export it as a PDF.

Hope this helps.

Thanks
#nK

If there are 1000 pages, can I split it so that each page becomes its own PDF?

I am reading the PDF with OCR. If I read it as normal text, it does not work.

Hey @MitheshBolla

Yes, you can do that.

You can even do it dynamically:

Ref - Get PDF Page Count

Thanks
#nK

Hey @MitheshBolla

No problem. Read PDF With OCR also has a page range property where you can pass the required page numbers.

Thanks
#nK

The Read PDF Text activity only works with native PDFs. If your PDFs are not native, the text must be extracted using OCR, for example with Read PDF With OCR.
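The native-versus-OCR choice can be expressed as a simple fallback. The Python sketch below is only an illustration of that logic, not UiPath code: `read_native` and `read_ocr` are hypothetical callables standing in for Read PDF Text and Read PDF With OCR, and "no usable text" is assumed to mean an empty or whitespace-only result.

```python
def read_text(path, read_native, read_ocr):
    """Try native text extraction first; fall back to OCR if nothing usable comes back."""
    text = read_native(path)
    if text and text.strip():
        return text
    # A scanned (non-native) PDF typically yields no text, so use OCR instead.
    return read_ocr(path)

# Stand-in readers for demonstration: the native reader finds nothing.
result = read_text("scan.pdf", lambda p: "", lambda p: "text from OCR")
```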

My PDF contains 100 pages and I need to split it into 100 PDFs, i.e. one PDF per page. Which activity should I use?

Yes, you are right.

Hello @MitheshBolla,

I think you can do it in this way:

image

PS: Instead of using Log Message, write the text to a document and save it as .pdf.

Hope it helps!

This worked. With 100 pages, I am able to split it into individual pages.

Now I need to put every 2 pages into one PDF.
image
What input do I need to give here so that I get 50 PDFs, each containing 2 pages?

I tried this, but the extracted PDFs only contain pages 2, 4, 8.

What I need is:
pages 1 and 2 in the first PDF
pages 3 and 4 in the second PDF
pages 5 and 6 in the third PDF, and so on.

Try:

Range = 2

In PDF Range: (Range - 1).ToString + "-" + Range.ToString

Range = Range + 2

I think this will work because you start at page 2 and print the page before it together with page 2 (so you get pages 1 and 2). Then, when you add 2, you get Range = 4 and print 3-4, and so on.
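The same counter logic, written as a small Python sketch for checking the resulting range strings outside UiPath. The function name `page_ranges` is made up for illustration, and the `"start-end"` string format (with a bare page number for a single page) is an assumption based on the range expression above.

```python
def page_ranges(total_pages, pages_per_file=2):
    """Return range strings such as "1-2", "3-4" covering pages 1..total_pages."""
    ranges = []
    for start in range(1, total_pages + 1, pages_per_file):
        # The last file may hold fewer pages when total_pages is not a multiple.
        end = min(start + pages_per_file - 1, total_pages)
        ranges.append(str(start) if start == end else f"{start}-{end}")
    return ranges

ranges = page_ranges(100, 2)  # 50 range strings: "1-2", "3-4", ..., "99-100"
```

Changing `pages_per_file` is the only adjustment needed for other group sizes; with `pages_per_file=1` it yields one range per page.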

image
I am getting an error. I am using C#.

In VB it works:

It seems that the "-" is not written correctly. I do not use C#, so I am not sure how it should be written…

Maybe (Counter-1).ToString() instead of (Counter-1).ToString () ?

So, depending on the number of pages per PDF, I only need to change the range (in my case it is called Counter), and everything else stays the same, right?