Removing new page header in PDF


Hello Community!

I have uploaded two images from my process: one shows the PDF and the other shows the output after extracting the PDF.

The image above shows the header that appears on each new page of the PDF. I don't want that header in the extracted output. My Read PDF activity is placed inside a loop, so whenever the bot finds the header on a new page, it should skip it. How can I do this?

Hi @Priyesh_Shetty1 ,
The Read PDF activity reads everything on the page, headers included, into its string output variable.
After that, we can remove the headers and footers with string manipulation.
For example, if the header is “NSE…”, store it in a variable named strHeader, and suppose the Read PDF output is in a variable named str_input.
Then the expression in the Assign activity would be:
str_input = str_input.Replace(strHeader, "")
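
Here is a minimal sketch of the same idea, written in Python only for illustration (UiPath expressions themselves are VB.NET). The sample text and the header value "NSE Daily Report" are made up, because the real header is not shown in this thread. The function name remove_header is also just a placeholder. The point is that the header is passed as a variable holding the actual header text. Passing the quoted variable name "strHeader" would search for that literal word and change nothing.

```python
def remove_header(text: str, header: str) -> str:
    """Remove every exact occurrence of `header` from `text`."""
    return text.replace(header, "")


# Hypothetical extracted text: the header repeats at the top of each page.
page_text = "NSE Daily Report\nrow 1\nrow 2\nNSE Daily Report\nrow 3"

# Hypothetical header value, including the line break that follows it.
header = "NSE Daily Report\n"

cleaned = remove_header(page_text, header)
print(cleaned)

# Searching for the quoted variable name instead of its value matches nothing.
unchanged = remove_header(page_text, "strHeader")
print(unchanged == page_text)
```

An exact replace like this only works when the header string matches the extracted text character for character, including spaces and line breaks.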

regards,
LNV

@Nguyen_Van_Luong1

It is showing an error.

Can you share that file?

@Nguyen_Van_Luong1 I can't, due to company privacy.

PDF_txt = PDF_txt.Replace(strHeader, "")
Can you try this?

@Nguyen_Van_Luong1 I tried this. The error is gone, but the text has not been replaced.

@Nguyen_Van_Luong1
The activity executed, but it did not replace that text.

I think the header value you are passing is not correct, so please check it.
Another way is to convert the text from Read PDF into a data table,
write it to Excel and inspect it,
then remove all rows whose value matches the header.
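
The row-by-row idea above can be sketched like this, again in Python purely as an illustration and not as UiPath code. It assumes the header sits on a line of its own and that the mismatch comes from spacing or line-break differences, which is a guess because the real file was not shared. The names normalize and drop_header_lines and the sample text are hypothetical.

```python
import re


def normalize(line: str) -> str:
    """Collapse runs of whitespace and trim both ends of the line."""
    return re.sub(r"\s+", " ", line).strip()


def drop_header_lines(text: str, header: str) -> str:
    """Return `text` without any line that equals `header` after normalization."""
    target = normalize(header)
    kept = [line for line in text.splitlines() if normalize(line) != target]
    return "\n".join(kept)


# Hypothetical extracted text: the first header has extra spaces in it,
# so an exact Replace on "NSE Daily Report" would miss that occurrence.
extracted = "NSE  Daily Report \r\nrow 1\r\nrow 2\r\nNSE Daily Report\r\nrow 3"

result = drop_header_lines(extracted, "NSE Daily Report")
print(result)
```

Comparing line by line after normalizing whitespace is more forgiving than an exact replace, and it matches the "remove all rows that match the header" approach.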

Hi @Priyesh_Shetty1 ,
I tried this with a file that has a footer on each page.
I think the header in your file can be handled the same way.
My file:
User_Guide_-_Language_Translator.pdf (462.7 KB)
My code:
Sequence.xaml (11.8 KB)
My output before removing the footer:
notRemove.txt (2.6 KB)
My output after removing it:
Remove.txt (2.3 KB)
You can check it against your file.
Hope it helps.
Regards,
LNV