I need to find the page number where a keyword is located in a PDF. Do you know of any activity that gives me the total number of pages in a PDF file?

I have to process a list of PDF files and check whether each one contains more than one invoice. If so, I must extract, by page range, the pages that correspond to each invoice and save them as independent PDF files. I thought I would do it by identifying a keyword that is unique for each payment and checking the page number where it appears, so that I extract only the necessary page ranges in each case. My solution depends on finding an activity that tells me the total number of pages in each PDF, then searching for the keyword and printing the necessary page ranges. The problem is that I do not know whether there is an activity that gives me the total number of pages. Any recommendation?

Hi @Renenobal

Try using strvar = Directory.GetFiles("") to get the files, then loop over them:
For Each item In strvar

Inside the loop, use the Read PDF Text activity with the Range property set, store the result in an output variable, and then set a condition on it such as stroutput.ToString.Contains("keyword")
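In case it helps to see the logic outside of UiPath, here is a minimal sketch in Python of what that loop has to compute. It assumes the text of each page has already been read one page at a time (for example with the range set to a single page), and that the keyword appears on the first page of every invoice. The function names and the sample page texts are made up for illustration.

```python
from typing import List, Tuple


def pages_with_keyword(page_texts: List[str], keyword: str) -> List[int]:
    """Return the 1-based page numbers whose text contains the keyword."""
    return [n for n, text in enumerate(page_texts, start=1) if keyword in text]


def invoice_ranges(start_pages: List[int], total_pages: int) -> List[Tuple[int, int]]:
    """Turn invoice start pages into inclusive (first, last) page ranges."""
    ranges = []
    for i, first in enumerate(start_pages):
        # An invoice ends just before the next one starts, or on the last page.
        last = start_pages[i + 1] - 1 if i + 1 < len(start_pages) else total_pages
        ranges.append((first, last))
    return ranges


# Hypothetical page texts standing in for the output of the per-page reads.
pages = ["Invoice No 1 ...", "continued", "Invoice No 2 ...", "continued", "end"]
starts = pages_with_keyword(pages, "Invoice No")
print(starts)                              # [1, 3]
print(invoice_ranges(starts, len(pages)))  # [(1, 2), (3, 5)]
```

Each resulting pair could then be used as the page range to extract into its own PDF file.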

Thanks
Ashwin.S

Create an instance of StreamReader:
sr = New StreamReader(path.Trim)
matchesVar = Regex.Matches(sr.ReadToEnd(), "/Type\s*/Page[^s]")

matchesVar.Count.ToString

You can try something like this. It counts the /Type /Page objects in the file, which gives the page count.
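For anyone who wants to check the idea outside of a workflow, here is the same heuristic as a small Python sketch. It is only a sketch: the function names and the sample bytes are invented for illustration, and the approach relies on the page objects being stored uncompressed. PDFs that keep their objects in compressed object streams can report too few pages (or zero) this way.

```python
import re

# Matches "/Type /Page" but not "/Type /Pages" (the page-tree node),
# because the character after "Page" must not be "s".
PAGE_OBJECT = re.compile(rb"/Type\s*/Page[^s]")


def count_pages(pdf_bytes: bytes) -> int:
    """Count page objects in raw PDF bytes with a regex heuristic."""
    return len(PAGE_OBJECT.findall(pdf_bytes))


def count_pages_in_file(path: str) -> int:
    """Read a PDF file as bytes and apply the same heuristic."""
    with open(path, "rb") as handle:
        return count_pages(handle.read())


# Hand-written fragment imitating the relevant parts of a two-page PDF.
sample = (
    b"1 0 obj << /Type /Pages /Count 2 >>\n"
    b"2 0 obj << /Type /Page /Parent 1 0 R >>\n"
    b"3 0 obj << /Type/Page\n/Parent 1 0 R >>"
)
print(count_pages(sample))  # 2
```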

Thank you very much, I will try this solution. I think it could work perfectly. I will let you know how it goes.

I do not think I have the skills to do it this way. If you have an example I could try it, but I honestly do not understand the solution.

PdfPageCount.zip (422.1 KB)
Hope it will help

I have been trying, but I still cannot identify the page number where the keyword is located. Is there any activity that gives me the total number of pages in the document?

I’ve seen this error several times lately and I do not know why

Sorry, but I am not aware of it or how to deal with it … The workflow is running fine for me.

Get this package; it includes an activity that outputs the number of pages in a PDF file.

Thanks, it works perfectly. Now I just need to identify the page number that contains my keyword.

Unfortunately the pdf activity package is now out of date.

If you believe that there should be an activity for finding the total page count, please vote here: PDF Page Count Activity

Hello Tushar

The solution below searches for one pattern at a time, reading the PDF from start to end:
sr = New StreamReader(path.Trim)
matchesVar = Regex.Matches(sr.ReadToEnd(), "/Type\s*/Page[^s]")

Could you please help me code a dynamic solution, where the search criteria are an array of multiple elements and all of them are searched for in one go, while the document is read once from start to end?

My problem is explained in detail in the post below …

I would be really thankful if we could come up with a solution.
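One possible shape for this, sketched in Python rather than as a workflow: read the text of each page once and check every keyword against it, collecting the page numbers per keyword. This is only an illustration under assumptions. The page texts are presumed to be already extracted, and the function name, keywords, and sample pages are hypothetical.

```python
from typing import Dict, Iterable, List


def find_keywords(page_texts: Iterable[str], keywords: List[str]) -> Dict[str, List[int]]:
    """Map each keyword to the 1-based pages where it occurs, in one pass over the pages."""
    hits: Dict[str, List[int]] = {keyword: [] for keyword in keywords}
    for page_number, text in enumerate(page_texts, start=1):
        # Every keyword is checked while this page is in hand,
        # so the document itself is only read once.
        for keyword in keywords:
            if keyword in text:
                hits[keyword].append(page_number)
    return hits


# Hypothetical page texts and search terms.
pages = ["INV-001 total 10", "nothing here", "INV-002 and INV-003"]
terms = ["INV-001", "INV-002", "INV-003", "INV-004"]
print(find_keywords(pages, terms))
# {'INV-001': [1], 'INV-002': [3], 'INV-003': [3], 'INV-004': []}
```

Keywords that never appear keep an empty list, which makes it easy to report the ones that were not found.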