PDF SEGREGATION 1

Hello guys,

I have a PDF file that contains many pages (it is downloaded only once in the process).
Now, for each transaction, I have to fetch a particular page from that PDF file based on a number.

So how can I do it?

Is there any tool for this? (There is no need for ABBYY or OCR; I just need to match a number in that PDF file.)

@Gokul_Murali

Use Read PDF Text, and then use a regex to check the page number.

For example, if the text contains “Page 1 of 25”, you can use “Page \d+” as the regex to match it, then extract only the number from the match and check it.
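
As a rough sketch of that regex step, here it is in Python rather than in a UiPath workflow. It assumes the page text has already been extracted into a string; the function name extract_page_number is made up for illustration, and a capture group is added so the number comes out directly.

```python
import re

# Same idea as "Page \d+", with a capture group around the digits.
PAGE_PATTERN = re.compile(r"Page (\d+)")


def extract_page_number(page_text):
    """Return the page number found in the text, or None if there is no match."""
    match = PAGE_PATTERN.search(page_text)
    if match is None:
        return None
    return int(match.group(1))
```

Calling extract_page_number on text that contains "Page 1 of 25" gives the integer 1, which you can then compare with the page you are looking for.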

cheers

Hi @Gokul_Murali ,

If you want to read the PDF page by page, and its content is laid out consistently on each page, then you only need to specify which page is to be read in the Read PDF Text activity, using its Range property.

Let me give an example.

I have an Excel input file, and I read a policy number from it (a column in the input file).

I also have a bunch of PDF files downloaded.

Now I have to find that policy number in the downloaded PDF files (e.g. a 52-page PDF file).

The policy number is unique: if there are 52 PDF pages, there will be 52 unique policy numbers, one on each page.

So what kind of tool can we use?

@Gokul_Murali

Read each page of the PDF separately, and then use str.Contains("policy number") to check whether that page has it.
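
A minimal sketch of that page-by-page check, written in Python for illustration. It assumes the text of each page has already been read into a list (one string per page, in page order); the names find_page_with_policy and page_texts are hypothetical.

```python
def find_page_with_policy(page_texts, policy_number):
    """Return the 1-based number of the first page containing the policy number.

    page_texts: list of strings, one per PDF page, in page order.
    Returns None when no page contains the policy number.
    """
    for page_number, text in enumerate(page_texts, start=1):
        # Plain substring test, the equivalent of str.Contains(...)
        if policy_number in text:
            return page_number
    return None
```

One thing to watch for: a plain substring test also matches when the policy number is part of a longer number on another page, so a stricter match may be needed if the numbers can overlap.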

Cheers

Hi @Gokul_Murali ,

Could you let us know whether there is a marker that we could use to identify each page separately? For example, is a header or footer always present on every page, or is there a page number on each page?

Since you need the data of the page where the policy number is present, we can use a regex if the same pattern appears on each of the pages.
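
If such a pattern exists, one option is to scan the PDF once and build a lookup from policy number to page, so that each transaction is just a dictionary lookup. The sketch below is in Python and rests on assumptions: the label "Policy Number:" and the format of the number are made up and must be adjusted to the real document, and page_texts is a hypothetical list holding the extracted text of each page.

```python
import re

# Hypothetical pattern: the label "Policy Number:" followed by the number itself.
POLICY_PATTERN = re.compile(r"Policy\s+Number\s*:\s*(\w+)")


def build_policy_index(page_texts):
    """Map each policy number to the 1-based page it appears on."""
    index = {}
    for page_number, text in enumerate(page_texts, start=1):
        match = POLICY_PATTERN.search(text)
        if match:
            index[match.group(1)] = page_number
    return index
```

Because the PDF is downloaded only once, the index can be built once and reused for every row of the Excel input, for example with index.get(policy_number), which returns None when the policy number is not found.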