PDF SEGREGATION 1

Gokul_Murali · February 3, 2023, 12:25pm

Hello guys,

I have a PDF file that contains many pages (it is downloaded only once in the process).
For each transaction, I have to fetch a particular page from that PDF file based on a number.

How can I do this?

Is there a tool for it? (No need for ABBYY or OCR; I just need to match a number from the PDF file.)

Anil_G · February 3, 2023, 12:27pm

@Gokul_Murali

Use the Read PDF Text activity and then use a regex to check the page number.

For example, if the text contains "Page 1 of 25", you can use "Page \d+" as the regex to match it, then extract only the number from the match and check it.
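A minimal sketch of that regex step, written in Python only to illustrate the pattern (the thread itself is about UiPath activities). The function name `extract_page_number` and the sample text are made up for this example; the capture group is a small addition so the number comes out without a second extraction step.

```python
import re


def extract_page_number(text):
    """Return the number that follows 'Page ' in text, or None if there is no match."""
    # (\d+) captures only the digits, so no second extraction step is needed.
    match = re.search(r"Page (\d+)", text)
    return int(match.group(1)) if match else None


# Sample text is invented for illustration.
print(extract_page_number("Policy summary - Page 1 of 25"))  # prints 1
```

If the marker is written differently in the real file (for example "Pg." or lowercase "page"), the pattern would need to be adjusted.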

cheers

supermanPunch · February 3, 2023, 12:38pm

Hi @Gokul_Murali ,

If you want to read the PDF page by page, and the content is laid out consistently on each page, you only need to specify which page to read by setting the Range property of the Read PDF Text activity.

Gokul_Murali · February 3, 2023, 12:47pm

Let's say, for example,

I have an Excel input file and I read a policy number from it (a column in the input file).

I also have a bunch of downloaded PDF files.

Now I have to find that policy number in the downloaded PDF files (e.g. a 52-page PDF file).

The policy number is unique: if the PDF has 52 pages, there will be 52 unique policy numbers, one on each page.

So what kind of tool can we use?

Anil_G · February 3, 2023, 1:04pm

@Gokul_Murali

Read each PDF page separately and then use str.Contains(policyNumber), where str is the text of that page and policyNumber is the value read from Excel.
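The page-by-page check can be sketched as follows, again in Python for illustration only. It assumes the text of each page has already been read into a list (one string per page, in page order); `find_page_with_policy` and the sample page texts are hypothetical.

```python
def find_page_with_policy(page_texts, policy_number):
    """Return the 1-based number of the first page containing policy_number, or None."""
    for page_no, text in enumerate(page_texts, start=1):
        # Plain substring test, the same idea as str.Contains.
        if policy_number in text:
            return page_no
    return None


# Invented sample data: one string per page.
pages = [
    "Policy No: 100234 Holder: A",
    "Policy No: 100235 Holder: B",
    "Policy No: 100236 Holder: C",
]
print(find_page_with_policy(pages, "100235"))  # prints 2
```

A substring test also matches when the policy number is part of a longer number on another page, so a stricter match may be needed if the numbers vary in length.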

Cheers

supermanPunch · February 3, 2023, 1:54pm

Hi @Gokul_Murali ,

Could you let us know whether there is a marker we could use to identify each page separately? For example, is a header or footer always present on every page, or is there a page number on each page?

Since you need the data from the page where the policy number is present, we can use a regex if the pages follow a similar pattern.
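If every page does follow the same pattern, one option is to build a lookup from policy number to page in a single pass and then answer each transaction from that lookup. The sketch below is in Python for illustration; the label "Policy No" in the pattern is purely an assumption, since the thread does not show how the number appears on the page, and `build_policy_index` is a hypothetical helper.

```python
import re

# Assumed label format; replace with whatever actually precedes the number in the PDF.
POLICY_PATTERN = re.compile(r"Policy No[.:]?\s*(\d+)")


def build_policy_index(page_texts):
    """Map each policy number found to its 1-based page number."""
    index = {}
    for page_no, text in enumerate(page_texts, start=1):
        match = POLICY_PATTERN.search(text)
        if match:
            index[match.group(1)] = page_no
    return index


# Invented sample data: one string per page.
sample_pages = ["Policy No: 100234\nHolder: A", "Policy No. 100235\nHolder: B"]
policy_index = build_policy_index(sample_pages)
print(policy_index.get("100235"))  # prints 2
```

Because the file is downloaded only once, the index can be built once and reused for every row of the Excel input.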
