How to get only numbers from PDF file?

chinmay_dhabal · March 7, 2017, 8:57am

Suppose there is a PDF containing:

Name: xyz
Phone number: 9663743

I want to get “9663743”, i.e. only the numbers.

So the program should be able to extract all the numbers present in the PDF.

I tried the “Number Only” option from the dropdown menu of Google OCR, but that gives a different result.

palindrome · March 7, 2017, 10:21am

-Read the content of the PDF into a string using the “Read PDF Text” activity.
-Use the “Matches” activity with the PDF string as input and “\d{length}” as the pattern.
Replace length with the number of digits you want to extract, i.e. for 9663743 the regex is “\d{7}”.
If the number can be 7 or more digits long, use the expression “\d{7,}”.
-The output will be an IEnumerable of Match. You can get each value from the IEnumerable using a For Each activity with the type set to Match.
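
For readers who want to try the pattern outside UiPath, here is a minimal Python sketch of the same idea. It is only a stand-in for the Matches activity, not the workflow itself: it assumes the PDF text has already been read into a string, and the names `extract_numbers` and `min_digits` are made up for illustration.

```python
import re

def extract_numbers(text, min_digits=1):
    """Return every run of at least `min_digits` consecutive digits in `text`."""
    # Builds a pattern such as \d{7,} when min_digits is 7.
    pattern = rf"\d{{{min_digits},}}"
    return re.findall(pattern, text)

# Sample text taken from the question above.
pdf_text = "Name: xyz\nPhone number: 9663743"

for number in extract_numbers(pdf_text, min_digits=7):
    print(number)
```

Leaving `min_digits` at 1 returns every digit run in the text, which is closer to "whatever number is present" than a fixed length.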

Madhuraj · March 7, 2017, 10:23am

@chinmay_dhabal

Read the entire PDF text and extract the number values from it.

Please find below 2 methods

Hope it helps…

Regards
Madhura

chinmay_dhabal · March 7, 2017, 11:54am

Thanks @palindrome @Madhuraj

SAMPLE.pdf (84.4 KB)
I want to get whatever numbers are present in the PDF, e.g. only the number “9663743” from this PDF.

Madhuraj · March 9, 2017, 11:08am

@chinmay_dhabal,

Refer to the screenshot below.

Regards
Madhura

Sam316 · November 9, 2017, 5:03pm

I get a permission error when I try this. Any suggestions why? Thanks.

ovi · November 9, 2017, 5:56pm

Hi Sam,

What is the error? “Permission missing: Launcher”?
If so, please refer to this post:

Sam316 · November 13, 2017, 5:20pm

That post was helpful, thank you.

harsha1123 · May 8, 2018, 8:09am

Hi everyone,

I have a question. For example, I have to run a cycle in a data processing tool. After every run, the tool appends the start time and finish time below the data from the previous run.
Now I want to read the latest finish time and write it into an Excel sheet. Can anyone help me with this? An example is below.

Start:
Start time:12:00:00PM
Finish time: 12:15:00PM

Restart1:
Start time:01:00:00PM
Finish time: 01:15:00PM

Restart2:
Start time:03:00:00PM
Finish time: 03:15:00PM
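
One possible approach, sketched in Python rather than as a UiPath workflow: match every “Finish time” entry with a regex and keep the last one. This assumes the tool's output is already available as a string in the format shown above and that the newest run is always appended at the bottom. The name `latest_finish_time` is hypothetical, and writing the value to Excel is left out.

```python
import re

# Captures the time that follows "Finish time:", e.g. 03:15:00PM.
FINISH_RE = re.compile(r"Finish time:\s*(\d{1,2}:\d{2}:\d{2}\s*[AP]M)", re.IGNORECASE)

def latest_finish_time(log_text):
    """Return the last "Finish time" value in log_text, or None if there is none."""
    matches = FINISH_RE.findall(log_text)
    return matches[-1] if matches else None

# Sample data copied from the post above.
log_text = """Start:
Start time:12:00:00PM
Finish time: 12:15:00PM

Restart1:
Start time:01:00:00PM
Finish time: 01:15:00PM

Restart2:
Start time:03:00:00PM
Finish time: 03:15:00PM
"""

print(latest_finish_time(log_text))
```

In UiPath the same pattern could be passed to the Matches activity, taking the last item of the result instead of looping over all of them.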
