How to get only numbers from PDF file?



Suppose there is a PDF containing:

Name: xyz
Phone number: 9663743

I want to get “9663743”, i.e. only the numbers.

So the program should be able to extract all the numbers present in the PDF.

I tried the “Number Only” option from the dropdown menu of Google OCR, but that gives a different result.

Extract number from pdf

-Get the content of the PDF into a string using the “Read PDF Text” activity.
-Use the “Matches” activity with the PDF string as the input and "\d{length}" as the pattern, replacing length with the number of digits you want to extract. For example, for 9663743 the regex is "\d{7}". If the number has 7 or more digits, use "\d{7,}".
-The output is an IEnumerable of Match. You can get each value from the IEnumerable using the “For Each” activity with the type argument set to Match.
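
For anyone who wants to try the pattern outside UiPath first, here is a minimal sketch of the same idea in Python. It is not the UiPath workflow itself: the function name extract_numbers and the min_digits parameter are made up for illustration, and the sample string simply stands in for whatever the “Read PDF Text” activity would return.

```python
import re


def extract_numbers(text, min_digits=1):
    """Return every run of at least `min_digits` consecutive digits in `text`.

    Hypothetical helper for illustration; it builds the same kind of
    pattern as "\\d{7,}" from the steps above.
    """
    pattern = re.compile(r"\d{%d,}" % min_digits)
    return pattern.findall(text)


# Stand-in for the text read from the PDF (assumed content).
sample = "Name: xyz\nPhone number: 9663743"

for number in extract_numbers(sample, min_digits=7):
    print(number)
```

With min_digits=1 the pattern becomes "\d{1,}", which picks up every number in the text regardless of its length.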



Read the entire PDF text and extract the number values from it.

Please find two methods below.

Hope it helps.



Thanks @palindrome @Madhuraj

SAMPLE.pdf (84.4 KB)
I want to get whatever numbers are present in the PDF, e.g. only the number “9663743” from this PDF.



Refer to the screenshot below.



I got a permission error when I tried this. Any suggestions why? Thanks.


Hi Sam,

What is the error? “Permission missing: Launcher”?
If so, please refer to this post:


That post was helpful, thank you.


Hi everyone,

I have a question. I have to run a cycle in a data processing tool, and after every run the tool appends the start time and finish time below the data from the previous run.
Now I want to read the latest finish time and write it into an Excel sheet. Can anyone help me with this? An example is below.

Start time:12:00:00PM
Finish time: 12:15:00PM

Start time:01:00:00PM
Finish time: 01:15:00PM

Start time:03:00:00PM
Finish time: 03:15:00PM
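
One possible way to approach this, sketched in Python rather than as a UiPath workflow: match every "Finish time" entry with a regex and keep the last one, on the assumption that the tool always appends the newest run at the bottom. The function name latest_finish_time and the exact time format are assumptions based on the example above, and writing the value to Excel is left out.

```python
import re

# Assumed format, taken from the example: "Finish time: 03:15:00PM".
FINISH_RE = re.compile(
    r"Finish time:\s*(\d{1,2}:\d{2}:\d{2}\s*[AP]M)", re.IGNORECASE
)


def latest_finish_time(log_text):
    """Return the last "Finish time" value in the text, or None if absent.

    Hypothetical helper; "latest" here means the last entry in the text,
    not the chronologically largest time.
    """
    matches = FINISH_RE.findall(log_text)
    return matches[-1] if matches else None


# Stand-in for the text read from the tool (copied from the example).
log_text = """Start time:12:00:00PM
Finish time: 12:15:00PM

Start time:01:00:00PM
Finish time: 01:15:00PM

Start time:03:00:00PM
Finish time: 03:15:00PM
"""

print(latest_finish_time(log_text))
```

In UiPath the same pattern could presumably be used with the “Matches” activity, taking the last item of the result.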