Regex to fetch only the policy number

Hi guys,

I need to fetch only the policy numbers from the PDF given below, i.e. 026703177603, 026703460301, 026800835002.
I know I have to use regex, but I am not able to get it working. Kindly help me achieve this.

Hi @Vinutha_L, can you read the PDF and provide the text output?

@vishal.kp
sure

image

This seems to be in image format; I am unable to copy the text.

Monthly Commission Summary

Name and address of Producer Page 1
For the month of OCTOBER
CREATIVE PLANNING PROPERTY & CASUALTY LLC Summary date 10/30/2020
5440 W 110TH ST Producer number 23-5777
OVERLAND PARK, KS 66211 Subproducer

The following reflects commissions for Personal Lines policies on which payments have been received or premiums returned in this
account month. The commission rate shown is based on coverage, less surcharges, taxes and/or countersigning, as applicable.

Commission
Policy No./ Comm. Due
Name of Insured Account No. Transaction Amount Rate Producer
026703177603 341190737364001P Installment 168.50 14.00% 23.59
JONATHAN WOOD
026703460301 348381948819001P New line 660.75 16.99% 112.32
SALVATORE CAMPO
026800835001 644906520678001P Renewal 489.25 13.99% 68.49
DON P KNOPKE
026800835002 644906520678001P Renewal 1,143.00 14.00% 160.02
DON P KNOPKE
026800835003 644906520678001P Renewal 117.75 13.99% 16.48
DON P KNOPKE
026800835005 644906520678001P Renewal 613.25 13.99% 85.85
DON P KNOPKE
026802447901 174508895805001P Installment 937.60 13.99% 131.25
LISLE PAYNE
026802447902 174508895805001P Installment 1,491.30 13.99% 208.78
LISLE PAYNE
026802447905 174508895805001P Installment 155.70 13.99% 21.79
LISLE PAYNE
026804171102 138125156117001P Installment 8,523.00 14.00% 1,193.22
DR. ARTHUR ALLEN II
026805078501 725094095209001P Installment 299.81 13.99% 41.97
TODD O’NEIL
026805078502 725094095209001P Installment 574.76 13.99% 80.44
TODD O’NEIL
026805078503 725094095209001P Installment 83.88 13.98% 11.73
TODD O’NEIL
026805078505 725094095209001P Installment 4.76 13.44% .64
TODD O’NEIL
026805339603 802961354051001P Cancellation 426.00- 14.00% 59.64-
SALVATORE CAMPO

@Vinutha_L You can try the regex pattern “\b\d{12}\b” if the policy number is always 12 digits.
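
For anyone who wants to check this pattern outside the workflow, here is a minimal Python sketch. The `sample` string is only a few lines copied from the PDF text posted above, and using Python's `re` module is an assumption on my part, since the thread does not say which regex engine is in use.

```python
import re

# A few lines copied from the PDF text posted above (illustrative subset).
sample = (
    "CREATIVE PLANNING PROPERTY & CASUALTY LLC Summary date 10/30/2020\n"
    "026703177603 341190737364001P Installment 168.50 14.00% 23.59\n"
    "JONATHAN WOOD\n"
    "026703460301 348381948819001P New line 660.75 16.99% 112.32\n"
)

# \b\d{12}\b: exactly 12 digits with a word boundary on both sides.
# The 15-digit account number is skipped because it is longer than 12
# digits and is followed directly by the letter P.
policy_numbers = re.findall(r"\b\d{12}\b", sample)
print(policy_numbers)  # ['026703177603', '026703460301']
```

The date, amounts and rates are skipped too, because none of them contains a run of exactly 12 digits.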

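@Vinutha_L
Give this a try:

^\d+\b
^ - anchors the match at the start of a line (in most regex engines this needs the multiline option; without it, ^ matches only at the start of the whole text)
\d+ - matches one or more digits
\b - ends the match at a word boundary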
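
To see how the ^\d+\b suggestion behaves on the posted text, here is a small Python sketch. It is an illustration under assumptions: the `sample` lines are copied from the PDF text above, and `re.MULTILINE` is Python's name for the multiline option (other engines have an equivalent setting).

```python
import re

# Lines copied from the PDF text above; the first one is an address line
# that also happens to start with digits.
sample = (
    "5440 W 110TH ST Producer number 23-5777\n"
    "026800835001 644906520678001P Renewal 489.25 13.99% 68.49\n"
    "DON P KNOPKE\n"
    "026800835002 644906520678001P Renewal 1,143.00 14.00% 160.02\n"
)

# Without the multiline flag, ^ matches only at the very start of the text.
without_flag = re.findall(r"^\d+\b", sample)
print(without_flag)  # ['5440']

# With the multiline flag, ^ matches at the start of every line.
with_flag = re.findall(r"^\d+\b", sample, flags=re.MULTILINE)
print(with_flag)  # ['5440', '026800835001', '026800835002']
```

Note that the street number 5440 is picked up as well, so this pattern on its own would still need a length check to isolate the policy numbers.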

Hi @vishal.kp

It is working, but only for 12 digits, with “\b\d{12}\b”. Thank you.

Hi @Vinutha_L, what is the exact requirement?
How many digits are to be fetched?

Hi @treesa.maria,

I need to fetch the policy number as explained above. The pattern given by @vishal.kp works well only if the number has exactly 12 digits. I need to fetch it even if it has more or fewer than 12 digits.

Hi @Vinutha_L, can you please try “\b\d+\b” if your requirement is to fetch numbers with any number of digits?

Hi @treesa.maria

Thank you! It is working, but it is also picking up other unwanted digits, such as the date, rate, amount, account, etc.
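
The extra matches are easy to reproduce. The Python sketch below is only an illustration: the two `sample` lines are copied from the PDF text, and Python's `re` module stands in for whatever regex engine the workflow actually uses.

```python
import re

# Two lines copied from the PDF text: one header line with a date,
# one policy row.
sample = (
    "CREATIVE PLANNING PROPERTY & CASUALTY LLC Summary date 10/30/2020\n"
    "026703177603 341190737364001P Installment 168.50 14.00% 23.59\n"
)

# \b\d+\b: any run of digits that has a word boundary on both sides.
# Punctuation such as / . and % creates word boundaries, so the date,
# the amount, the rate and the commission are split into separate matches.
matches = re.findall(r"\b\d+\b", sample)
print(matches)
# ['10', '30', '2020', '026703177603', '168', '50', '14', '00', '23', '59']
```

Because the pattern places no limit on the number of digits and no constraint on position, the policy number is just one match among many.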

Hi @Vinutha_L, can you please share a sample input? The data you have shared does not contain any date, and the pattern I shared should stop at the whitespace after the first consecutive run of digits.

Hi,

Name and address of Producer Page 1
For the month of OCTOBER
CREATIVE PLANNING PROPERTY & CASUALTY LLC Summary date 10/30/2020
5440 W 110TH ST Producer number 23-5777
OVERLAND PARK, KS 66211 Subproducer

The following reflects commissions for Personal Lines policies on which payments have been received or premiums returned in this
account month. The commission rate shown is based on coverage, less surcharges, taxes and/or countersigning, as applicable.

Commission
Policy No./ Comm. Due
Name of Insured Account No. Transaction Amount Rate Producer
026703177603 341190737364001P Installment 168.50 14.00% 23.59
JONATHAN WOOD
026703460301 348381948819001P New line 660.75 16.99% 112.32
SALVATORE CAMPO
026800835001 644906520678001P Renewal 489.25 13.99% 68.49
DON P KNOPKE
026800835002 644906520678001P Renewal 1,143.00 14.00% 160.02
DON P KNOPKE
026800835003 644906520678001P Renewal 117.75 13.99% 16.48
DON P KNOPKE
026800835005 644906520678001P Renewal 613.25 13.99% 85.85
DON P KNOPKE
026802447901 174508895805001P Installment 937.60 13.99% 131.25
LISLE PAYNE
026802447902 174508895805001P Installment 1,491.30 13.99% 208.78
LISLE PAYNE
026802447905 174508895805001P Installment 155.70 13.99% 21.79
LISLE PAYNE
026804171102 138125156117001P Installment 8,523.00 14.00% 1,193.22
DR. ARTHUR ALLEN II
026805078501 725094095209001P Installment 299.81 13.99% 41.97
TODD O’NEIL
026805078502 725094095209001P Installment 574.76 13.99% 80.44
TODD O’NEIL
026805078503 725094095209001P Installment 83.88 13.98% 11.73
TODD O’NEIL
026805078505 725094095209001P Installment 4.76 13.44% .64
TODD O’NEIL
026805339603 802961354051001P Cancellation 426.00- 14.00% 59.64-
SALVATORE CAMPO

continued on the next page

© Chubb. 2016 All rights reserved.

Reference Copy Monthly Commission Summary

Hi @Vinutha_L, do you need to take the policy number 026703177603 only from rows like this (026703177603 341190737364001P Installment 168.50 14.00% 23.59
JONATHAN WOOD), or from the entire PDF?

Hi @treesa.maria
I am reading the entire PDF.

Hi @Vinutha_L,

Check this link

Hi @Vinutha_L
Try the following, and adjust as necessary:

\r\n\d{5,14}\b

\r matches a carriage return (ASCII 13)
\n matches a line-feed (newline) character (ASCII 10)
\d matches a digit (equal to [0-9])
{5,14} Quantifier — Matches between 5 and 14 times, as many times as possible, giving back as needed (greedy)
\b assert position at a word boundary

or

\b\d{8,14}\b
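
Both patterns can be compared side by side with a short Python sketch. This is an illustration under assumptions: the `sample` lines are copied from the PDF text, the \r\n line endings are assumed (the first pattern depends on them), and the capturing group around the digits is my addition so that the line break is not returned as part of the match.

```python
import re

# Lines copied from the PDF text, joined with \r\n line endings
# (an assumption about how the PDF text is extracted).
sample = (
    "OVERLAND PARK, KS 66211 Subproducer\r\n"
    "Name of Insured Account No. Transaction Amount Rate Producer\r\n"
    "026703177603 341190737364001P Installment 168.50 14.00% 23.59\r\n"
    "JONATHAN WOOD\r\n"
    "026805339603 802961354051001P Cancellation 426.00- 14.00% 59.64-\r\n"
)

# Pattern 1: 5 to 14 digits directly after a line break. The parentheses
# make findall return only the digits, without the leading \r\n.
# The ZIP code 66211 is skipped because it is not at the start of a line.
line_start = re.findall(r"\r\n(\d{5,14})\b", sample)
print(line_start)  # ['026703177603', '026805339603']

# Pattern 2: any standalone run of 8 to 14 digits, wherever it appears.
# The ZIP code is too short, and the 15-digit account number ends in P.
by_length = re.findall(r"\b\d{8,14}\b", sample)
print(by_length)  # ['026703177603', '026805339603']
```

One caveat on the first pattern: it cannot match a number on the very first line of the text, since there is no line break in front of it, and it finds nothing if the extracted text uses plain \n line endings.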