Extract specific data from PDF using string manipulation or Regex

“Non-Negotiable Sea Waybill For Port to Port or Combined Transport (1) SHIPPER/ EXPORTER (4) DOCUMENT NUMBER (5) SEA WAYBILL NUMBER CUMMINS INC-DIVISION-JV KITTING US USOA122-00023113 DTW221352435 BC 604 500 JACKSON ST MC91661 COLUMBUS IN 47201 (6) REFERENCES NOS: UNITED STATES NRA1639078616”

I want to extract the Sea Waybill Number, which in this text is DTW221352435.

Hi @Ajij_Mujawar

How about this Regex expression?


Hi Gokul,
It works for one invoice but not for the other. I am sharing both invoices. Could you please take a look?
DTW221352435-pages-11.pdf (80.4 KB)
ABZ221273966-pages-4.pdf (83.0 KB)

Hi @Ajij_Mujawar ,

Maybe the Sea Waybill Number is restricted to this pattern: three letters at the start, followed by nine digits. If so, we could try the expression below:

(?<=SEA WAYBILL NUMBER.*)\b\w{3}\d{9}\b

If you know more about the pattern of the Sea Waybill Number, please share it so that we can suggest an appropriate regex.

Test the pattern with multiple inputs and let us know if it doesn't work for some of them.
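
For anyone who wants to try the same idea outside a .NET workflow, here is a minimal Python sketch. Two things differ from the expression above, and both are my own choices rather than part of the original suggestion: Python's `re` module does not accept a variable-length lookbehind such as `(?<=SEA WAYBILL NUMBER.*)`, so the sketch uses a capture group after the label instead, and it uses `[A-Z]{3}` rather than `\w{3}` because `\w` also matches digits and underscores. The function name and the `LABEL` constant are hypothetical.

```python
import re

# Assumed label text, copied from the sample document in this thread.
LABEL = "SEA WAYBILL NUMBER"

# After the label, lazily skip text until the first standalone token made of
# three uppercase letters followed by nine digits, and capture that token.
WAYBILL_AFTER_LABEL = re.compile(
    re.escape(LABEL) + r".*?\b([A-Z]{3}\d{9})\b",
    re.DOTALL,  # let the skipped text span line breaks in the extracted PDF text
)


def extract_sea_waybill(text):
    """Return the first matching token after the label, or None if absent."""
    match = WAYBILL_AFTER_LABEL.search(text)
    return match.group(1) if match else None


sample = (
    "(1) SHIPPER/ EXPORTER (4) DOCUMENT NUMBER (5) SEA WAYBILL NUMBER "
    "CUMMINS INC-DIVISION-JV KITTING US USOA122-00023113 DTW221352435 "
    "BC 604 500 JACKSON ST MC91661 COLUMBUS IN 47201 "
    "(6) REFERENCES NOS: UNITED STATES NRA1639078616"
)

print(extract_sea_waybill(sample))  # DTW221352435
```

The word boundaries keep the 13-character reference NRA1639078616 from being picked up as a partial match.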

Hi Arpan,
We are getting these patterns for the Air Waybill: CV2300023245, CKG230115757. The previous expression is not working for the first pattern.
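
One possible way to cover both shapes, sketched in Python. This rests on an assumption drawn only from the four numbers seen in this thread (DTW221352435, ABZ221273966, CV2300023245, CKG230115757): each is 12 characters long and starts with two or three uppercase letters followed only by digits. If real documents break that rule, the pattern needs adjusting. The function name is hypothetical, and the sketch does not anchor on a label, so in practice it would likely be combined with one.

```python
import re

# Assumption: a waybill number is a standalone 12-character token with
# 2 or 3 leading uppercase letters and digits for the rest.
# The lookahead enforces the total length; the body enforces the layout.
WAYBILL_SHAPE = re.compile(r"\b(?=[A-Z0-9]{12}\b)[A-Z]{2,3}\d+\b")


def find_waybill_candidates(text):
    """Return every token in text that fits the assumed waybill shape."""
    return WAYBILL_SHAPE.findall(text)


print(find_waybill_candidates("AWB CV2300023245 and CKG230115757"))
# ['CV2300023245', 'CKG230115757']
```

Fixing the total length at 12 is what rejects longer identifiers such as the 13-character NRA1639078616 from the first sample.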


Thanks Gokul
