Regex: two fields that match the same pattern

Hi Guys,

I have a PDF invoice from which I need to extract only the 10-digit numbers in the Air Waybill Number column.

I’m using the Matches activity for this, with the pattern “([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])\s”.

However, the Shipper Reference field also contains 10-digit numbers, so the robot tends to pick up those as well, which is not what I want.

What pattern should I use to ignore the numbers in the Shipper Reference field and take only the data from the Air Waybill Number column?

Any suggestions?

Hi @Serran_Neru,

Use this pattern:
Pattern -> ^([\d]+)



Hi @arivu96,

Thanks. Unfortunately it only takes the first number and not the following numbers in the Air Waybill Number column. Is there any other way? :frowning:
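
A likely reason only the first number comes back: by default, ^ anchors only at the very start of the whole text, not at the start of each line. The sketch below shows the difference in Python, where the switch is re.MULTILINE (the .NET counterpart is RegexOptions.Multiline). The sample text and its row layout are made up for illustration and assume the Air Waybill Number is the first thing on each row.

```python
import re

# Hypothetical extracted text; the real invoice layout is not shown in the thread.
SAMPLE_TEXT = (
    "1234567890 REF 9876543210 12.50\n"
    "2345678901 REF 8765432109 7.00\n"
    "3456789012 REF 7654321098 3.25\n"
)

# Without the multiline flag, ^ matches only at the start of the whole text,
# so at most one number can be found.
first_only = re.findall(r"^(\d+)", SAMPLE_TEXT)

# With the multiline flag, ^ also matches at the start of every line,
# so the leading 10-digit number of each row is found.
every_line = re.findall(r"^(\d{10})\b", SAMPLE_TEXT, flags=re.MULTILINE)

print(first_only)
print(every_line)
```

The 10-digit numbers after "REF" are never picked up because they do not sit at the start of a line.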

Hi @Serran_Neru,

Use Read PDF Text and store the result in a String variable.

Then split the string on Environment.NewLine, which gives you an array of lines.
Then use a For Each to loop through the array and apply the regex pattern to each line to get the number.
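
The steps above can be sketched roughly as follows, in Python rather than in a UiPath workflow. The function name, the pattern, and the sample rows are illustrative assumptions; in particular the sketch assumes the Air Waybill Number is the first value on each row, which may not hold for every invoice.

```python
import re

# Assumption: the Air Waybill Number is a 10-digit number at the start of a row.
AWB_AT_LINE_START = re.compile(r"^\s*(\d{10})\b")


def extract_awb_numbers(pdf_text):
    """Return the leading 10-digit number of every line that has one."""
    numbers = []
    # splitlines() handles both "\n" and "\r\n" line endings.
    for line in pdf_text.splitlines():
        match = AWB_AT_LINE_START.match(line)
        if match:
            numbers.append(match.group(1))
    return numbers


# Hypothetical text as it might come out of the PDF.
sample = (
    "Air Waybill Number Shipper Reference\r\n"
    "1234567890 9876543210\r\n"
    "2345678901 8765432109\r\n"
)
print(extract_awb_numbers(sample))
```

Because each line is tested on its own, the second 10-digit number on a row (the Shipper Reference in this made-up layout) is skipped.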



Hi Arivu,

This is getting interesting. Could you please let me know which data type to use for each line in the For Each activity?

@Serran_Neru use the String data type.

Great, thanks Arivu :slight_smile: It works.

I have a total of 3 different regex patterns. Is it possible to build a single regex that combines all three?
What I am currently doing is passing the PDF text through the first pattern; if the match count is 0, it goes through the second pattern, and so on.
Is it possible to pass the extracted PDF text through all 3 patterns at once?
Any help would be highly appreciated.
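
One common way to run several patterns in a single pass is alternation with |, giving each alternative a named group so you can tell which one matched. The sketch below is in Python; the three sub-patterns are hypothetical stand-ins, since the real ones are not shown in the thread. .NET regex supports the same idea, with named groups written as (?<name>...) instead of (?P<name>...).

```python
import re

# Hypothetical stand-ins for the three real patterns.
COMBINED = re.compile(
    r"\b(?:"
    r"(?P<a>\d{10})"           # pattern 1: a plain 10-digit number
    r"|(?P<b>[A-Z]{3}-\d{7})"  # pattern 2: three letters, a dash, seven digits
    r"|(?P<c>AWB\d{8})"        # pattern 3: the prefix AWB and eight digits
    r")\b"
)


def find_all(text):
    """Return (group name, matched text) for every match of any alternative."""
    return [(m.lastgroup, m.group()) for m in COMBINED.finditer(text)]


print(find_all("1234567890 ABC-1234567 AWB12345678"))
```

Note that this is not quite the same as the fallback approach described above: the combined pattern returns matches from all three alternatives in one scan, whereas the fallback only tries the second pattern when the first finds nothing. If that priority matters, the group names can be used to filter the results afterwards.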