How to split a string based on a certain format using Regex?

Hey folks!

I've been working on a process that performs a data entry function after extracting certain key information from a string of data given in a statement.

So far, I've had some success by breaking the string manipulation into a few steps: first splitting the string by spaces, then searching the resulting array for the keyword I need, and then splitting further before validating the information with regex statements. However, it seems there should be a more efficient way of doing this, by splitting and validating the information at the same time with a regex.

A sample of the type of string I receive would be:

Inward PayWave LBC T17FC0060B PTE 01 SM3P191118448667 C110013055337 OTHER ABC ENGINEERING PTE. LTD. AUD 600

The only information I need is the part starting from “LBC”, together with “T17FC0060B PTE 01”. Users have been instructed to enter this in the format LBC-<9/10 alphanumeric characters>-PTE-01, but as the source comes from a vendor, we have no control over how this information gets entered.

Thanks to the help of other users in this forum, I've been able to validate such entries after splitting, using the following regex statement: System.Text.RegularExpressions.Regex.IsMatch(strData,"^\w{9,10}-?PTE-?0\d$").

However, by first splitting the string by spaces, I would fail to validate entries that were entered in the format shown in the sample above. Would it therefore be possible to split the chunk of string according to "^\w{9,10}-?PTE-?0\d$" instead?
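
To make the limitation concrete, here is a minimal sketch in Python rather than VB.NET (the helper name `is_valid_token` is made up for illustration). It applies the anchored pattern quoted above to individual tokens. The pattern itself uses only constructs that .NET's regex engine also supports. The hyphenated and unseparated forms validate, while the space-separated form does not, and after splitting by spaces none of its pieces validates either.

```python
import re

# Anchored pattern quoted in the question: 9-10 word characters,
# optional hyphen, "PTE", optional hyphen, then "0" and one digit.
TOKEN_PATTERN = re.compile(r"^\w{9,10}-?PTE-?0\d$")

def is_valid_token(token: str) -> bool:
    """Return True if the whole token matches the anchored pattern."""
    return TOKEN_PATTERN.search(token) is not None

print(is_valid_token("T17FC0060B-PTE-01"))   # True
print(is_valid_token("T17FC0060BPTE01"))     # True
print(is_valid_token("T17FC0060B PTE 01"))   # False

# Splitting the space-separated entry first leaves no piece that validates.
pieces = "T17FC0060B PTE 01".split(" ")
print(any(is_valid_token(p) for p in pieces))  # False
```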

Many thanks in advance!

Hi

str_output = System.Text.RegularExpressions.Regex.Match(str_input,"(LBC).*(PTE\s\d+)").ToString

This will give us a value like:
LBC T17FC0060B PTE 01

Cheers @Psyence

@Psyence

Try the regular expression below.

requiresStr = System.Text.RegularExpressions.Regex.Match(inputStr,"LBC\s*\w+\sPTE\s\d{2}").ToString

Hi both! Thanks for the replies!

Both statements work. However, the correct format that users are supposed to enter is along the lines of “LBCT17FC0060B-PTE-01”, where LBC is joined directly to the first character and PTE and 01 are each separated by hyphens.

As this input is determined purely by our external users, we have no control over what they key in, so I would like to cater to as many variations as I possibly can. Is there a way to amend this regex statement to cater for different variations such as missing hyphens, extra spaces etc.?

One way that I've worked out is to first remove hyphens and spaces, then run an edited regex statement like:

requiresStr = System.Text.RegularExpressions.Regex.Match(inputStr,"LBC\w+PTE\d{2}").ToString
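
A minimal sketch of this strip-then-match idea, written in Python rather than VB.NET, with hypothetical names (`COMPACT_PATTERN`, `find_compact_reference`). One caveat: removing spaces and hyphens from the whole statement glues neighbouring fields together, so an unbounded `\w+` could run on into later text. The sketch therefore assumes the 9-10 character length described in the question and bounds the middle part with `\w{9,10}`.

```python
import re
from typing import Optional

# Bounded variant of the pattern above: "LBC", 9-10 word characters,
# "PTE", two digits. The {9,10} bound is an assumption taken from the
# format described in the question.
COMPACT_PATTERN = re.compile(r"LBC\w{9,10}PTE\d{2}")

def find_compact_reference(text: str) -> Optional[str]:
    """Strip whitespace and hyphens, then search for the compact reference."""
    stripped = re.sub(r"[\s-]", "", text)
    match = COMPACT_PATTERN.search(stripped)
    return match.group(0) if match else None

sample = ("Inward PayWave LBC T17FC0060B PTE 01 SM3P191118448667 "
          "C110013055337 OTHER ABC ENGINEERING PTE. LTD. AUD 600")
print(find_compact_reference(sample))                  # LBCT17FC0060BPTE01
print(find_compact_reference("LBCT17FC0060B-PTE-01"))  # LBCT17FC0060BPTE01
```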

Is there a better way to achieve what I want?

@Psyence

Yes, that is also a good solution.

Another alternative is to split the string on the hyphen and read the required values. I think that would be the easiest solution.

str = "LBCT17FC0060B-PTE-01"

str.Split("-"c)(0)  ' LBCT17FC0060B
str.Split("-"c)(1)  ' PTE
str.Split("-"c)(2)  ' 01
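
The split approach depends on the hyphens actually being present. A small Python sketch (the helper `split_reference` is a hypothetical name, and the "exactly three parts" check is an added assumption) shows the well-formed entry splitting cleanly, while a space-separated entry yields no usable parts:

```python
from typing import List, Optional

def split_reference(entry: str) -> Optional[List[str]]:
    """Split on hyphens; return the parts only if there are exactly three."""
    parts = entry.split("-")
    return parts if len(parts) == 3 else None

print(split_reference("LBCT17FC0060B-PTE-01"))   # ['LBCT17FC0060B', 'PTE', '01']
print(split_reference("LBC T17FC0060B PTE 01"))  # None
```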

Yes, of course.
It can be done in a single Regex expression:

str_output = System.Text.RegularExpressions.Regex.Match(str_input,"(LBC).+(PTE\W\d+)|(LBC).+(PTE\s+\d+)|(LBC).+(PTE\d+)").ToString
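
One possible way to fold these alternatives into a single pattern is to allow any run of spaces or hyphens between the parts and to capture the pieces with groups. The following is only a sketch in Python rather than VB.NET. `REFERENCE_PATTERN` and `extract_reference` are hypothetical names, and the 9-10 character bound and the two-digit suffix are assumptions taken from the format described earlier in the thread. The pattern uses only constructs that .NET's regex engine also supports.

```python
import re
from typing import Optional, Tuple

# "LBC", optional separators, 9-10 word characters (group 1),
# optional separators, "PTE", optional separators, two digits (group 2).
# The trailing \b stops the match from ending inside a longer number.
REFERENCE_PATTERN = re.compile(r"LBC[\s-]*(\w{9,10})[\s-]*PTE[\s-]*(\d{2})\b")

def extract_reference(text: str) -> Optional[Tuple[str, str]]:
    """Return (code, suffix) from the first reference found, else None."""
    match = REFERENCE_PATTERN.search(text)
    return (match.group(1), match.group(2)) if match else None

sample = ("Inward PayWave LBC T17FC0060B PTE 01 SM3P191118448667 "
          "C110013055337 OTHER ABC ENGINEERING PTE. LTD. AUD 600")
print(extract_reference(sample))                  # ('T17FC0060B', '01')
print(extract_reference("LBCT17FC0060B-PTE-01"))  # ('T17FC0060B', '01')
print(extract_reference("LBCT17FC0060BPTE01"))    # ('T17FC0060B', '01')
```

Because the separators are optional, the input does not need to be pre-processed, and the captured groups can be reassembled into whichever canonical format the data entry step expects.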

Cheers @Psyence