Regex extraction for line item

Hi all,

I have a string and need to extract the value that comes after a particular word. The value may be on the same line or on the next line.
Ex: In the strings below I need to extract the value ‘XS2031323285’, which comes right after ISIN.

The problem is that I can extract the value when ISIN and ‘XS2031323285’ are on the same line, but when ISIN is on one line and ‘XS2031323285’ is on the next line, I am not able to extract it.

Scenario 1:
INDUSTRY government development banks ISIN XS2031323285
H ABC equals

Scenario 2:
INDUSTRY government development banks ISIN
XS2031323285 H ABC equals
Tea factory 2

In both examples above I need to extract
XS2031323285

How can we do this using regex?

Hi @yashashwini2322

(?<=ISIN\s)[A-Z]+[0-9]{10}

Hope it helps!!

The value does not always have 10 digits.

XS2031323285 was only an example; the value is not constant. It may be XS234
or MS25363936353.

@yashashwini2322

Is ISIN common for you?
If yes, use this regex

(?<=ISIN\s)[A-Z]+[0-9]+

Hope it helps!!
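
A note on the pattern in the reply above: the lookbehind allows exactly one whitespace character after ISIN, so a single "\n" line break still matches, but a Windows-style "\r\n" break (or a trailing space before the break) does not. A minimal sketch in Python, used here only as a quick way to check the pattern; the helper name and the "\r\n" sample are assumptions made for illustration, not part of the original question:

```python
import re

# Pattern from the reply above: exactly one whitespace character after ISIN.
SINGLE_WS = re.compile(r"(?<=ISIN\s)[A-Z]+[0-9]+")

def find_code(text):
    """Return the matched code, or None when nothing matches."""
    m = SINGLE_WS.search(text)
    return m.group(0) if m else None

same_line = "INDUSTRY government development banks ISIN XS2031323285\nH ABC equals"
unix_break = "INDUSTRY government development banks ISIN\nXS2031323285 H ABC equals"
windows_break = "INDUSTRY government development banks ISIN\r\nXS2031323285 H ABC equals"

print(find_code(same_line))      # XS2031323285
print(find_code(unix_break))     # XS2031323285
print(find_code(windows_break))  # None
```

If the input may contain "\r\n" or extra spaces, allowing one or more whitespace characters after ISIN is the safer choice.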

Hi @yashashwini2322

You can use the below regular expressions to extract the required output.

System.Text.RegularExpressions.Regex.Match(yourstringinput.ToString,"((?<=[A-Z]+\s+\n?)[A-Z]+[0-9]+)").Value

Hope it helps!!

I want to extract the word immediately after ISIN, regardless of which digits and characters it contains. It may be on the same line or on the next line.
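
One way to read this requirement is "skip any whitespace after ISIN, then take the next run of non-whitespace characters". A rough sketch of that idea in Python (chosen only for a quick check; the function name and sample inputs are hypothetical):

```python
import re

# \s+ spans spaces and line breaks; (\S+) captures the next run of
# non-whitespace characters, whatever letters or digits it contains.
NEXT_TOKEN = re.compile(r"\bISIN\s+(\S+)")

def next_word_after_isin(text):
    """Return the word right after ISIN, or None if there is none."""
    m = NEXT_TOKEN.search(text)
    return m.group(1) if m else None

print(next_word_after_isin("banks ISIN XS234 H ABC"))             # XS234
print(next_word_after_isin("banks ISIN\n   MS25363936353 H ABC"))  # MS25363936353
print(next_word_after_isin("no marker here"))                      # None
```

Because \S+ accepts any non-whitespace character, punctuation attached to the word would be captured too; a narrower character class avoids that if the codes are known to be alphanumeric.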

Hi

You can try this expression

\b(\w+\d+)\b

It takes any format

Cheers @yashashwini2322

Okay Then @yashashwini2322

Use the below one.

System.Text.RegularExpressions.Regex.Match(yourstringinput.ToString,"((?<=ISIN\s+\n?)[A-Z]+[0-9]+)").Value

I have hardcoded ISIN because it is constant every time, and the expression picks the next code word in the input, as you said.

Hope you understand!!

Hi @yashashwini2322

You can try this regex. It extracts the word after ISIN whether it is on the same line or on the next line, and it picks only the first match.

The full expression:

System.Text.RegularExpressions.Regex.Match(str_input,"(?<=ISIN\s+)[A-Z0-9]+|[0-9A-Z]{10,}").Value.Trim
Note : str_input = "INDUSTRY government development banks ISIN
                                XS2031323285 H ABC equals
                                 Tea factory 2"


@yashashwini2322 try this regex: (?<=ISIN).*|(?<=ISIN\n)[^ ]+

@yashashwini2322

Please try this

(?<=ISIN\s+)\b.*\b

Cheers

hi @yashashwini2322

System.Text.RegularExpressions.Regex.Match(yourstringinput,"(?<=ISIN\s)[A-Z]+[0-9]+").Value

Hope the suggestions helped you solve this

Let us know for further clarification
@yashashwini2322

Hi,

You can use the expression below. It works whether the word is on the same line or on the next line, and for values such as XS234
or MS25363936353.

System.Text.RegularExpressions.Regex.Match(yourInputString, "\bISIN\s+([A-Z0-9]+)").Groups(1).Value
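
The capture-group approach above can be checked against both scenarios from the question. The sketch below uses Python's re module purely as a stand-in for a quick test (the pattern itself is unchanged; the function and variable names are made up for illustration), so treat it as a rough check rather than the exact UiPath behaviour:

```python
import re

# Same idea as the .NET call above: match ISIN, skip any whitespace
# (spaces or line breaks), then capture the code in group 1.
ISIN_CODE = re.compile(r"\bISIN\s+([A-Z0-9]+)")

def extract_isin(text):
    """Return the code after ISIN, or None if ISIN is not followed by one."""
    m = ISIN_CODE.search(text)
    return m.group(1) if m else None

scenario_1 = "INDUSTRY government development banks ISIN XS2031323285\nH ABC equals"
scenario_2 = "INDUSTRY government development banks ISIN\nXS2031323285 H ABC equals\nTea factory 2"

print(extract_isin(scenario_1))    # XS2031323285
print(extract_isin(scenario_2))    # XS2031323285
print(extract_isin("ISIN XS234"))  # XS234
```

Note that [A-Z0-9]+ only accepts uppercase letters and digits, so a code containing lowercase letters or other characters would need a wider character class.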