How to extract data into 4 different variables when the length and values of the data may vary

I need to extract “10”, “11301CD1”, “2”, and the text from “Irrigation” to “BDX”. The lengths may vary, and the strings themselves can differ between files.

The text format of the image is:

“10 11301CD1 2 Irrigation Adaptor, for machine
cleaning, reusable,
for use with Flexible Intubation Video
Endoscopes 11301 BNX and 11302 BDX
Country of origin: EE”

Hi @mhk15,

Can you provide a sample of your expected output?

Regards, Arivu

Expected output
Variable 1= 10
Variable 2= 11301CD1
Variable 3= 2
Variable 4=Irrigation Adaptor, for machine
cleaning, reusable,
for use with Flexible Intubation Video
Endoscopes 11301 BNX and 11302 BDX

The values of the variables may vary, and so may their lengths. The format of the text file is shown in the image. If we use a space as the delimiter, variable 4 cannot be extracted…

I tried using length and Substring, but the problem is that variables 1, 2, and 3 can be longer. The only way to extract this text is to use a space as the delimiter, but variable 4 itself contains spaces.

Hi @mhk15,

Follow the steps below.

Assign the string to strvalue:
strvalue="10 11301CD1 2 Irrigation Adaptor, for machine cleaning, reusable, for use with Flexible Intubation Video Endoscopes 11301 BNX and 11302 BDX Country of origin: EE"

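Use an Assign activity (replace line breaks with a space so that words on adjacent lines are not joined together):
strvalue=strvalue.Replace(Environment.NewLine," ")
Use an Assign activity:
Create a variable arrvalue of type String[]
arrvalue=strvalue.Split(" "c)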

Variable 1=arrvalue(0)
Variable 2=arrvalue(1)
Variable 3= arrvalue(2)
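
The split step above can be checked outside UiPath with a minimal sketch. It is written in Python rather than as VB.NET expressions, and it assumes the first three values never contain whitespace. Limiting the number of splits keeps everything after the third value together, which sidesteps the problem of variable 4 containing spaces (in .NET, the `String.Split` overload that takes a maximum count plays a similar role).

```python
# Sample text from the thread; the line breaks mirror the text file.
raw = (
    "10 11301CD1 2 Irrigation Adaptor, for machine\n"
    "cleaning, reusable,\n"
    "for use with Flexible Intubation Video\n"
    "Endoscopes 11301 BNX and 11302 BDX\n"
    "Country of origin: EE"
)

# Split on whitespace at most three times: the first three tokens become
# variables 1 to 3, and everything after them stays together in `rest`.
var1, var2, var3, rest = raw.split(None, 3)

print(var1, var2, var3)
print(rest)
```

Here `rest` still ends with the "Country of origin" line, so it needs one more trimming step before it can be used as variable 4.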

Variable 4:
Use the Matches activity
Properties
Input: strvalue
Pattern: (?=^([0-9]* [0-9A-Z]* [0-9]*)).*(?=Country of origin:)
Result: iEnumResult -> IEnumerable(Of Match)
After that, use an Assign activity to get the data:
Variable 4=iEnumResult(0).ToString()

Regards,
Arivu

Thanks arivu96,
This code helped to extract the data, but variable 4 still shows all of the data.

Output of variable4 is error2=10 11301CD1 2 Irrigation Adaptor, for machine
cleaning, reusable,
for use with Flexible Intubation Video
Endoscopes 11301 BNX and 11302 BDX

Required output for variable 4 is = Irrigation Adaptor, for machine
cleaning, reusable,
for use with Flexible Intubation Video
Endoscopes 11301 BNX and 11302 BDX

Hi Mehak,

A small tweak to the code is needed to get the fourth value.

Regex pattern (to be used in the Matches activity):

"(?<=("+String.Join(" ",{strval1,strval2,strval3})+"))(\n|.)*(?=(Country of origin:))"

where

  • strval1 to strval3 are the first three values (see the XAML attached below)
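
Why the lookbehind helps can be seen in a small sketch, written here in Python rather than in the Matches activity. The earlier pattern opened with a lookahead `(?=...)`, which checks the text but consumes nothing, so the match still began at the first character. A lookbehind `(?<=...)` makes the match begin after the three values. The use of `re.escape` and the variable names are assumptions added for illustration.

```python
import re

text = (
    "10 11301CD1 2 Irrigation Adaptor, for machine\n"
    "cleaning, reusable,\n"
    "for use with Flexible Intubation Video\n"
    "Endoscopes 11301 BNX and 11302 BDX\n"
    "Country of origin: EE"
)
var1, var2, var3 = "10", "11301CD1", "2"

# Earlier pattern: the leading lookahead is zero-width, so the match
# starts at position 0 and includes the first three values.
old = re.search(
    r"(?=^[0-9]* [0-9A-Z]* [0-9]*).*(?=Country of origin:)", text, re.S
)

# Tweaked pattern: the lookbehind anchors the match right after
# "10 11301CD1 2". re.escape guards against regex metacharacters.
prefix = re.escape(" ".join([var1, var2, var3]))
new = re.search("(?<=" + prefix + r")(\n|.)*(?=Country of origin:)", text)

# Trim the leading space and the trailing line break.
var4 = new.group(0).strip()
print(var4)
```

Trimming the result is worth keeping in the workflow as well, since the raw match begins with the space that follows the third value.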

Note: we could also add a regex pattern that matches the numbers before the word Irrigation…, as Arivu did.
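
One way to read that note is a single pattern that captures all four values at once, so the first three do not have to be known in advance. The sketch below is in Python; the function name `extract_fields` and the group names are hypothetical, and it assumes the first three values contain no whitespace and that the text always ends with a "Country of origin:" line.

```python
import re

# Three whitespace-free tokens, then everything (lazily) up to the
# "Country of origin:" marker. re.S lets "." match line breaks.
FIELDS = re.compile(
    r"^(?P<v1>\S+)\s+(?P<v2>\S+)\s+(?P<v3>\S+)\s+(?P<v4>.*?)\s*Country of origin:",
    re.S,
)

def extract_fields(text):
    """Return the four values as a tuple, or None if the text does not fit."""
    m = FIELDS.search(text)
    if m is None:
        return None
    return m.group("v1"), m.group("v2"), m.group("v3"), m.group("v4")

sample = (
    "10 11301CD1 2 Irrigation Adaptor, for machine\n"
    "cleaning, reusable,\n"
    "for use with Flexible Intubation Video\n"
    "Endoscopes 11301 BNX and 11302 BDX\n"
    "Country of origin: EE"
)
fields = extract_fields(sample)
print(fields)
```

Because the pattern matches any non-whitespace tokens, it keeps working when the lengths or values of the first three fields change between files.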

Reference:

Regex_Extract Values.zip (3.0 KB)

Thanks, it solved the problem…
