Is this solvable with a regex function, and what is the appropriate regex to retrieve the data?


#1

regex
From the text “10 20134083 1 Hot Mirror for 20134020 23,32 23,32
Country of origin: US
delivery date: 09.05.2018 ex works
50 5806412 1 FLUX CREAM NC5070 F-SW 32 14,31 14,31
Country of origin: CH”
the data to be retrieved is Variable 1: 10
Variable 2: 20134083
Variable 3: Hot Mirror for 20134020
Variable 4: 23,32
Variable 5: 23,32
The same applies to the next item. The length of each line may vary, and the string values will vary as well. What is a possible way to extract this data, and if a regex is required, which regex pattern will help?
The text is also provided in the image.


#2

It seems this can be done with String.Split (see the MSDN documentation). I can’t fully judge without seeing how consistent your data is, but I would first split the text into items, so that you have an array of items, and then split each item on spaces to retrieve the variables. The leading indexes appear to stay consistent: in your example, index 0 is variable 1 and index 1 is variable 2.

The item description is more difficult because its length varies, so you’d have to check which array index matches a regex for the pattern DD,DD (D = digit). For the Hot Mirror item the regex matches at index 7, so variable 3 would be the concatenation (for example with String.Join) of indexes 3, 4, 5 and 6. Variable 4 would be index 7 and variable 5 would be index 8.

If you know exactly what to expect, regex is still a nice way to evaluate all your variables. To test your regex patterns, I’d recommend www.regex101.com and http://www.rexegg.com/regex-quickstart.html
Hope this helped.
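
For illustration, here is a minimal sketch of the pure-regex route in Python (the same pattern idea should carry over to other regex engines). It assumes every item line starts with three numeric columns and ends with two amounts in the DD,DD form, as in the sample. The group names var1 to var5 simply mirror the variables in the question, and the third column is skipped because the question does not ask for it.

```python
import re

# Assumed layout, taken from the sample only:
#   <var1> <var2> <third column> <description ...> <var4> <var5>
# [ \t]+ is used instead of \s+ so a match can never run across a line break.
ITEM_RE = re.compile(
    r"^(?P<var1>\d+)[ \t]+(?P<var2>\d+)[ \t]+\d+[ \t]+"
    r"(?P<var3>.+?)[ \t]+"                      # description, as short as possible
    r"(?P<var4>\d+,\d{2})[ \t]+(?P<var5>\d+,\d{2})[ \t]*$",
    re.MULTILINE,
)

text = """10 20134083 1 Hot Mirror for 20134020 23,32 23,32
Country of origin: US
delivery date: 09.05.2018 ex works
50 5806412 1 FLUX CREAM NC5070 F-SW 32 14,31 14,31
Country of origin: CH"""

# One dict per item line; the other lines do not match and are ignored.
items = [m.groupdict() for m in ITEM_RE.finditer(text)]
for item in items:
    print(item)
```

Because the description group is lazy and the two amounts are anchored to the end of the line, a number inside the description (such as 20134020 or 32) stays part of variable 3.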


#3

But the description may contain more data, so we can’t fix the index values to be concatenated in advance. Index 7 could also be part of the description.


#4

Yes, that’s right, but the string concatenation should include the indexes up to the point where your regex matches variable 4, so the range is dynamic. In your second example it would be indexes 3 to 7.
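
The split-and-scan idea can be sketched roughly as follows, here in Python, with `str.split` standing in for String.Split. The helper name `parse_item`, the dictionary keys, and the rule that item lines are the ones starting with a digit are all assumptions made for this example, based only on the sample text.

```python
import re

# A token of the form DD,DD (digits, comma, two digits); format assumed from the sample.
AMOUNT_RE = re.compile(r"\d+,\d{2}")

def parse_item(line):
    """Split one item line on whitespace and locate the description dynamically."""
    tokens = line.split()
    # Scan from index 3 (where the description starts) for the first amount token.
    for i in range(3, len(tokens)):
        if AMOUNT_RE.fullmatch(tokens[i]):
            break
    else:
        return None  # no amount found, so this is not a well-formed item line
    if i + 1 >= len(tokens):
        return None  # variable 5 is missing
    return {
        "var1": tokens[0],
        "var2": tokens[1],
        "var3": " ".join(tokens[3:i]),  # everything up to the first amount
        "var4": tokens[i],
        "var5": tokens[i + 1],
    }

text = """10 20134083 1 Hot Mirror for 20134020 23,32 23,32
Country of origin: US
delivery date: 09.05.2018 ex works
50 5806412 1 FLUX CREAM NC5070 F-SW 32 14,31 14,31
Country of origin: CH"""

# Assumption: only item lines begin with a digit.
items = [parse_item(line) for line in text.splitlines() if line[:1].isdigit()]
for item in items:
    print(item)
```

If a description could itself contain a token that looks like 12,50, taking the last two tokens of the line as variables 4 and 5 would be the safer variant.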