Need to extract values/numbers based on rules and add them

Hi all,

**I have below type of data in excel column which is extracted from app and text added to DT then excel .Here I need 2 results , **

**IND can be DBA or STA or GJK **
after this u can see number and values, 1) I need to add MR and MP values 2) Need to add NON MR and NON MP values ( ZT and F9 here)

there can be 2 or more MR, more MP, or no MP MR . if no MP MR then output should be 0
MP and MR are fix, but instead of ZT F9 can be any alphabets or alphanumeric of 2 digit. but start ill be fix IND or DBA or STA or GJK

DATA AS below :

hjdsgdgs bdjashd
T T 23 3B xt hk 8 . 78 gh . 89 hj
IND 120000-MR
IND 1200-ZT F9 IDT 567889-G6
IND 120000-MP I5 IND 6777887-BP
IND 1278-F9 shd jtjjh 8990 46-MP
IND 123

samples are attached in excel for me there will be only 1 sheet at a time as a transaction.
, pls help this needs Linq or regex for sure. I know basics but cant build complex hence posting here.
Input.xlsx (16.4 KB)

Hi @8b6861dc5e0c9f7008548ca66

Try this

 MRValues = (From row In dt.AsEnumerable()
                   Let line = row(0).ToString()
                   Where System.Text.RegularExpressions.Regex.IsMatch(line, "(IND|DBA|STA|GJK) \d+-MR")
                   Select Convert.ToInt32(System.Text.RegularExpressions.Regex.Match(line, "\d+").Value)).Sum()

MPValues = (From row In dt.AsEnumerable()
                   Let line = row(0).ToString()
                   Where System.Text.RegularExpressions.Regex.IsMatch(line, "(IND|DBA|STA|GJK) \d+-MP")
                   Select Convert.ToInt32(System.Text.RegularExpressions.Regex.Match(line, "\d+").Value)).Sum()

OtherValues = (From row In dt.AsEnumerable()
                      Let line = row(0).ToString()
                      Where System.Text.RegularExpressions.Regex.IsMatch(line, "(IND|DBA|STA|GJK) \d+-(?!MR|MP)[A-Z0-9]+")
                      Select Convert.ToInt32(System.Text.RegularExpressions.Regex.Match(line, "\d+").Value)).Sum()

Regards,

1 Like

Thanks for your reply, getting end of expression expected error in 1st , expression , cross checked 3 4 times.

@8b6861dc5e0c9f7008548ca66

Can you please share the Screenshot of Error and Code

Regards,

I apologies, my mistake . I missed opening bracket (From.

I shall check all expressions and will revert here.
TIA1

1 Like

hi @lrtetala
thanks a lot , 1st 2 expressions are working fine for integers, since values can also be in decimal like 56.7, 2322.78 etc , I changes the variable types to double and also changed .ToDouble in code.
But it’s still giving me 0 result.

for INT it’s perfect , I think do we also have to change something in regex if i ant decimal nums ? pls help

in that case will it work for int/double/decimal etc?

@8b6861dc5e0c9f7008548ca66

Can you try the below i have modified regex pattern and Convert ToDouble

MRValues = (From row In dt.AsEnumerable()
                   Let line = row(0).ToString()
                   Where System.Text.RegularExpressions.Regex.IsMatch(line, "(IND|DBA|STA|GJK) \d+(\.\d+)?-MR")
                   Select Convert.ToDouble(System.Text.RegularExpressions.Regex.Match(line, "\d+(\.\d+)?").Value)).Sum()

MPValues = (From row In dt.AsEnumerable()
                   Let line = row(0).ToString()
                   Where System.Text.RegularExpressions.Regex.IsMatch(line, "(IND|DBA|STA|GJK) \d+(\.\d+)?-MP")
                   Select Convert.ToDouble(System.Text.RegularExpressions.Regex.Match(line, "\d+(\.\d+)?").Value)).Sum()

OtherValues = (From row In dt.AsEnumerable()
                      Let line = row(0).ToString()
                      Where System.Text.RegularExpressions.Regex.IsMatch(line, "(IND|DBA|STA|GJK) \d+(\.\d+)?-(?!MR|MP)[A-Z0-9]+")
                      Select Convert.ToDouble(System.Text.RegularExpressions.Regex.Match(line, "\d+(\.\d+)?").Value)).Sum()

Regards,

1 Like

hi @lrtetala

thanks for your instant replies!
**For expression 1 and 2->I am able to get the decimal/ non decimal values and their addition too. :slight_smile: only 1 challenge is if there are more space between IND
120000-MP then it is giving 0 value.

IND < 1 and 2 space working > more than 2 space giving 0 as a output. I tested on many scenarios.

**For expression 3 β†’ , it is giving values, only of IND 1200-ZT this kind but IND 1278-F9, not giving 1278 I think it is taking only the values having 2 alphabet , and not F9 /D9 types and not adding .

pls help

@8b6861dc5e0c9f7008548ca66

To handle cases with varying spaces between the prefix and the number, and to capture both alphabet and alphanumeric suffixes accurately,

MRValues = (From row In dt.AsEnumerable()
                   Let line = row(0).ToString()
                   Where System.Text.RegularExpressions.Regex.IsMatch(line, "(IND|DBA|STA|GJK)\s+\d+(\.\d+)?\s*-MR")
                   Select Convert.ToDouble(System.Text.RegularExpressions.Regex.Match(line, "\d+(\.\d+)?").Value)).Sum()

MPValues = (From row In dt.AsEnumerable()
                   Let line = row(0).ToString()
                   Where System.Text.RegularExpressions.Regex.IsMatch(line, "(IND|DBA|STA|GJK)\s+\d+(\.\d+)?\s*-MP")
                   Select Convert.ToDouble(System.Text.RegularExpressions.Regex.Match(line, "\d+(\.\d+)?").Value)).Sum()

OtherValues = (From row In dt.AsEnumerable()
                      Let line = row(0).ToString()
                      Where System.Text.RegularExpressions.Regex.IsMatch(line, "(IND|DBA|STA|GJK)\s+\d+(\.\d+)?\s*-(?!MR|MP)[A-Z0-9]{2}")
                      Select Convert.ToDouble(System.Text.RegularExpressions.Regex.Match(line, "\d+(\.\d+)?").Value)).Sum()

Regards,

1 Like

hi @lrtetala
**For expression 1 and 2-> working as expected and fine :slight_smile: thanks !

**For expression 3- Now it is adding Alphanumeric and alpa suffixes but it is including MR or MP also in addition , we need non MR MP to be added.

(?!MR|MP means to exclude right , still its considering. if it got fixed then all will be ok.
e.g.
IND 120000-MR
IND 1200-ZT F9 IDT 567889-G6
IND 120000-MP I5 IND 6777887-BP
IND 1278-F9 shd jtjjh 8990 46-MP
IND 123

expected output from 3rd expression is β†’ 1200 (ZT)+1278(F9)=2478
but current expression giving MR+ZT+MP+F9

TIA !

@lrtetala

for now I am subtracting, mp mr total from result of expression 3 and that is satisfying my requirement . thank you so much !!!

@8b6861dc5e0c9f7008548ca66

Try this

(IND|DBA|STA|GJK)\s+\d+(\.\d+)?\s*-(?!MR|MP|BP)[A-Z0-9]{2}

(From row In dt.AsEnumerable()
                      Let line = row(0).ToString()
                      Where System.Text.RegularExpressions.Regex.IsMatch(line, "(IND|DBA|STA|GJK)\s+\d+(\.\d+)?\s*-(?!MR|MP|BP)[A-Z0-9]{2}")
                      Select Convert.ToDouble(System.Text.RegularExpressions.Regex.Match(line, "\d+(\.\d+)?").Value)).Sum()
1 Like

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.