I am trying to remove duplicates in a data table using the following assign activities of
OpDataTable = (From i In OpDataTable.AsEnumerable
Group i By s=i(“ADD 1”).toString.Trim.ToUpper Into grp=Group
Select s = grp.First()).CopyToDataTable
This is not getting the best results as it seems to be skipping a name in the data due to the address matching. My goal is to remove all the duplicates and be left with the lowest EFF DT for each name , NPI and location so Here is the raw data
If you look at the data above what I have is duplicate records for one NPI - of the same name at the same location but with different EFF dates and they would like to sort the data and remove all the duplicates and only keep the least effective date for that grouping. I tried to do it if you look at the 2nd image, but that didn’t work so well . I only want one row of NPI, Name, address and least Effective date, my data is probably grabbing the 1st effective date.
Here is a small sample file in it you can see duplicates for one NPI only difference is date and address so we only want one of each address with the least effect date to be written to another sheet. sampledata.xlsx (10.8 KB)
Well I made a change OpDataTable.AsEnumerable().GroupBy(Function(a) Tuple.Create(a(“ADD 1”).ToString, a(“PRIMARY_NPI”).ToString)).Select(Function (b) b.First).CopyToDataTable() and that gave me all the data I needed except it gives me the 1st Effective date not sure I could change this to get me the lowest effective date.