How to remove duplicate row based on condition

Hello,

I have a problem statement to remove a row from excel where the subject and date cell equals to the other subject and row cell in another row. Basically, my idea is to have only one row where the subject is unique based on the date.

I am attaching sample exceldemo.xlsx (9.0 KB)

thank you in advance

1 Like

Can you post the screen shot of the excel file.

I have already attached the excel sheet.

Hi @mukesh_behera

You need to remove the duplicates from ‘Subject’ column. I have one doubt here.

you mentioned ‘based on the date’. I’m not cleared on that. Could you please elaborate on this.

Thanks.

@mukesh_behera
working with this data which is similar to your case:
grafik

deletion on excel can be done with delete row activity using the row index.

We can retrieve it with this LINQ:
grafik

(From t In dtData.AsEnumerable.Select(Function (x,i) Tuple.Create(i,x))
Group t By k1=t.Item2("Subject").toString.Trim, k2=t.Item2("Date").toString.Trim.Substring(0,10) Into grp=Group
Where grp.Count > 1
Let il =grp.Skip(1).Select(Function (tg) tg.Item1).toList
Select il).SelectMany(Function (sm) sm).OrderByDescending(Function (o) o).toList

Result (for this sample data):
grafik

Afterwards we can use this list within a for each and delete the rows. For excel we have to take about the offset as e.g. it has to be corrected for 1row= header name

Find starter Help here:
Get_Index_NonDistincts_2Col.xaml (7.5 KB)

Doing it directly in the datatable we shared with you a starter help on your other topic:

Hi @mukesh_behera ,

I created a practice workflow. Check if it matches your requirement. :slight_smile:

Input:

Output:

LINQ:
(From d In dt_Input.AsEnumerable Group d By k1=d(1).ToString.Trim, k2=d(2).ToString.Trim Into grp=Group Let a = grp.Select(Function ( r ) r(0).toString.Trim).ToList.First Let b = CDate(k2.ToString).ToString("dd-MM-yyyy") Let row = New Object() {a, k1, b} Select dt_Output.Rows.Add(row)).CopyToDataTable

Workflow:
Sample.zip (10.2 KB)

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.