Can't delete duplicate rows based on columns and keeping the one matching a condition in an other column

Hi there,

I’m currently working with dataTables and I’m having trouble deleting rows based on conditions.

For example I have the following dataframe :

I want to remove duplicates rows based on the Columns BASE_TTW and IMME but I want to keep the one which has “Ligne Valide” in the “Traitement” column.

For example the final dataTable should look like that :

image

first I remove duplicates for all columns.

Then I tried to create different dataTable using filter datatable activity and remove duplicate rows but it seems that that last activity can’t take conditions.

Is there a way to integrate SQL conditions in?

Maybe a solution using with Excel activity could be easier?

If you guys have any idea, it would be grateful to share it.

Thanks in advance,
Alexandre.

1 Like

@Alexandre_BARRERE
working with a similar sample dataset:
grafik

following LINQ can help:

(From d In dtData.AsEnumerable
Group d By k1=d(0).toString, k2=d(1).toString Into grp=Group
Let dedup = grp.Distinct(DataRowComparer.Default).toList
Let dedupf = dedup.Where(Function (x) x(2).toString.trim.Equals("Valid")).toList
Let fres = If(dedupf.Count >0, dedupf, dedup)
Select fres).SelectMany(Function ( r )  r).CopyToDataTable 
  • we do group the data by the two columns (col1, col2)
  • Let dedup = grp.Distinct(DataRowComparer.Default).toList - removing duplicates from groups
  • Let dedupf = dedup.Where(Function (x) x(2).toString.trim.Equals(“Valid”)).toList - filtering on valid
  • Let fres = If(dedupf.Count >0, dedupf, dedup) = if the valid status is present use the filtered list or not

Result:
grafik

find starter help here:
Group_2Col_CondFilterOnValueExistence.xaml (7.6 KB)