Help required to remove duplicate rows based on a combination of columns

PR_header.xlsx (113.4 KB)

In the attached file, the first four columns are keys. If that combination repeats, I should keep only the row with the maximum value in the Hdr_seq column for processing. Please suggest an approach for doing this as soon as possible. I’m on Community Edition 19.7.

Hi @Ramki81

You can do this easily if you get the below component installed.

  1. Use Read Range to read the Excel file into a data table.
  2. The above-mentioned component includes an activity named Data Table Consolidate. Using this activity, you can group the data table by the combination of the four key columns and get the maximum of the Hdr_Seq column into another data table. The aggregated data table will then contain the four key columns and the maximum of the Hdr_Seq column.
  3. Next, take the two data tables and do an inner join with the Join Data Tables activity, matching on the four key columns and Hdr_Seq, to filter the full dataset down to the required output :slight_smile:
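For anyone who wants to see the logic of steps 2 and 3 outside of Studio, here is a minimal Python sketch of the same idea: find the maximum Hdr_Seq per key combination, then keep only the rows of the full table that match it. This is not the attached workflow or the component's implementation. The function name `keep_max_rows`, the key column names `Key1` to `Key4`, and the `Note` column are hypothetical placeholders, since the real column names are in the attached Excel file.

```python
def keep_max_rows(rows, key_cols, value_col):
    """Keep, for each key combination, the row(s) holding the maximum value_col."""
    # Step 2 equivalent: group by the key columns and record the maximum per group.
    max_by_key = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        if key not in max_by_key or row[value_col] > max_by_key[key]:
            max_by_key[key] = row[value_col]

    # Step 3 equivalent: inner join the full table against the aggregate,
    # matching on the key columns plus the maximum value.
    return [
        row for row in rows
        if row[value_col] == max_by_key[tuple(row[c] for c in key_cols)]
    ]


# Hypothetical sample data; the real column names come from the Excel file.
KEY_COLS = ["Key1", "Key2", "Key3", "Key4"]
rows = [
    {"Key1": "A", "Key2": 1, "Key3": "X", "Key4": 10, "Hdr_Seq": 1, "Note": "old"},
    {"Key1": "A", "Key2": 1, "Key3": "X", "Key4": 10, "Hdr_Seq": 3, "Note": "new"},
    {"Key1": "B", "Key2": 2, "Key3": "Y", "Key4": 20, "Hdr_Seq": 1, "Note": "only"},
]

result = keep_max_rows(rows, KEY_COLS, "Hdr_Seq")
for row in result:
    print(row)
```

One thing to be aware of: if two rows share the same keys and the same maximum Hdr_Seq, both survive the join, so a further de-duplication step would be needed in that case.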

I have done it here for you as an example of how to use the component and to demonstrate how to get it done easily… You may tweak it as you need. It produces your required output.

Make sure to install the above-mentioned component before opening the file. You can install it through the Package Manager in Studio.

RemoveDuplicatesByGettingMaxValue.xaml (9.6 KB)

Let me know whether it helps…

Super buddy, it works. Thank you very much for the timely help.

I have one small issue: in the final data table, the Hdr_Seq column has become the last column. It is supposed to be in column 5 (column E), as in the source file.

I have fixed it by using the Invoke Method activity to change the column's position.
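The post does not say which method was invoked; on a .NET DataTable, one candidate would be `DataColumn.SetOrdinal`, but that is an assumption. As a rough illustration of the reordering itself, here is a small Python sketch that moves Hdr_Seq back to the fifth position (index 4). The function name `move_column` and the sample column names are hypothetical.

```python
def move_column(rows, column, position):
    """Return copies of the rows with `column` moved to the zero-based `position`."""
    if not rows:
        return []
    # Build the new column order from the first row, then re-create each row in it.
    order = [name for name in rows[0] if name != column]
    order.insert(position, column)
    return [{name: row[name] for name in order} for row in rows]


# Hypothetical joined output where Hdr_Seq ended up as the last column.
joined = [
    {"Key1": "A", "Key2": 1, "Key3": "X", "Key4": 10, "Note": "new", "Hdr_Seq": 3},
]

reordered = move_column(joined, "Hdr_Seq", 4)  # index 4 is column E in Excel
print(list(reordered[0]))
```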
