How to delete duplicate rows in Excel file based on multiple columns and with specific check on one column

WASEEM_KHAN · August 22, 2021, 3:56pm

Hi,

I know questions like this are asked frequently, but the existing answers did not solve my problem.

I have an Excel file that contains about half a million rows. An example picture follows:

First, I want to check whether there are any duplicate rows based on the first six columns (Customer, Document, Item, Material, Date, Origin) and, if there are, delete the entire duplicate rows up to the column “Status”.

The second check is on the column “Datum Satzerzeugung” (record creation date). If rows are duplicates based on the above six columns but the date in “Datum Satzerzeugung” differs, delete the row with the later date and keep the row with the earlier date. For example, look at the figure below:

Rows 7 and 8 are duplicates based on the first six columns, but the date in “Datum Satzerzeugung” differs between them. So, delete the entire row 8 and keep row 7.
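The rule described above can be sketched outside UiPath in a few lines of Python. This is only an illustration under stated assumptions, not the workflow from this thread: the rows are plain dictionaries, the date format `%d.%m.%Y` is a guess (the screenshots are not available), and the helper names `drop_duplicates_keep_oldest` and `make_row` as well as the sample values are hypothetical.

```python
from datetime import datetime

# Column names taken from the question; adjust them to the real headers.
KEY_COLUMNS = ("Customer", "Document", "Item", "Material", "Date", "Origin")
CREATED_COLUMN = "Datum Satzerzeugung"


def drop_duplicates_keep_oldest(rows, date_format="%d.%m.%Y"):
    """Keep one row per six-column key, preferring the earliest creation date.

    The date format is an assumption; change it to match the workbook.
    """
    kept = {}  # key tuple -> (parsed creation date, row)
    for row in rows:
        key = tuple(row[column] for column in KEY_COLUMNS)
        created = datetime.strptime(row[CREATED_COLUMN], date_format)
        # Strict "<" means that on equal dates the first row seen is kept.
        if key not in kept or created < kept[key][0]:
            kept[key] = (created, row)
    return [row for _, row in kept.values()]


def make_row(document, created, status):
    # Illustrative sample values only, not data from the original workbook.
    return {"Customer": "C1", "Document": document, "Item": "10",
            "Material": "M1", "Date": "01.08.2021", "Origin": "DE",
            CREATED_COLUMN: created, "Status": status}


sample = [
    make_row("D1", "05.08.2021", "newer"),
    make_row("D1", "02.08.2021", "older"),
    make_row("D2", "03.08.2021", "only"),
]
result = drop_duplicates_keep_oldest(sample)
print([row["Status"] for row in result])  # prints ['older', 'only']
```

A single pass with a dictionary keeps memory proportional to the number of distinct keys, which should be workable for roughly half a million rows.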

Please provide a complete solution, as my knowledge of LINQ expressions is limited. I managed to delete rows based on a single column, but I could not get it to work for multiple columns.

AshwinS2 · August 22, 2021, 7:11pm

Hi @WASEEM_KHAN

Try this

Thanks
Ashwin.S

Dawodm · August 22, 2021, 10:08pm

Hi @WASEEM_KHAN

Please check this video: UiPath | Remove Duplicate Rows from Excel / DataTable using two columns | Delete Duplicate Rows LINQ - YouTube

Best regards
Mahmoud

kumar.varun2 · August 23, 2021, 4:57am

@WASEEM_KHAN

Can you share a sample Excel file?

I have designed a workflow on dummy data. You can tailor it to your requirements.

The LINQ used is


(From row In inputDT
Group row By
k1 = row("ID").ToString,
k2 = row("DocNum").ToString
Into grp = Group
Let fr = grp.OrderBy(Function(x) CDate(x("Date").ToString)).First()
Select outDT.Rows.Add({k1, k2, fr("Date"), fr("Status")})).CopyToDataTable

For your reference

LINQ For Group By Multiple Columns.xaml (11.9 KB)

WASEEM_KHAN · September 26, 2021, 2:25pm

@kumar.varun2 I need your help. I now have the data in a SQL table on SQL Server, and I want to do the same thing on that table. Could you please modify your solution above so that I can apply it to the SQL table?
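One possible way to approach the same clean-up in SQL is to rank the rows inside each six-column group by creation date and delete everything after the first. The sketch below is only an assumption-laden illustration: it uses Python's built-in `sqlite3` with an in-memory SQLite database as a stand-in for SQL Server so that it runs anywhere, and the table name `orders`, the `id` key, and all column names and sample values are hypothetical. On SQL Server, the same `ROW_NUMBER()` ranking is commonly placed in a CTE that is then deleted from directly; that variant has not been run here.

```python
import sqlite3

# Hypothetical column names standing in for the six key columns.
KEYS = ["customer", "document", "item", "material", "doc_date", "origin"]
key_list = ", ".join(KEYS)

# Rank rows per key by creation date (oldest first, id as tie-breaker)
# and delete every row that is not ranked first.
DEDUPE_SQL = f"""
DELETE FROM orders
WHERE id IN (
    SELECT dup_id FROM (
        SELECT id AS dup_id,
               ROW_NUMBER() OVER (
                   PARTITION BY {key_list}
                   ORDER BY created, id
               ) AS rn
        FROM orders
    )
    WHERE rn > 1
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, document TEXT, "
    "item TEXT, material TEXT, doc_date TEXT, origin TEXT, created TEXT, status TEXT)"
)
# Illustrative rows only; dates are ISO strings so text ordering matches date ordering.
rows = [
    ("C1", "D1", "10", "M1", "2021-08-01", "DE", "2021-08-05", "newer"),
    ("C1", "D1", "10", "M1", "2021-08-01", "DE", "2021-08-02", "older"),
    ("C1", "D2", "10", "M1", "2021-08-01", "DE", "2021-08-03", "only"),
]
conn.executemany(
    "INSERT INTO orders (customer, document, item, material, doc_date, origin, "
    "created, status) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    rows,
)
conn.execute(DEDUPE_SQL)
remaining = [r[0] for r in conn.execute("SELECT status FROM orders ORDER BY id")]
print(remaining)  # prints ['older', 'only']
```

It is probably worth running the inner `SELECT` on its own first, to review which rows would be removed before executing the `DELETE` on real data.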
