I know questions like this are asked frequently, but the existing answers did not solve my problem.
I have an Excel file that contains about half a million rows. An example picture follows:
First, I want to check whether there are any duplicate rows based on the first six columns (Customer, Document, Item, Material, Date, Origin). If there are, I want to delete the entire duplicate row, up to and including the column “Status”.
The second check is on the column “Datum Satzerzeugung”. If rows are duplicates based on the six columns above but the date in “Datum Satzerzeugung” differs, I want to delete the row with the later date and keep the row with the older date. For example, see the figure below:
Rows 7 and 8 are duplicates based on the first six columns, but the date in “Datum Satzerzeugung” differs between them. So the entire row 8 should be deleted and row 7 kept.
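To make the rule unambiguous, here is a minimal sketch of the logic I am after. It is written in Python only to pin down the expected behaviour; it is not the C#/LINQ solution I need. The function name `drop_duplicates`, the dictionary layout, and all sample values are made up for illustration and are not taken from my real file. I am also assuming that when two duplicates have the same creation date, the first one should be kept.

```python
from datetime import date

# The six columns that define a duplicate, and the creation-date column.
KEY_COLUMNS = ("Customer", "Document", "Item", "Material", "Date", "Origin")
CREATED = "Datum Satzerzeugung"


def drop_duplicates(rows):
    """Keep one row per key: the row with the oldest creation date.

    Ties (identical creation dates) keep the first row encountered.
    The original row order is preserved in the result.
    """
    best = {}  # key tuple -> index of the row to keep
    for i, row in enumerate(rows):
        key = tuple(row[c] for c in KEY_COLUMNS)
        j = best.get(key)
        if j is None or row[CREATED] < rows[j][CREATED]:
            best[key] = i
    return [rows[i] for i in sorted(best.values())]


# Hypothetical sample data: the first two rows share the same six key values.
rows = [
    {"Customer": "C1", "Document": "D1", "Item": 10, "Material": "M1",
     "Date": date(2016, 1, 4), "Origin": "X",
     CREATED: date(2016, 1, 5), "Status": "open"},
    {"Customer": "C1", "Document": "D1", "Item": 10, "Material": "M1",
     "Date": date(2016, 1, 4), "Origin": "X",
     CREATED: date(2016, 1, 7), "Status": "open"},
    {"Customer": "C2", "Document": "D2", "Item": 20, "Material": "M2",
     "Date": date(2016, 1, 4), "Origin": "Y",
     CREATED: date(2016, 1, 6), "Status": "closed"},
]

result = drop_duplicates(rows)
```

With this sample, the second row is dropped because it has the same key as the first row but a later “Datum Satzerzeugung”, which matches what I expect for rows 7 and 8 in my file.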
Please provide a complete solution, as my knowledge of LINQ expressions is limited. I managed to delete rows based on a single column, but I could not get it to work for multiple columns.