Compare multiple columns in excel

@nabeelu

About the Scenario and Approach:
Sheet 1 = D1
Sheet 2 = D2
Definition of a Group: D1.ID
Definition of a Match: In a Group D1.Value = D2.Value

  • Group D1 Data
  • Process every Group one by one - D1Stack
  • Retrieve from D2 all items having the same Group Key - D2 Stack
  • Find the matches between D1Stack And D2 Stack - Matches
  • Remove Matches D1Matches from D1Stack, Similar doing it with D2
  • take from D2Stack items of the count of remaining elements in D1Stack or less if D2Stack.Count < D1Stack Count
  • Build the not matching pairs: d1(0)d1(1)d2(1)“Fail” - Unmatches
  • Remove from D1Stack the d1 Unmatches
  • Build the not having any d2 items d1(0),d1(1),“”,“Not Present” - Unbounds

About the Implementation:
50000 rows are a hint on Performance and more important on reliable and testable implementation. Doing it all in a Single LINQ will not adress these quality factors. So it was build by a bottom to up Approach. Does mean: within the different stages we have the chance to test the part results.

Example Log:
grafik

Implementation has the risks of fails when:

  • Duplicates will occur
  • Is not handling the case D2Stack.Count > D1Stack (is doing now a warning log)

Find starter help here
nabeelu_V2.xaml (19.7 KB)

1 Like