About the Scenario and Approach:
Sheet 1 = D1
Sheet 2 = D2
Definition of a Group: D1.ID
Definition of a Match: In a Group D1.Value = D2.Value
- Group D1 Data
- Process every Group one by one - D1Stack
- Retrieve from D2 all items having the same Group Key - D2 Stack
- Find the matches between D1Stack And D2 Stack - Matches
- Remove Matches D1Matches from D1Stack, Similar doing it with D2
- take from D2Stack items of the count of remaining elements in D1Stack or less if D2Stack.Count < D1Stack Count
- Build the not matching pairs: d1(0)d1(1)d2(1)“Fail” - Unmatches
- Remove from D1Stack the d1 Unmatches
- Build the not having any d2 items d1(0),d1(1),“”,“Not Present” - Unbounds
About the Implementation:
50000 rows are a hint on Performance and more important on reliable and testable implementation. Doing it all in a Single LINQ will not adress these quality factors. So it was build by a bottom to up Approach. Does mean: within the different stages we have the chance to test the part results.
Example Log:
Implementation has the risks of fails when:
- Duplicates will occur
- Is not handling the case D2Stack.Count > D1Stack (is doing now a warning log)
Find starter help here
nabeelu_V2.xaml (19.7 KB)