I am trying to compare two excel sheet that have thousands of data. I would like to know how can these be compared quickly without using for each row activity
Sheet 1 has two columns: Id and Source Value.
Sheet 2 has two columns: Id and Target value
I want an output in sheet 3 with the columns Id, Source value, Target value and status
For each Id, it should populate corresponding source value from sheet 1 and target value from sheet 2 and update status as ‘Pass’ if source and target value matches or update as ‘Fail’
Use for each datarow on excel datatable 1 and inside that use another for each datarow on excel datatable 2
In the inside for each use IF condition to check the ID column matches,
if matches -
{ use another IF condition to check Source from Datatable1 = target from datatable 2,
if matches - use add datarow to output datatable and add the values in array**
else - Continue**
}
@Mihir_Mapara
Thanks for your solution. But my excel data can have 50000 row or more. Using for each row will take lot of time to process. Apart from for each row, is there any other option probably using code
For id 5, in source I have values o,e and u
In target I have value o and t
So I need one row to have 5,o,o,Pass (as given in your result)
Other two rows I need value of t for source (5,e,t,NotPresent) for one row and blank (5,u,NotPresent) in other row
For Id 5- Value in source is e, o, u
Value in target is o and t
As per your result for all source of 5 (e, o and u) target value matched is o
But my scenario is it should be one to one mapping
i.e Result 1: (5,o,o,PAss) value of target and source is o and o
Result 2: Second result it should not consider value o again for target. It should consider only target value t and blank
It should be
5, e,t,Not Present or 5, e,NotPresent (i.e for target it should consider only t or blank as o was already considered in previous result (Result 1))
Result 3: 3rd result of 5 should be - 5,u,NotPresent (if target value of t was considered in prev result (Result 2)) or 5,u,t,NotPResent (if target value blank was considered in prev result (Result 2))
can you send the excel with that you and me are working with the same data. I will have a look on it after my job work. But please let the mapping dict excact same as the corrected one so we can identify the issue faster
About the labels Pass, Faill, Not Present we can change later as by your needs. Currently I am more focused on a bug free implementation. Also do not focus on teh changed order (e.g all 5’s together).
Thanks
About the Scenario and Approach:
Sheet 1 = D1
Sheet 2 = D2
Definition of a Group: D1.ID
Definition of a Match: In a Group D1.Value = D2.Value
Group D1 Data
Process every Group one by one - D1Stack
Retrieve from D2 all items having the same Group Key - D2 Stack
Find the matches between D1Stack And D2 Stack - Matches
Remove Matches D1Matches from D1Stack, Similar doing it with D2
take from D2Stack items of the count of remaining elements in D1Stack or less if D2Stack.Count < D1Stack Count
Build the not matching pairs: d1(0)d1(1)d2(1)“Fail” - Unmatches
Remove from D1Stack the d1 Unmatches
Build the not having any d2 items d1(0),d1(1),“”,“Not Present” - Unbounds
About the Implementation:
50000 rows are a hint on Performance and more important on reliable and testable implementation. Doing it all in a Single LINQ will not adress these quality factors. So it was build by a bottom to up Approach. Does mean: within the different stages we have the chance to test the part results.
Example Log:
Implementation has the risks of fails when:
Duplicates will occur
Is not handling the case D2Stack.Count > D1Stack (is doing now a warning log)