Compare Excel based on Primary Column Value

Hello All
Good Day!

I am new to UI Path need help on one requirement. I have two Excel Sheets. Both has 4 Columns ( Person ID, Tax1, Tax2 and Tax 3). Lets assume that Peron ID 1 exist in both sheets. Now i want to compare based on below condition -

If Sheet1.Person_id=Sheet.Person_id Then
(
if sheet1.tax1 <> sheet2.tax1 or
sheet1.tax2 <> sheet2.tax2 or
sheet1.tax2 <> sheet2.tax2 or Then

(
Write to another Excel that tax are not matching person id 1.
)
)

Person Id 1 can be 2nd row in Sheet 1 and 20th row in Sheet 2. I have reviewed some of the Excel Compare posts but could not find the exact solution.

Thanks everyone for anticipated help!

Thanks
Gaurav

Hello UI Path Gurus

Could you please help me in my requirement?

Thanks
Gaurav

@aksh1yadav…Hi Akshay…could you please review my requirement, i saw many of your post for similar Topics. Thanks in advance for your anticipated help!

Thanks again
Gaurav

Hi @gauravb20ui
1.use multiple read range
2.use for each based on datatable
3.use if row ("person_id).equals row1 (“person_id”) then
If not row (“tax_id”).equals row1 (“tax_id”) Use conditions like that and implement it

Thanks
Ashwin.S

1 Like

Hi @gauravb20ui,

Requirement :

image

Solution : Using LINQ

image

PFA project file for your requirement,
GetRowsBy_MatchOneColumn_NoMatchThreeColumn.zip (10.3 KB)

2 Likes

Thank You so much @Dominic…it is working fine. I have below two points where i need your help-

  1. If i have to pull Row from both Sheets if there is mismatch so that user can see what exactly not matching b/w Sheet1 and Sheet2.

  2. how i can find the person ids which are either missing in Sheet1 or Sheet 2 and report it into as separate sheet in output file?

Thanks again
Gaurav

Hi @Dominic

Could you please let me know how i can select rows from both sheet 1 and Sheet 2 if condition matches. I have below select statement which is only pulling data from sheet1.

(From rowSheet1 In dtInputExcelSheet1.Select()
Where (From rowSheet2 In dtInputExcelSheet2.Select()
Where Convert.ToInt32(rowSheet1(“PersonId”).ToString).Equals (Convert.ToInt32(rowSheet2(“PersonId”).ToString)) And
( Convert.ToInt32(rowSheet1(“Tax1”).ToString)-(Convert.ToInt32(rowSheet2(“Tax1”).ToString))>500 Or
Convert.ToInt32(rowSheet1(“Tax2”).ToString)-(Convert.ToInt32(rowSheet2(“Tax2”).ToString))>500 Or
Convert.ToInt32(rowSheet1(“Tax3”).ToString) - (Convert.ToInt32(rowSheet2(“Tax3”).ToString) )>500)
Select rowSheet2).Count>0
Select rowSheet1).CopyToDatatable()

Many thanks for all your help!
Gaurav

Hi @gauravb20ui, Can you give us the condition ? Sample excel with input and executed output will be more helpful :slight_smile:

Output.Xlsx (8.2 KB)
Input.xlsx (262.2 KB)

Hi @Dominic
Thanks for your response again, i have attached the input and output sheet. Condition can be anything like if difference for Tax1 in Sheet 1 and Sheet2 is > 500 then pull records etc.

Could you please review and let me know if you need anything else from side.

@gauravb20ui, Sorry for the delayed response and this scary post ! :hushed:

Different ways

NOTE: Code not tested :neutral_face:

1)Extract two sheets values separately and Merge them

(Quite Time Consuming process) - Like for each (two)

dtFilteredTable1 = (From rowSheet1 In dtInputExcelSheet1.Select()
                    Where    (From rowSheet2 In dtInputExcelSheet2.Select()
                              Where CInt(rowSheet1("PersonId").ToString).Equals(CInt(rowSheet2("PersonId").ToString)) And
                             (Math.Abs(CInt(rowSheet1("Tax1").ToString)-CInt(rowSheet2("Tax1").ToString))>500 Or
                              Math.Abs(CInt(rowSheet1("Tax2").ToString)-CInt(rowSheet2("Tax2").ToString))>500 Or
                              Math.Abs(CInt(rowSheet1("Tax3").ToString)-CInt(rowSheet2("Tax3").ToString))>500)
                              Select rowSheet2).Count>0
                    Select rowSheet1).CopyToDatatable()

dtFilteredTable2 = (From rowSheet2 In dtInputExcelSheet2.Select()
                    Where    (From rowSheet1 In dtInputExcelSheet1.Select()
                              Where CInt(rowSheet1("PersonId").ToString).Equals(CInt(rowSheet2("PersonId").ToString)) And
                             (Math.Abs(CInt(rowSheet1("Tax1").ToString)-CInt(rowSheet2("Tax1").ToString))>500 Or
                              Math.Abs(CInt(rowSheet1("Tax2").ToString)-CInt(rowSheet2("Tax2").ToString))>500 Or
                              Math.Abs(CInt(rowSheet1("Tax3").ToString)-CInt(rowSheet2("Tax3").ToString))>500)
                              Select rowSheet1).Count>0
                    Select rowSheet2).CopyToDatatable()

Merge the both to get desire output

2)Union method of IEnum

(Only in a single query)[Same like above method but only one assign]

dtFilteredTable1 = (From rowSheet1 In dtInputExcelSheet1.Select()
                    Where    (From rowSheet2 In dtInputExcelSheet2.Select()
                              Where CInt(rowSheet1("PersonId").ToString).Equals(CInt(rowSheet2("PersonId").ToString)) And
                             (Math.Abs(CInt(rowSheet1("Tax1").ToString)-CInt(rowSheet2("Tax1").ToString))>500 Or
                              Math.Abs(CInt(rowSheet1("Tax2").ToString)-CInt(rowSheet2("Tax2").ToString))>500 Or
                              Math.Abs(CInt(rowSheet1("Tax3").ToString)-CInt(rowSheet2("Tax3").ToString))>500)
                              Select rowSheet2).Count>0
                    Select rowSheet1).UNION(
                    (From rowSheet2 In dtInputExcelSheet2.Select()
                     Where    (From rowSheet1 In dtInputExcelSheet1.Select()
                               Where CInt(rowSheet1("PersonId").ToString).Equals(CInt(rowSheet2("PersonId").ToString)) And
                              (Math.Abs(CInt(rowSheet1("Tax1").ToString)-CInt(rowSheet2("Tax1").ToString))>500 Or
                               Math.Abs(CInt(rowSheet1("Tax2").ToString)-CInt(rowSheet2("Tax2").ToString))>500 Or
                               Math.Abs(CInt(rowSheet1("Tax3").ToString)-CInt(rowSheet2("Tax3").ToString))>500)
                               Select rowSheet1).Count>0
                    Select rowSheet2)).CopyToDatatable()

3)Use Join clause and Add to datatable

(Comparatively takes less time as we have only one main query and chance of neglecting non-matching PersonId column values) [But needs a careful supervision ]

String strvalues = String.Join(vblf,
                    (From rowSheet1 In dtInputExcelSheet1.Select()
                     Join rowSheet2 In dtInputExcelSheet2.Select() On  rowSheet1("PersonId").ToString Equals 
                     (rowSheet2("PersonId").ToString)
                     Where          (Math.Abs(CInt(rowSheet1("Tax1").ToString)-CInt(rowSheet2("Tax1").ToString))>500 Or
                                     Math.Abs(CInt(rowSheet1("Tax2").ToString)-CInt(rowSheet2("Tax2").ToString))>500 Or
                                     Math.Abs(CInt(rowSheet1("Tax3").ToString)-CInt(rowSheet2("Tax3").ToString))>500)
                     Select String.Join(",",rowSheet1.ItemArray.Select(Function(x) 
                     x.ToString.Trim))+vblf+String.Join(",",rowSheet2.ItemArray.Select(Function(x) 
                     x.ToString.Trim))))

Above returns only values to add the header and append to the output of the string use the below query,

String strColumnHeaders = String.Join(",",(From col In dtInputExcelSheet1.Columns.Cast(Of System.Data.DataColumn)
                          Select CStr(col.ColumnName).Trim))

Manipulate the string - as the columnValues are separated by “,” and rows separated by vblf[Environment.NewLine], build a datatable and add the values by using add data row

4)Traditional For Each row with IF Condition

For each row and place a IF condition and add it to new datatable

Lets too wait for others response… Expecting some other shortcuts as well as I am in pace of exploring LINQ.

1 Like