Compare Excel based on Primary Column Value


#1

Hello All
Good Day!

I am new to UI Path need help on one requirement. I have two Excel Sheets. Both has 4 Columns ( Person ID, Tax1, Tax2 and Tax 3). Lets assume that Peron ID 1 exist in both sheets. Now i want to compare based on below condition -

If Sheet1.Person_id=Sheet.Person_id Then
(
if sheet1.tax1 <> sheet2.tax1 or
sheet1.tax2 <> sheet2.tax2 or
sheet1.tax2 <> sheet2.tax2 or Then

(
Write to another Excel that tax are not matching person id 1.
)
)

Person Id 1 can be 2nd row in Sheet 1 and 20th row in Sheet 2. I have reviewed some of the Excel Compare posts but could not find the exact solution.

Thanks everyone for anticipated help!

Thanks
Gaurav


#2

Hello UI Path Gurus

Could you please help me in my requirement?

Thanks
Gaurav


#3

@aksh1yadav…Hi Akshay…could you please review my requirement, i saw many of your post for similar Topics. Thanks in advance for your anticipated help!

Thanks again
Gaurav


#4

Hi @gauravb20ui
1.use multiple read range
2.use for each based on datatable
3.use if row ("person_id).equals row1 (“person_id”) then
If not row (“tax_id”).equals row1 (“tax_id”) Use conditions like that and implement it

Thanks
Ashwin.S


#5

Hi @gauravb20ui,

Requirement :

image

Solution : Using LINQ

image

PFA project file for your requirement,
GetRowsBy_MatchOneColumn_NoMatchThreeColumn.zip (10.3 KB)


#6

Thank You so much @Dominic…it is working fine. I have below two points where i need your help-

  1. If i have to pull Row from both Sheets if there is mismatch so that user can see what exactly not matching b/w Sheet1 and Sheet2.

  2. how i can find the person ids which are either missing in Sheet1 or Sheet 2 and report it into as separate sheet in output file?

Thanks again
Gaurav


#7

Hi @Dominic

Could you please let me know how i can select rows from both sheet 1 and Sheet 2 if condition matches. I have below select statement which is only pulling data from sheet1.

(From rowSheet1 In dtInputExcelSheet1.Select()
Where (From rowSheet2 In dtInputExcelSheet2.Select()
Where Convert.ToInt32(rowSheet1(“PersonId”).ToString).Equals (Convert.ToInt32(rowSheet2(“PersonId”).ToString)) And
( Convert.ToInt32(rowSheet1(“Tax1”).ToString)-(Convert.ToInt32(rowSheet2(“Tax1”).ToString))>500 Or
Convert.ToInt32(rowSheet1(“Tax2”).ToString)-(Convert.ToInt32(rowSheet2(“Tax2”).ToString))>500 Or
Convert.ToInt32(rowSheet1(“Tax3”).ToString) - (Convert.ToInt32(rowSheet2(“Tax3”).ToString) )>500)
Select rowSheet2).Count>0
Select rowSheet1).CopyToDatatable()

Many thanks for all your help!
Gaurav


#8

Hi @gauravb20ui, Can you give us the condition ? Sample excel with input and executed output will be more helpful :slight_smile:


#9

Output.Xlsx (8.2 KB)
Input.xlsx (262.2 KB)

Hi @Dominic
Thanks for your response again, i have attached the input and output sheet. Condition can be anything like if difference for Tax1 in Sheet 1 and Sheet2 is > 500 then pull records etc.

Could you please review and let me know if you need anything else from side.


#10

@gauravb20ui, Sorry for the delayed response and this scary post ! :hushed:

Different ways

NOTE: Code not tested :neutral_face:

1)Extract two sheets values separately and Merge them

(Quite Time Consuming process) - Like for each (two)

dtFilteredTable1 = (From rowSheet1 In dtInputExcelSheet1.Select()
                    Where    (From rowSheet2 In dtInputExcelSheet2.Select()
                              Where CInt(rowSheet1("PersonId").ToString).Equals(CInt(rowSheet2("PersonId").ToString)) And
                             (Math.Abs(CInt(rowSheet1("Tax1").ToString)-CInt(rowSheet2("Tax1").ToString))>500 Or
                              Math.Abs(CInt(rowSheet1("Tax2").ToString)-CInt(rowSheet2("Tax2").ToString))>500 Or
                              Math.Abs(CInt(rowSheet1("Tax3").ToString)-CInt(rowSheet2("Tax3").ToString))>500)
                              Select rowSheet2).Count>0
                    Select rowSheet1).CopyToDatatable()

dtFilteredTable2 = (From rowSheet2 In dtInputExcelSheet2.Select()
                    Where    (From rowSheet1 In dtInputExcelSheet1.Select()
                              Where CInt(rowSheet1("PersonId").ToString).Equals(CInt(rowSheet2("PersonId").ToString)) And
                             (Math.Abs(CInt(rowSheet1("Tax1").ToString)-CInt(rowSheet2("Tax1").ToString))>500 Or
                              Math.Abs(CInt(rowSheet1("Tax2").ToString)-CInt(rowSheet2("Tax2").ToString))>500 Or
                              Math.Abs(CInt(rowSheet1("Tax3").ToString)-CInt(rowSheet2("Tax3").ToString))>500)
                              Select rowSheet1).Count>0
                    Select rowSheet2).CopyToDatatable()

Merge the both to get desire output

2)Union method of IEnum

(Only in a single query)[Same like above method but only one assign]

dtFilteredTable1 = (From rowSheet1 In dtInputExcelSheet1.Select()
                    Where    (From rowSheet2 In dtInputExcelSheet2.Select()
                              Where CInt(rowSheet1("PersonId").ToString).Equals(CInt(rowSheet2("PersonId").ToString)) And
                             (Math.Abs(CInt(rowSheet1("Tax1").ToString)-CInt(rowSheet2("Tax1").ToString))>500 Or
                              Math.Abs(CInt(rowSheet1("Tax2").ToString)-CInt(rowSheet2("Tax2").ToString))>500 Or
                              Math.Abs(CInt(rowSheet1("Tax3").ToString)-CInt(rowSheet2("Tax3").ToString))>500)
                              Select rowSheet2).Count>0
                    Select rowSheet1).UNION(
                    (From rowSheet2 In dtInputExcelSheet2.Select()
                     Where    (From rowSheet1 In dtInputExcelSheet1.Select()
                               Where CInt(rowSheet1("PersonId").ToString).Equals(CInt(rowSheet2("PersonId").ToString)) And
                              (Math.Abs(CInt(rowSheet1("Tax1").ToString)-CInt(rowSheet2("Tax1").ToString))>500 Or
                               Math.Abs(CInt(rowSheet1("Tax2").ToString)-CInt(rowSheet2("Tax2").ToString))>500 Or
                               Math.Abs(CInt(rowSheet1("Tax3").ToString)-CInt(rowSheet2("Tax3").ToString))>500)
                               Select rowSheet1).Count>0
                    Select rowSheet2)).CopyToDatatable()

3)Use Join clause and Add to datatable

(Comparatively takes less time as we have only one main query and chance of neglecting non-matching PersonId column values) [But needs a careful supervision ]

String strvalues = String.Join(vblf,
                    (From rowSheet1 In dtInputExcelSheet1.Select()
                     Join rowSheet2 In dtInputExcelSheet2.Select() On  rowSheet1("PersonId").ToString Equals 
                     (rowSheet2("PersonId").ToString)
                     Where          (Math.Abs(CInt(rowSheet1("Tax1").ToString)-CInt(rowSheet2("Tax1").ToString))>500 Or
                                     Math.Abs(CInt(rowSheet1("Tax2").ToString)-CInt(rowSheet2("Tax2").ToString))>500 Or
                                     Math.Abs(CInt(rowSheet1("Tax3").ToString)-CInt(rowSheet2("Tax3").ToString))>500)
                     Select String.Join(",",rowSheet1.ItemArray.Select(Function(x) 
                     x.ToString.Trim))+vblf+String.Join(",",rowSheet2.ItemArray.Select(Function(x) 
                     x.ToString.Trim))))

Above returns only values to add the header and append to the output of the string use the below query,

String strColumnHeaders = String.Join(",",(From col In dtInputExcelSheet1.Columns.Cast(Of System.Data.DataColumn)
                          Select CStr(col.ColumnName).Trim))

Manipulate the string - as the columnValues are separated by “,” and rows separated by vblf[Environment.NewLine], build a datatable and add the values by using add data row

4)Traditional For Each row with IF Condition

For each row and place a IF condition and add it to new datatable

Lets too wait for others response… Expecting some other shortcuts as well as I am in pace of exploring LINQ.