Compare one whole column with another column and count the occurences

Hi team,

I have some 10 excel files

Problem Statement: I need to take the 1st excel 1st column say (“Column1”) and check the columns in the next 9 excel files and i have to check the occurrences in all the columns and i have to find which columns in those 9 excel files has more matched values with (Column 1) of 1st column of 1st excel file.

Guys could you please advise me the best approach!!!to solve this other than using for each,for each row…

Regards,
Sriram

Use for each row for 1st excel
assign variable for count say count=0
in that use for each row for second excel
for 1st iteration say row(0).tostring=number1
if
Row1(0).tostring=row(0).tostring
then assign
a=a+1
at the end of iteration i.e. out of second iteration
write cell just in front of that value of a

Hello @ImPratham45

I think you skipped this before the reply…

And also, In every excel contains lots of data more than 20 columns and 200 rows…This approach will not work efficiently…

Vlookup will be effective for this use case???

regards,
Sriram

Hi @Sriram07,

Here I have used a for loop just to loop all the excel files, rest of the things done using linq. Code not tested since I dont have such excel files now. check this and let me know if you need any info.

CheckListContainsList.xaml (10.9 KB)

Thanks @sarathi125 !!! Sure, I will give a try and reach you out for any queries…

Regards,
Sriram

@sarathi125 your linq expression is very good. Before that make me clear that Activity “Read column” will read all the columns of the given excel file and iterate individually all the columns with the other excel files???orelse

We need to use for each and input master column and i have to iterate all the columns one by one???

Because i want to check each column one by one with the rest…

Could you advise me this activity??please…

Regards,
Sriram

@Sriram07,

It will get the first column in each excel.

Do you need to compare all the columns from one excel to all other excels?
Like excel1 - Column1 → Compare with excel2- Column1, Column2, Column3… Column N?

1 Like

Yeah brother exactly…

@sarathi125 Bro and also another issue ,

i am reading tab delimited file and using generate datatable activity with result of datatype datatable.
But read column need excel application scope right,But i have only datatable how can i proceed?

How can i do this brother???

CheckListContainsList.xaml (14.5 KB)

Check this one, it has the full columns to check with other columns.

1 Like

@sarathi125 How can i solve this???

Excel application scope not required, you can use Read Range from System - File - WorkBook - Read Range. So you can use the datatable with my second xaml which I shared.

1 Like

@sarathi125
In this linq statement…
New KeyValuePair(Of String, Integer)(excelFile.Name +“-”+ item.ColumnName, dtChild.AsEnumerable().[Select](Function(r) r.Field(Of Object)(item.ColumnName)).ToArray().Intersect(masterCol).Count)

mastercol is a ienumerable , which is declared but it doesn’t contain any values to intersect with other columns??

What will be stored in the mastercol variable???

If i run, it prompts the error…

Assign: Value cannot be null.Parameter name: second

Regards,
Sriram

CheckListContainsList.xaml (15.4 KB)

Sorry, in the second requirement missed that. Now included it check and let me know if any issues

1 Like

Hello @sarathi125 Sorry for the late reply i need your help must!!!

MasterCol= dtMaster.AsEnumerable().[Select](Function(r) r.Field(Of Object)(column.ColumnName)).ToArray()

MatchingValue= New KeyValuePair(Of String, Integer)(excelFile.Name +“-”+ item.ColumnName, dtChild.AsEnumerable().[Select](Function(r) r.Field(Of Object)(item.ColumnName)).ToArray().Intersect(masterCol).Count)

In this both expression you are comparing only the column names right??? If my understanding is correct?

So actually i dont want to compare the column names i need to compare whole column values of one excel to other and if its matched then only i nedd column name of the matched column…

Sorry if i didn’t explained properly…Could you advise me please…this eating my head…

Regardsw,
Sriram

Not comparing column names alone, using column names to check the values in that columns.

Are you getting the expected output or not.

No @sarathi125

Error message is The key is already present it cannot be added…

Because columnName may be Same for both columns in both excels…

I that key is item.columnName right , Column name might be same in both excels so it was trying to add again column name as key so that this error is popping out i guess !!!

Since we are looping with master datatable and for each excel sheet, that does not matter if the columns repeated for other excels.

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.