Having the 50,000 rows in excel file and wants to check Duplicate values in 2 columns and color it

image

Having the 50,000 rows in excel file and wants to check Duplicate values in 2 columns using Linq, if having the duplicate then Color the Cell value of Subject in same excel File.
In the Above Screenshot Name and Subject are Duplicate with same Values then color the Subject Value.

Hi @Shubham_Arora1

I would rather suggest you to use VBA macros since you have cell color to highlight id Duplicate found.
Consider the below sample Input taken:


Used VBA Code:

Sub HighlightSubjectDuplicates()
    Dim ws As Worksheet
    Dim lastRow As Long
    Dim nameRange As Range, subjectRange As Range
    Dim cell As Range
    Dim dict As Object

    ' Set the worksheet reference (replace "Sheet1" with your actual sheet name)
    Set ws = ThisWorkbook.Sheets("Sheet1")

    ' Assuming data starts from row 2, adjust if needed
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    ' Set the range for the "Name" and "Subject" columns
    Set nameRange = ws.Range("A2:A" & lastRow)
    Set subjectRange = ws.Range("C2:C" & lastRow)

    ' Create a dictionary to store unique names
    Set dict = CreateObject("Scripting.Dictionary")

    ' Loop through the "Name" column to find duplicates
    For Each cell In nameRange
        If dict.Exists(cell.Value) Then
            ' Duplicate name found, highlight all corresponding "Subject" cells in yellow
            For i = 1 To lastRow - 1
                If nameRange.Cells(i).Value = cell.Value Then
                    subjectRange.Cells(i).Interior.Color = RGB(255, 255, 0) ' Yellow color
                End If
            Next i
        Else
            ' Add the name to the dictionary
            dict.Add cell.Value, 1
        End If
    Next cell
End Sub

Paste the code in a text file and pass the file path in Code file Path and HighlightSubjectDuplicates is the Entry Method Name.
Output:


Note: Code Text file Path is in the project folder shared below
Workflow:

zip file:
BlankProcess23.zip (47.7 KB)

Note: Make sure to enable Macros.

I’m happy to help if you face any difficulties.
Regards

in general we can do

  • groupby and calculate the row indexes of groups with a member count > 1
  • loop over the returned indexes and setting the cell color

From the sample data above we cannot derive surely the criteria of duplicates. Currently it looks like Subject or Name & Subject. So just redefine the requirement for this.

Hi @vrdabberu as Subject should also be same with the Name in excel File.

@Shubham_Arora1

Try the below code with your input excel file and mention if it’s working. I hope it will work if the Name and subject are same it will highlight the subject value.

Regards

sounds like a groupby on Name, Subject

arrIndexes | int32 Array =

(From i In Enumerable.Range(0, dtData.Rows.Count)
Let d = dtData.Rows(i)
Let t = Tuple.Create(i, d(0).toString.ToUpper.Trim, d(2).toString.ToUpper.Trim)
Group t By k1=t.Item2, k2=t.Item3 Into grp=Group
Where grp.Count > 1
From g In grp
Select v=g.Item1).toArray

then loop over arrIndexes
Calculate the Cell Range (index + Offset) - Offset =2, 1 for the 0 based index adaption + 1 for the first Header Line
Set the cell color by using the calculated range

@ppr it’s not working fine, Requirement Name as well Subject should be same then only mark as color.
image

UPDATE: we did a small rework on the LINQ


this we had defined a the groupby criteria

what was modelled in detail at your end?

(From i in (Enumerable.Range(0, dtData.Rows.Count))
Let d = dtData.Rows(i)
Let t = Tuple.Create(i, d(“Name”).toString.ToUpper.Trim, d(“Subject”).toString.ToUpper.Trim)
Group t by k1=t.Item2, k2=t.Item3 into grp=Group
Where grp.Count > 1
From g in grp
Select v=g.Item1).toArray

So regarding the above query when I iterating arrIndexes in For each loop giving all log results starting from 0 to end of row - @ppr

have a look here where we checked the revised LINQ
grafik
A is DUP, so returned index i 0,3
B is not DUP as only first Col is same but not third col

So in this direction we had undestood your case