How to get duplicate rows from column A and associated with those rows get not-duplicate values from column B

I have an excel sheet which has multiple columns, but in the column A there are repeated (duplicate) names, respective to those names we may get different numbers or same numbers
in the column B.
here is the data for your reference:

     Column A                 Column B
       xxx                       1234
       xxx                       1234
       xxx                       3456
       xxx                       5678
       xyz                       2456
       yxz                       7889

So i want to get duplicate names which are not having duplicate numbers. i want to get only duplicate data from column A but not duplicate data from column B associated with column A. if we don’t have not-duplicate data in column B,
the BOT should not extract the data.
Can someone help me how to get this???

Hi @geetanjali.nerlekar

Can you share the output for you requirement. It will be helpful solve the issue

To get the duplicate values from the Column A

(From p in DT.Select() where( From q in DT.Select() where q("Column A").Equals(p("Column A")) Select q).ToArray.Count>1 Select p).ToArray.CopyToDataTable()


Output should be like:
Column A Column B
xxx 3456
xxx 5678
IF column A has duplicate rows and column B also has duplicate rows associated to column A, we should not extract them.
We should only extract When we have duplicate rows in column A and not-dupliacte rows in column B.

HI @geetanjali.nerlekar

Try this expression

(From p in DT.Select() where( From q in DT.Select() where q("Column B").Equals(p("Column B")) Select q).ToArray.Count>1 Select p).ToArray.CopyToDataTable()

Have look on the thread


How about the following expression?

dt = dt.AsEnumerable.GroupBy(Function(r) r("ColumnA").ToString).Where(Function(g) g.Count>1).SelectMany(Function(g) g.GroupBy(Function(r2) r2("ColumnB").ToString).Where(Function(g2) g2.Count=1).SelectMany(Function(g2) g2)).CopyToDataTable()


Hi @Gokul001
I don’t want to remove duplicate rows from column.
I need an expression or Linq query to extract Only if i have same values in column A but different values in column B.
if i have same values in column A and different values in column B then it should not extract or remove.

Names (Column A) values or the datarows?

Can be assumed as

xxx | 3456
xxx | 5678
xyz | 2456
yxz | 7889

xxx | 3456
xxx | 5678
xyz | 2456
yxz | 7889

Variable Result | List(Of DataRow)
Variable: dtResult | DataTable

Assign Activity
LHS: Result

(From d in dtData.AsEnumerable
Group d by k=d(0).toString.Trim into grp=Group
Where grp.Count > 1
From g in Group
Group g by k2=g2(1),toString.Trim into grp2=Group
Where grp2.Count = 1
Select r=grp2.First()).toList

check within an if acitvity
Result.Count > 0
Then: dtResult = Result.CopyToDataTable
Else: dtResult = dtData.Clone

as we can expect also empty results and CopyToDataTable would fail in such case we handle it defensive dependend if we do have a result or not

1 Like

Hi @Yoichi
That expression worked… thankyou so much!

1 Like

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.