Remove duplicates with a condition

Hi All,

I want to remove the duplicate rows in column (ABI #) that have the value “N” in column (Pending).
DTEST.xlsx (8.4 KB)

image

Fine.
Once you have the datatable named outdt, use this expression in an Assign activity:
outdt = outdt.AsEnumerable().GroupBy(Function(a) a.Field(Of String)("yourcolumnname").ToUpper).Select(Function(b) b.First()).CopyToDataTable()

This keeps only the first row for each distinct value in that column.
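
To make the behaviour of that GroupBy/First expression easier to see, here is a minimal sketch of the same logic in plain Python. It is only a model, not UiPath code: the helper name `first_row_per_key` and the sample rows are assumptions made up for illustration, and rows are represented as dictionaries instead of DataRow objects.

```python
def first_row_per_key(rows, column):
    """Keep the first row seen for each upper-cased value in `column`."""
    seen = {}
    for row in rows:
        key = row[column].upper()  # mirrors .ToUpper in the expression
        if key not in seen:        # mirrors b.First() for each group
            seen[key] = row
    return list(seen.values())


# Hypothetical sample data, not taken from the attached workbook.
rows = [
    {"ABI #": "AA", "Pending": "N"},
    {"ABI #": "bb", "Pending": "N"},
    {"ABI #": "BB", "Pending": "Y"},
]

distinct_rows = first_row_per_key(rows, "ABI #")
for row in distinct_rows:
    print(row)
```

Note that the first row of each group wins regardless of its Pending value, so in this sample the “N” row for BB is the one that is kept.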
Cheers @arivazhagan_mathivan

Hi @Palaniyappan,

Thanks for the reply.
I'm getting an error with this expression, and I also want to check whether the value in the Pending column is “Y” or not.

Fine
For the AsEnumerable issue:

Cheers @arivazhagan_mathivan

Hi @Palaniyappan,

The AsEnumerable issue is resolved, thanks.

But I'm not getting the expected output.
image
I need only the rows whose value is “Y” in the Pending column. Please help with this.

@arivazhagan_mathivan

Then why do you need to check for duplicates?
Simply filter on the condition that Pending is “Y”.

For that, use the code below in an Assign activity:

Dt.AsEnumerable().Where(Function(x) x("Pending").ToString.ToUpper.Equals("Y")).CopyToDataTable
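
As a rough illustration of what that filter does, the same condition can be modelled in plain Python. This is a sketch only: the function name `keep_pending_y` and the sample rows are hypothetical, and the rows are dictionaries standing in for DataRow objects.

```python
def keep_pending_y(rows):
    """Keep only rows whose Pending value equals "Y", ignoring case."""
    return [row for row in rows if str(row["Pending"]).upper() == "Y"]


# Hypothetical sample data, not taken from the attached workbook.
rows = [
    {"ABI #": "AA", "Pending": "N"},
    {"ABI #": "BB", "Pending": "N"},
    {"ABI #": "BB", "Pending": "y"},
    {"ABI #": "DD", "Pending": "Y"},
]

only_y = keep_pending_y(rows)
for row in only_y:
    print(row)
```

With this sample the unique row AA is dropped as well, because the filter looks only at Pending and never at ABI #.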

Hi @amaresan,

image

In the screenshot, the rows highlighted in pink are all duplicates.
From those pink rows I need only the ones whose Pending column value is “Y”, plus all the rows that are not highlighted (the unique rows).

@arivazhagan_mathivan

Did you try the code above that I shared?

Yes, I did. It is not working as expected.

Good morning,

You could do the following:

  1. Read Range
  2. Filter the collection on Pending = N and copy those rows to a new datatable
  3. Filter the original collection to only include Pending = Y
  4. Use Remove Duplicate Rows on the new datatable
  5. Use Merge Data Table to combine your filtered, unique (N) datatable with the Y-only datatable

@arivazhagan_mathivan
It should be possible to do this with LINQ and Except.

Let me know if you need help with the statement.

Hi @arivazhagan_mathivan
I reread all the posts and agree with @amaresan. Based on your second table, all N rows are to be removed, and a plain filter would be the simplest way to do that (e.g. Filter Data Table or LINQ).

Let me reformulate your requirements, as maybe you are looking for something different:

  • If the ABI # value of an N row also exists in a non-N row, then we have a duplicate
  • Mission: remove the duplicates (meaning the N rows)

image
N-AA does not exist among the non-N rows, so keep it. Keep Y-DD as well. Keep the non-N rows for BB and CC.
Remove N-BB and N-CC.

This is possible with the following steps:
Identify the duplicates (as defined above)
Remove them from the datatable

Duplicate identification:
Assign activity:
To: DuplicatedNs of DataType: IEnumerable(Of DataRow)
Value:
(From n In dtSample.AsEnumerable.Where(Function(r) r("Pending").ToString.Trim.Equals("N"))
Join y In dtSample.AsEnumerable.Where(Function(r) Not r("Pending").ToString.Trim.Equals("N"))
On n("ABI #").ToString.Trim Equals y("ABI #").ToString.Trim
Select n).AsEnumerable

Removal
Assign Activity
To: dtFiltered of Datatype: DataTable
Value:
dtSample.AsEnumerable.Except(DuplicatedNs).CopyToDataTable
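
The two steps above (find the N rows whose ABI # also appears in a non-N row, then remove them) can be sketched in plain Python to check the logic against the AA/BB/CC/DD example. This is a model under assumptions, not UiPath code: `remove_duplicated_n_rows` is a hypothetical helper, the sample rows are reconstructed from the description in this post, and rows are dictionaries instead of DataRow objects.

```python
def remove_duplicated_n_rows(rows):
    """Drop every N row whose ABI # also occurs in a non-N row."""
    # ABI # values of all rows that are not Pending = N (the "y" side of the join).
    non_n_keys = {
        row["ABI #"].strip()
        for row in rows
        if row["Pending"].strip() != "N"
    }
    # Keep a row unless it is an N row with a matching non-N row
    # (the equivalent of Except(DuplicatedNs)).
    return [
        row
        for row in rows
        if not (row["Pending"].strip() == "N" and row["ABI #"].strip() in non_n_keys)
    ]


# Assumed sample data modelled on the example described in the post.
rows = [
    {"ABI #": "AA", "Pending": "N"},
    {"ABI #": "BB", "Pending": "N"},
    {"ABI #": "BB", "Pending": "Y"},
    {"ABI #": "CC", "Pending": "N"},
    {"ABI #": "CC", "Pending": ""},
    {"ABI #": "DD", "Pending": "Y"},
]

filtered = remove_duplicated_n_rows(rows)
for row in filtered:
    print(row)
```

In this sketch N-AA survives because no non-N row shares its ABI #, while N-BB and N-CC are removed.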

Please note: CopyToDataTable throws an exception when there are no rows to keep.

Let us know if it is working