How to check if there are duplicates in a list of strings?

Hello,

I would like to know how to check whether there are duplicates in a one-column list of strings (I don't want to count blank positions).

For instance, this list:
"Column1
43

23

12"
There are no duplicates here (despite the blank positions).

Thanks a lot!

You can split your string using:
yourStringVariable.Split(Environment.NewLine.ToArray, StringSplitOptions.RemoveEmptyEntries)
This removes the empty entries and gives you an array of strings:
["Column1","43","23","12"]

Then you can do:
yourArray.GroupBy(Function(w) w).OrderByDescending(Function(g) g.Count()).ToList()
This gives you a list of groups, one per distinct entry, ordered by group size in descending order.
In your case the groups are, in order:
["Column1","43","23","12"]

If you assign the result to a variable named list, then list(0).Count() > 1 means you have a duplicate, and list(0).Key gives you that item's value.
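
To make the two steps above concrete, here is a minimal sketch as a standalone VB.NET console program. The module and function names (SplitAndGroupSketch, GroupEntries) are made up for illustration, and splitting on the CR and LF characters is an assumption that matches what Environment.NewLine.ToArray produces on Windows.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports Microsoft.VisualBasic

Module SplitAndGroupSketch
    ' Hypothetical helper: splits the text into lines, drops the blank ones,
    ' and groups equal entries with the largest group first.
    Function GroupEntries(rawText As String) As List(Of IGrouping(Of String, String))
        Dim separators As Char() = {ControlChars.Cr, ControlChars.Lf}
        Dim entries As String() = rawText.Split(separators, StringSplitOptions.RemoveEmptyEntries)
        Return entries.GroupBy(Function(w) w).OrderByDescending(Function(g) g.Count()).ToList()
    End Function

    Sub Main()
        ' The sample list from the question, including its blank positions.
        Dim lines As String() = {"Column1", "43", "", "23", "", "12"}
        Dim sample As String = String.Join(vbLf, lines)
        Dim groups = GroupEntries(sample)
        ' The first group is the largest, so it alone tells us whether anything repeats.
        Console.WriteLine(groups(0).Count() > 1)   ' False: every entry occurs once
    End Sub
End Module
```

Because the groups are sorted by size, checking only the first one is enough to answer the yes/no question.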

Hi @pal1910

You can check whether the total number of records in the list equals the number of distinct records. The total is given by the Count property, and the distinct records by the Distinct method of the List object.

list.Count = list.Distinct().Count()
If this is True, there are no duplicates. Otherwise, the list contains duplicate records.
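
As a rough illustration of that comparison, the sketch below wraps it in a function and also filters out blank entries first, since the original question asked to ignore them. The names (DistinctCountSketch, HasDuplicates) are hypothetical, and the blank filter is an addition that is not part of the expression above.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DistinctCountSketch
    ' Hypothetical helper: True when any non-blank value occurs more than once.
    Function HasDuplicates(items As IEnumerable(Of String)) As Boolean
        ' Assumption: blank or whitespace-only entries should not count as duplicates.
        Dim nonBlank As List(Of String) = items.Where(Function(s) Not String.IsNullOrWhiteSpace(s)).ToList()
        ' Distinct drops repeats, so a smaller distinct count means something repeated.
        Return nonBlank.Count <> nonBlank.Distinct().Count()
    End Function

    Sub Main()
        Dim noRepeats As String() = {"Column1", "43", "", "23", "", "12"}
        Dim withRepeat As String() = {"43", "", "23", "43"}
        Console.WriteLine(HasDuplicates(noRepeats))    ' False
        Console.WriteLine(HasDuplicates(withRepeat))   ' True
    End Sub
End Module
```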

Just a quick question…

How do I print out the value that is a duplicate using exactly your method?

Here is a small example:
2

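You can use LINQ directly to get the duplicated records:
duplicateList = list.GroupBy(Function(x) x).Where(Function(x) x.Count > 1).Select(Function(x) x.Key).ToList()

or
Get the distinct list: distinctList = inputList.Distinct().ToList()
Then retrieve the duplicate items from inputList, excluding the distinct ones:
duplicateList = inputList.Except(distinctList).ToList()
Note that Except is a set difference, so this second variant returns an empty list whenever every item of inputList also appears in distinctList, which is always the case here. The GroupBy query is the one to rely on.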
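
For a small array with more than one repeated value, the GroupBy query could be tried out as in the sketch below. The module and function names are invented for the example. The last line also shows what the Except variant yields on the same input.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ListDuplicatesSketch
    ' Hypothetical helper: returns each value that occurs more than once, listed once.
    Function FindDuplicates(items As IEnumerable(Of String)) As List(Of String)
        Return items.GroupBy(Function(x) x).
                     Where(Function(g) g.Count() > 1).
                     Select(Function(g) g.Key).
                     ToList()
    End Function

    Sub Main()
        Dim values As String() = {"1", "2", "3", "1", "2"}
        ' Groups come back in order of first appearance, so "1" precedes "2".
        Console.WriteLine(String.Join(",", FindDuplicates(values)))   ' 1,2
        ' Except is a set difference: nothing in values is missing from its distinct set.
        Console.WriteLine(values.Except(values.Distinct()).Count())   ' 0
    End Sub
End Module
```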

Can you make a small example with an array strings e.g. “1,2,3,1”?

String Array or String List?

Array.

This outputs the duplicate value, BUT if there is more than one duplicate value it only outputs one of them. So the problem (right now) is that I don't know what syntax to use to output all duplicates in the array.

Because Distinct is used, it's returning only one value. Use the GroupBy code from my earlier reply. That returns every value that has duplicates.

I created a new variable of type List and assigned it to: list.GroupBy(Function(x) x).Where(Function(x) x.Count > 1).Select(Function(x) x.Key).ToList()

Then I looped through my newly created list and boooom… I got all duplicates, thank you :)

Hi @Madhavi
Your suggestion seems to match what I need for checking whether a new value is duplicated in a column…

What I did:
Read Range > For Each Row in dt
Assign subDom = row(1).ToString (variable type: String)
Assign ListCol = New List(Of String) (variable type: List(Of String))
Add To Collection: Collection = ListCol, Item = subDom, TypeArgument = String
For Each item in ListCol
If ListCol.Count = ListCol.Distinct().Count()
Then Message Box "not duplicated", Else "duplicated"
Write Line: item.ToString (this activity is just for testing whether I got the right content for ListCol)
Output: 1 2 2 4 5, with no errors and no warnings.
But the Message Box pops up 5 times with "not duplicated", which is not correct.
Can you help me find what I'm missing?

If your intention is to check whether a column in a DataTable has duplicate values, you can use a LINQ query:
dtDataTable.AsEnumerable().Select(Function(x) x.Field(Of String)("Column_Name")).GroupBy(Function(x) x).Where(Function(g) g.Count() > 1).ToList().Count
This returns the number of values that occur more than once.

But if you need to loop through the rows to do additional processing, you can use the Filter Data Table activity, passing the column value for the current row, and check how many rows it returns.
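
One possible reason for the five "not duplicated" pop-ups in the workflow described earlier, assuming the Assign ListCol = New List(Of String) step sits inside the For Each Row loop: the list is re-created for every row, so it only ever holds one item when the comparison runs. A minimal sketch of the alternative, building the list once and comparing once after the loop, could look like this (ColumnListSketch and ColumnHasDuplicates are hypothetical names, and the sample table is made up):

```vb
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module ColumnListSketch
    ' Hypothetical helper: collects one column into a list, then checks it once.
    Function ColumnHasDuplicates(dt As DataTable, columnIndex As Integer) As Boolean
        ' Created once, before the loop, so it accumulates every row's value.
        Dim listCol As New List(Of String)()
        For Each row As DataRow In dt.Rows
            listCol.Add(row(columnIndex).ToString())
        Next
        ' Compared once, after all rows have been added.
        Return listCol.Count <> listCol.Distinct().Count()
    End Function

    Sub Main()
        ' Made-up table whose second column holds the values 1 2 2 4 5.
        Dim dt As New DataTable()
        dt.Columns.Add("id", GetType(String))
        dt.Columns.Add("subDom", GetType(String))
        Dim values As String() = {"1", "2", "2", "4", "5"}
        For Each v As String In values
            dt.Rows.Add("row", v)
        Next
        Console.WriteLine(ColumnHasDuplicates(dt, 1))   ' True
    End Sub
End Module
```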

Hi @Madhavi
What I am trying to do: if the username entered in a new row duplicates one in another row of the "username" column, then send an email asking the user to change it; else continue to another task.

I checked the Filter Data Table activity, but it seems to output only a filtered DataTable variable, not a Boolean as I need.

So I tried your first suggestion, though I'm not sure what "x" and "g" are for:
dt.AsEnumerable().Select(Function(x) x.Field(Of String)("username")).GroupBy(Function(x) x).Where(Function(g) g.Count() > 1).ToList().Count
But I got the error: Option Strict On not allow from 'Integer' to 'Boolean'…

Thanks for your help.

Hi @Madhavi
Finally I tried it out…
Assign Duplicate = dt.AsEnumerable().Select(Function(x) x.Field(Of String)("username")).GroupBy(Function(x) x).Where(Function(g) g.Count() > 1).ToList().Count
If Duplicate > 0 Then "duplicated" Else "not duplicated"
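
For anyone wondering about the same two points: x and g are just lambda parameter names (x stands for each row or value, g for each group of equal values), and the query ends in .Count, which is an Integer, so it cannot be used directly as an If condition under Option Strict On. Besides comparing the count with 0, a variant that returns a Boolean directly could use Any, as in this sketch. The module and function names are hypothetical, and on .NET Framework it assumes a reference to System.Data.DataSetExtensions for AsEnumerable and Field.

```vb
Imports System
Imports System.Data
Imports System.Linq

Module UsernameCheckSketch
    ' Hypothetical helper: True when any username occurs in more than one row.
    Function HasDuplicateUsernames(dt As DataTable) As Boolean
        Return dt.AsEnumerable().
                  Select(Function(row) row.Field(Of String)("username")).
                  GroupBy(Function(name) name).
                  Any(Function(grp) grp.Count() > 1)
    End Function

    Sub Main()
        ' Made-up sample data with one repeated username.
        Dim dt As New DataTable()
        dt.Columns.Add("username", GetType(String))
        dt.Rows.Add("anna")
        dt.Rows.Add("ben")
        dt.Rows.Add("anna")
        Console.WriteLine(HasDuplicateUsernames(dt))   ' True
    End Sub
End Module
```

Any stops at the first group with more than one member, so the result can go straight into an If activity's condition.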

Thanks for your help.
