Excel Cell values not being checked correctly due to containing special characters?

I have an excel file with Multiple Columns. I am using all columns except the first two to find duplicate and distinct rows and then outputting a new excel file with two sheets one for the distinct single data, and one with duplicate data.

My file runs but not correctly. The outputted file has less rows then the original because certain excel cells get ignored. I think it’s due to them containing special characters but I am not sure.

In the replies I have posted an example with input and desired results vs actual results.

This is my Flow:

stringsplit:
row("1").ToString.Split("|"c)

NoDupKeys:
(From r In dtOrigin.AsEnumerable()
Select C1 = r(2).ToString.Trim, C2 = r(3).ToString.Trim, C3 = r(4).ToString.Trim, C4 = r(5).ToString.Trim, C5 = r(6).ToString.Trim, C6 = r(7).ToString.Trim, C7 = r(8).ToString.Trim, C8 = r(9).ToString.Trim, C9 = r(10).ToString.Trim, C10 = r(11).ToString.Trim
Group By C1, C2, C3, C4, C5, C6, C7, C8, C9, C10 Into Group
Select C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, Count = Group.Count
Where Count =1
Select New String() {C1, C2, C3, C4, C5, C6, C7, C8, C9, C10}).ToList

DupKeys:
(From r In dtOrigin.AsEnumerable()
Select C1 = r(2).ToString.Trim, C2 = r(3).ToString.Trim, C3 = r(4).ToString.Trim, C4 = r(5).ToString.Trim, C5 = r(6).ToString.Trim, C6 = r(7).ToString.Trim, C7 = r(8).ToString.Trim, C8 = r(9).ToString.Trim, C9 = r(10).ToString.Trim, C10 = r(11).ToString.Trim
Group By C1, C2, C3, C4, C5, C6, C7, C8, C9, C10 Into Group
Select C1, C2, C3, C4, C5, C6, C7, C8, C9, C10, Count = Group.Count
Where Count =1
Select New String() {C1, C2, C3, C4, C5, C6, C7, C8, C9, C10}).ToList

dtDupKeys:
(From k In DupKeys
Join r In dtOrigin
On k.ElementAt(0) Equals r(2).toString And k.ElementAt(1) Equals r(3).toString And k.ElementAt(2) Equals r(4).toString And k.ElementAt(3) Equals r(5).toString And k.ElementAt(4) Equals r(6).toString And k.ElementAt(5) Equals r(7).toString And k.ElementAt(6) Equals r(8).toString And k.ElementAt(7) Equals r(9).toString And k.ElementAt(8) Equals r(10).toString And k.ElementAt(9) Equals r(11).toString
Select r).CopyToDataTable

dtNonDupKeys
(From k In NonDupKeys
Join r In dtOrigin
On k.ElementAt(0) Equals r(2).toString And k.ElementAt(1) Equals r(3).toString And k.ElementAt(2) Equals r(4).toString And k.ElementAt(3) Equals r(5).toString And k.ElementAt(4) Equals r(6).toString And k.ElementAt(5) Equals r(7).toString And k.ElementAt(6) Equals r(8).toString And k.ElementAt(7) Equals r(9).toString And k.ElementAt(8) Equals r(10).toString And k.ElementAt(9) Equals r(11).toString
Select r).CopyToDataTable

This is the Input:

This is how the Output should be for the distinct data:

This is how it should be for Duplicate data:

This is what I get

!

And this is what I get for duplicate

@struggling_student
give a try on rewrite the LINQ

finding the non dup row:
drNonDups =

(From r In dtOrigin.AsEnumerable()
Let ra1 = r.ItemArray.Skip(1).Take(10).toArray
let check = dtOrigin.AsEnumerable.Where(Function (x) x.ItemArray.Skip(1).Take(10).toArray.SequenceEqual(ra1)).Count
Where check = 1
Select r).toList

the Dup rows we can identifiy by set operation Except

dtOrigin.AsEnumerable.Except(drNonDups, DataRowComparer.Default).toList

Sorry I don’t understand. Should I assign
NoDupKeys =
dtOrigin.AsEnumerable.Except(drNonDups, DataRowComparer.Default).toList

and then

dtNonDupKeys=
(From r In dtOrigin.AsEnumerable()
Let ra1 = r.ItemArray.Skip(1).Take(10).toArray
let check = dtOrigin.AsEnumerable.Where(Function (x) x.ItemArray.Skip(1).Take(10).toArray.SequenceEqual(ra1)).Count
Where check = 1
Select r).toList

Create a variable - Name: drNonDups | DataType: List(Of DataRow)
Assign Activity:
Left side: drNonDups
Right side:

(From r In dtOrigin.AsEnumerable()
Let ra1 = r.ItemArray.Skip(1).Take(10).toArray
let check = dtOrigin.AsEnumerable.Where(Function (x) x.ItemArray.Skip(1).Take(10).toArray.SequenceEqual(ra1)).Count
Where check = 1
Select r).toList

for the duplicated rows we can afterwards detect by:
Assign Activity:
left side: drDups | Datatype: List(of DataRow)
right side:
dtOrigin.AsEnumerable.Except(drNonDups, DataRowComparer.Default).toList

Thank you for the clarity! I am not sure how to change datarows to datatable however.
I just got permission to upload so here is my main file so far if that is better.
try.xaml (19.7 KB)

if the list of datarows is not empty then we can just use a copytodatatable like it is done here in the pattern:

eg. drDups.Count > 0
drDups.CopyToDataTable

I tried this but I am only getting duplicates in both sheets.

try.xaml (25.4 KB)
replic.xlsx (11.6 KB) Result.xlsx (15.4 KB)

@struggling_student - because instead of writing drnondups.copytodatatable you wrote again as drdups.copytodatatable…that’s why…Please see below.

Please fix this and retry again…

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.