Comparing data table header columns

Hi Experts,
Can someone help me with this?
I have loaded an Excel file into a data table and want to identify the duplicate and non-duplicate header columns.
Below is my sample.
I want to compare the highlighted values in the data table and split the table into two different tables, each keeping all the other data: one with the duplicate headings and the other with the non-duplicate headings.

image

Thank you

@Rounak_Kumar1

@Shanika_Perera
Variables:

We assume that Read Range with AddHeaders unticked (and with the merged cells unmerged) creates a datatable with auto-generated header names, similar to this:

image

Helpers:
We define an array, arrBlockIDX, with the indexes of the columns that are excluded from the checks,

and we create some helpers that are used later:
image

We create the info on the columns to check, leaving out the blocked columns:
image

We calculate the duplicates and non-duplicates:
image
image
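
The screenshots are not reproduced here, so below is a rough sketch of the idea in Python rather than the LINQ used in the workflow. The sample data, the "Column0"-style names and the row index of the highlighted line are assumptions for illustration; only arrBlockIDX comes from the description above.

```python
from collections import defaultdict

# Hypothetical sample: a table read without headers, so the columns carry
# auto-generated names. Row index 1 holds the highlighted header values.
col_names = ["Column0", "Column1", "Column2", "Column3", "Column4", "Column5"]
rows = [
    ["", "", "", "", "", ""],
    ["ID", "Name", "AA", "AB", "AA", "AC"],
    [1, "x", 10, 20, 30, 40],
]
HEADER_ROW = 1            # assumption: the highlighted line is the second row
arr_block_idx = [0, 1]    # column indexes excluded from the check

# Column info for the check: (column name, header value), blocked columns skipped
col_info = [
    (col_names[i], rows[HEADER_ROW][i])
    for i in range(len(col_names))
    if i not in arr_block_idx
]

# Group the column names by their header value
groups = defaultdict(list)
for name, value in col_info:
    groups[str(value).strip()].append(name)

# A value seen in more than one column marks all of those columns as duplicates
dup_cols = [n for names in groups.values() if len(names) > 1 for n in names]
non_dup_cols = [n for names in groups.values() if len(names) == 1 for n in names]

print(dup_cols)      # ['Column2', 'Column4']
print(non_dup_cols)  # ['Column3', 'Column5']
```

Nothing here depends on the actual header values or on how many columns the file has; only the blocked indexes and the header row position are fixed.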

Then we prepare string arrays for both cases, for later use in the datatable column filters / removal:
image

And we apply the column removal/filter to the datatables:
image
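
As a hedged illustration of this last step, the following Python sketch keeps the blocked columns in both result tables and adds either the duplicate or the non-duplicate columns. The helper select_columns and the sample data are hypothetical; the workflow itself does this with datatable column filters.

```python
def select_columns(col_names, rows, keep):
    """Return (names, rows) limited to the columns listed in `keep`, in table order."""
    idx = [i for i, name in enumerate(col_names) if name in keep]
    return [col_names[i] for i in idx], [[row[i] for i in idx] for row in rows]

# Hypothetical sample data; row index 1 carries the highlighted values
col_names = ["Column0", "Column1", "Column2", "Column3", "Column4"]
rows = [
    ["", "", "", "", ""],
    ["ID", "Name", "AA", "AB", "AA"],
    [1, "x", 10, 20, 30],
]
blocked_cols = ["Column0", "Column1"]   # kept in both result tables
dup_cols = ["Column2", "Column4"]       # assumed result of the duplicate check
non_dup_cols = ["Column3"]

dup_names, dup_rows = select_columns(col_names, rows, blocked_cols + dup_cols)
uniq_names, uniq_rows = select_columns(col_names, rows, blocked_cols + non_dup_cols)

print(dup_names)     # ['Column0', 'Column1', 'Column2', 'Column4']
print(uniq_rows[1])  # ['ID', 'Name', 'AB']
```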

Find starter help here:
DuplicatedColInfo_SplitUnique-Dup.V2.xaml (13.6 KB)

And some introduction to LINQ here:
[HowTo] LINQ (VB.Net) Learning Catalogue - Help / Something Else - UiPath Community Forum


Is there a simpler way? The names (AA, AB, aaaa ababab) are not fixed and the highlighted line is dynamic: one file can have 5 of them and another file can have 2, etc.

We will think about it.

It is dynamic with regard to the number of columns etc.

But maybe you can elaborate on it, especially on how a “duplicate” is defined.

Duplicates need to be checked on the highlighted line: not on the number of columns, but on the values in the highlighted line, as in the sample I have given. What I meant by dynamic is that if we get two Excel files, one file can have AA, AB, AC in the highlighted place and another file can have AA, AB, AC, AD, AC, etc.

So our solution should work for any file that we receive.
image

It was implemented that way: it is dynamic and not hardcoded.

The implementation handles this as well and can process file1 and/or file2.

Have a look at this alternative approach, based on a strategy using dictionaries.
Part 1:

  • loop over the values of the second datarow
  • omit the checks for the columns defined in the BlockList
  • if the value is already known, mark it in dict2; otherwise add it to dict1 along with the column name
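
The three steps above could look roughly like this in Python. The exact shape of dict1 and dict2 in the attached workflow is not shown in the thread, so the layout used here (dict1 keeps the first column per value, dict2 collects all columns of a repeated value) is an assumption, as are the sample values and column names.

```python
# Hypothetical sample: auto-generated column names and the second datarow
col_names = ["Column0", "Column1", "Column2", "Column3",
             "Column4", "Column5", "Column6"]
header_row = ["ID", "Name", "AA", "AB", "AC", "AD", "AC"]
block_list = [0, 1]   # column indexes that are not checked

dict1 = {}   # header value -> name of the first column carrying it
dict2 = {}   # repeated header value -> names of all columns carrying it

for i, value in enumerate(header_row):
    if i in block_list:
        continue                      # omit the blocked columns
    key = str(value).strip()
    if key in dict1:
        # value already known: mark it as duplicated in dict2
        dict2.setdefault(key, [dict1[key]]).append(col_names[i])
    else:
        dict1[key] = col_names[i]

print(dict1)  # {'AA': 'Column2', 'AB': 'Column3', 'AC': 'Column4', 'AD': 'Column5'}
print(dict2)  # {'AC': ['Column4', 'Column6']}
```

This is a single pass over the row, so it behaves the same whether the file has two highlighted values or five.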

Part 2:
image

  • prepare an array with the column names of the blocked columns
  • fetch the column names of the duplicated columns from dict2
  • calculate the non-duplicated column names from dict1

image
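
A minimal Python sketch of Part 2, assuming dict1 maps each header value to the first column carrying it and dict2 maps each repeated value to all of its columns. That layout, the sample values and the variable names are illustrative assumptions, not taken from the attached workflow.

```python
# Hypothetical inputs, as they might look after Part 1
col_names = ["Column0", "Column1", "Column2", "Column3",
             "Column4", "Column5", "Column6"]
block_list = [0, 1]
dict1 = {"AA": "Column2", "AB": "Column3", "AC": "Column4", "AD": "Column5"}
dict2 = {"AC": ["Column4", "Column6"]}

# Column names of the blocked columns (they go into both result tables)
blocked_cols = [col_names[i] for i in block_list]

# Column names of the duplicated columns, taken from dict2
dup_cols = [name for names in dict2.values() for name in names]

# Non-duplicated column names: dict1 entries whose value never repeated
non_dup_cols = [name for value, name in dict1.items() if value not in dict2]

# Column sets for the two output tables
dup_table_cols = blocked_cols + dup_cols
non_dup_table_cols = blocked_cols + non_dup_cols

print(dup_table_cols)      # ['Column0', 'Column1', 'Column4', 'Column6']
print(non_dup_table_cols)  # ['Column0', 'Column1', 'Column2', 'Column3', 'Column5']
```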

Extract the column sets to datatables

Input/output:
image

Find starter help here:
DuplicatedColInfo_SplitUnique-Dup-DictApproach.xaml (15.9 KB)
