Generate a data table with the row index of duplicate rows and the number of duplicates

Hello,

I have two tables, table A and table B. The two tables have some duplicate rows. I would like to know two things:

  1. the row index of the duplicate in table b
  2. number of duplicates in table a

I would then like to generate a data table with the above information

Cheers

Please share some samples of the input data along with a sample of the expected output. Thanks

@E.T.S

Please provide input and output for better reference

Hello,

Thank you for your reply.

TableA:

TableB:

Number of duplicate rows in table A is 1

The row index in table B for that duplicate is 1 (row 7). It could also be row index 3 (row 9); it doesn’t matter which row the bot picks up, as long as it picks up a duplicate.

Add 1 and 7 to the data table.

Keep doing this for each row in table A.

With this information (first occurrence is fine), we could do:

Prepare Report Table - dtReport:
Assign Activity:
dtReport = dtA.Clone

Add Data Column Activity: column name: Index, type: Int32, DataTable: dtReport

Assign Activity:
dtReport =

(From d In dtA.AsEnumerable()
Let ia = d.ItemArray
Let idx = dtB.AsEnumerable().ToList().FindIndex(Function(d2) d2.ItemArray.SequenceEqual(ia))
Where idx > -1
Let ra = ia.Append(idx).ToArray()
Select r = dtReport.Rows.Add(ra)).CopyToDataTable()

We are checking for the first occurrence of each dtA row in dtB
and writing to the report: the row values plus the 0-based index of the match in dtB

SequenceEqual will work when the structure and DataTypes of dtA and dtB are the same
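
For anyone who wants to try the first-occurrence logic outside of Studio, here is a minimal console sketch. The Name/Amount columns and the sample rows are made up for illustration (they are not the tables from this thread), and FirstMatchReport and NewSampleTable are hypothetical helper names. The sketch uses a plain loop instead of a query ending in CopyToDataTable, because CopyToDataTable throws when no row matches; the matching itself is the same ItemArray.SequenceEqual check. It assumes a runtime where Enumerable.Append is available.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module FirstMatchSketch

    ' Hypothetical helper: for every dtA row that also occurs in dtB, report the
    ' row values plus the 0-based index of its first match in dtB.
    Function FirstMatchReport(dtA As DataTable, dtB As DataTable) As DataTable
        Dim report As DataTable = dtA.Clone()          ' same columns as dtA, no rows
        report.Columns.Add("Index", GetType(Integer))
        Dim rowsB As List(Of DataRow) = dtB.AsEnumerable().ToList()

        For Each rowA As DataRow In dtA.Rows
            Dim ia As Object() = rowA.ItemArray
            Dim idx As Integer = rowsB.FindIndex(Function(rb) rb.ItemArray.SequenceEqual(ia))
            If idx > -1 Then
                report.Rows.Add(ia.Append(CObj(idx)).ToArray())
            End If
        Next
        Return report
    End Function

    ' Made-up two-column schema, used only for this sketch.
    Function NewSampleTable() As DataTable
        Dim dt As New DataTable()
        dt.Columns.Add("Name", GetType(String))
        dt.Columns.Add("Amount", GetType(Integer))
        Return dt
    End Function

    Sub Main()
        Dim dtA As DataTable = NewSampleTable()
        dtA.Rows.Add("x", 1)
        dtA.Rows.Add("b", 2)

        Dim dtB As DataTable = NewSampleTable()
        dtB.Rows.Add("a", 5)
        dtB.Rows.Add("b", 2)
        dtB.Rows.Add("c", 3)
        dtB.Rows.Add("b", 2)

        For Each r As DataRow In FirstMatchReport(dtA, dtB).Rows
            Console.WriteLine(String.Join(", ", r.ItemArray))   ' prints: b, 2, 1
        Next
    End Sub

End Module
```

Only the second dtA row exists in dtB, and its first match sits at index 1, so the report holds a single row.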

When the indexes of all matches are needed, we would modify the above LINQ as follows:

Assign Activity:
MarkedRows | DataType: List(Of Tuple(Of Int32, DataRow)) =
dtB.AsEnumerable().Select(Function(x, i) Tuple.Create(i, x)).ToList()

Assign Activity:
dtReport =

(From d In dtA.AsEnumerable()
Let ia = d.ItemArray
Let tl = MarkedRows.Where(Function(t) t.Item2.ItemArray.SequenceEqual(ia))
From tf In tl
Let ra = ia.Append(tf.Item1).ToArray()
Select r = dtReport.Rows.Add(ra)).CopyToDataTable()
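
The all-occurrences variant can be sketched as a small console program in the same way. The schema and rows are again invented for illustration, AllMatchReport and BuildSampleTable are hypothetical names, and a loop stands in for CopyToDataTable so that an empty result comes back as an empty table. Each dtB row is first paired with its 0-based position (the MarkedRows idea), then every match is written to the report.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module AllMatchesSketch

    ' Hypothetical helper: one report row per (dtA row, matching dtB row) pair,
    ' carrying the dtA values plus the 0-based index of the match in dtB.
    Function AllMatchReport(dtA As DataTable, dtB As DataTable) As DataTable
        Dim report As DataTable = dtA.Clone()
        report.Columns.Add("Index", GetType(Integer))

        ' Pair every dtB row with its position, as in the MarkedRows assignment.
        Dim markedRows As List(Of Tuple(Of Integer, DataRow)) =
            dtB.AsEnumerable().Select(Function(x, i) Tuple.Create(i, x)).ToList()

        For Each rowA As DataRow In dtA.Rows
            Dim ia As Object() = rowA.ItemArray
            For Each t In markedRows.Where(Function(m) m.Item2.ItemArray.SequenceEqual(ia))
                report.Rows.Add(ia.Append(CObj(t.Item1)).ToArray())
            Next
        Next
        Return report
    End Function

    ' Made-up two-column schema, used only for this sketch.
    Function BuildSampleTable() As DataTable
        Dim dt As New DataTable()
        dt.Columns.Add("Name", GetType(String))
        dt.Columns.Add("Amount", GetType(Integer))
        Return dt
    End Function

    Sub Main()
        Dim dtA As DataTable = BuildSampleTable()
        dtA.Rows.Add("x", 1)
        dtA.Rows.Add("b", 2)

        Dim dtB As DataTable = BuildSampleTable()
        dtB.Rows.Add("a", 5)
        dtB.Rows.Add("b", 2)
        dtB.Rows.Add("c", 3)
        dtB.Rows.Add("b", 2)

        ' Prints "b, 2, 1" and then "b, 2, 3"
        For Each r As DataRow In AllMatchReport(dtA, dtB).Rows
            Console.WriteLine(String.Join(", ", r.ItemArray))
        Next
    End Sub

End Module
```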

Note that this use case can also be handled in a few other ways.

Apologies, I am unable to find the data type List(Of Tuple(Of Int32, DataRow))

[six screenshots showing how to select the data type]

Then confirm each dialog with OK.

I don’t seem to find that option:

[screenshot]

If you need the basics on this topic, also check out the UiPath Academy.

When I click List, I only have the option of setting one variable type, not two:

You do have it:
[screenshot]

I have set the variable type; however, I am getting this error message:

Just take a short break and then try it again.

The resulting DataType will be:

In short: List(Of Tuple(Of Int32, DataRow))

I’ve managed to run that but don’t quite understand the output

Just answer our request and share with us a clearly defined expected output sample for the input samples you gave above.

Apologies - please see below:

TableA:

TableB:

There is one duplicate (row 7) in tableA that is also in tableB - in tableB this duplicate is located at row index 1 (row 7)

So the data table/list will be:

image

Note the row index is the position of the duplicate row in tableB and the table starts at row 7

This number is then fed into the Remove Data Row activity. The reason I am using that activity, rather than just removing the duplicates, is that the number of duplicate rows in tableA is proportional to the number of duplicate rows that need to be removed in tableB.

Another example with more detail of the desired output:
TableA:

TableB:

Here there are two duplicate rows present in tableA and tableB

So the output would be:

image

^ These are the row indexes of the duplicate rows in tableB

This will then be fed into the delete row activity:

image
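
One thing to watch when several indexes are fed into a row removal: each removal shifts the rows after it, so indexes collected beforehand stop matching unless the removal runs from the highest index down. Below is a minimal sketch of that idea. It calls DataRowCollection.RemoveAt directly instead of the Studio activity, and RemoveRowsAt, the column names, and the sample rows are all hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module RemoveByIndexSketch

    ' Hypothetical helper: removes the rows at the given 0-based indexes from dt.
    ' Indexes are de-duplicated and processed from highest to lowest, so removing
    ' one row never shifts a row that still has to be removed.
    Sub RemoveRowsAt(dt As DataTable, indexes As IEnumerable(Of Integer))
        For Each i As Integer In indexes.Distinct().OrderByDescending(Function(n) n)
            dt.Rows.RemoveAt(i)
        Next
    End Sub

    Sub Main()
        ' Made-up sample table, used only for this sketch.
        Dim dtB As New DataTable()
        dtB.Columns.Add("Name", GetType(String))
        dtB.Columns.Add("Amount", GetType(Integer))
        dtB.Rows.Add("a", 5)
        dtB.Rows.Add("b", 2)
        dtB.Rows.Add("c", 3)
        dtB.Rows.Add("b", 2)

        ' Indexes 1 and 3 are the two "b, 2" rows.
        RemoveRowsAt(dtB, New Integer() {1, 3})

        ' Prints "a, 5" and then "c, 3"
        For Each r As DataRow In dtB.Rows
            Console.WriteLine(String.Join(", ", r.ItemArray))
        Next
    End Sub

End Module
```

Removing in ascending order would take out index 1 first, after which the second duplicate would sit at index 2 rather than 3.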

Cheers

It looks a little like an XY problem: what is needed is not the index, but dedicated handling of

  • detecting duplicated rows
  • deleting rows

But since Table A can also contain duplicates, we have to specify the rules more clearly.

Thank you for your reply,

Please could you clarify?