Update values datatable using linq

I have two datatable, lets call it dtA and dtB. I want to know how can we do things as below:

  • to update values in column dtA from values in dtB if unique key is found in dtB
  • to update values column in dtA if found duplicate values of unique key in dtA itself (without remove duplicate rows in dtA)

Sample scenario as below:
dtA
ID | Name | Country | Duplicate |
1 | John | | |
2 | Peter | | |
3 | Mary | | |
4 | Paul | | |
5 | Logan | | |
6 | Mary | | |
7 | Stacy | | |

dtB
Number | Given Name | Country |
1 | John | USA |
2 | Peter | UK |
3 | Mary | JPN |

OutputDT
ID | Name | Country | Duplicate |
1 | John | USA | |
2 | Peter | UK | |
3 | Mary | JPN | Duplicate |
4 | Paul | | |
5 | Logan | | |
6 | Mary | JPN | Duplicate |
7 | Stacy | | |

Assuming the unique key is Name. In outputDT, column Country is updated by values in dtB, column Duplicate is by checking duplicate entries in dtA

(The number of rows is quite big which is 800k and time consuming if using for each row)

To efficiently update values in one DataTable (dtA ) from another (dtB ) based on a unique key, and to mark duplicates within dtA , you can utilize LINQ queries in UiPath. This approach is more performant, especially with large datasets like 800k rows. Here’s how to do it in a single, comprehensive solution:

Step 1: Update dtA from dtB Where Unique Key is Found

Dim updatedDtA As DataTable = (From a In dtA.AsEnumerable()
Join b In dtB.AsEnumerable()
On a.Field(Of String)(“ID”) Equals b.Field(Of String)(“ID”)
Select dtA.LoadDataRow(New Object() {
a.Field(Of String)(“ID”),
b.Field(Of String)(“Name”),
b.Field(Of String)(“Country”),
a.Field(Of String)(“Duplicate”)
}, False)).CopyToDataTable()

Step 2: Mark Duplicates Within dtA

Dim duplicates = updatedDtA.AsEnumerable().GroupBy(Function(r) r.Field(Of String)(“ID”)).Where(Function(g) g.Count() > 1).SelectMany(Function(g) g.ToList())

For Each row In duplicates
row.SetField(“Duplicate”, “Duplicate”)
Next

Implementation Steps in UiPath:

  1. Invoke Code Activity*: Use this activity to run the above VB.NET code.
  • Input Arguments**: dtA and dtB (with the direction set to In/Out for dtA and In for dtB).
  1. Code**: Insert the VB.NET code provided above into the Invoke Code activity. Ensure you adjust any DataTable column names ("ID", "Name", "Country", "Duplicate") as per your actual DataTable schema.
  2. Execution**: Run the workflow. dtA will be updated based on dtB values where the unique key is found, and duplicates within dtA will be marked.

This method combines updating dtA from dtB and marking duplicates within dtA

For step 1: It throws error as follows

image

For the error “ByRef parameter ‘dtA’ cannot be used in a query expression,” within an Invoke Code activity , you can resolve it by:

  1. Avoiding passing the DataTable as a reference within the LINQ query.
  2. Copying the DataTable to a local variable inside the Invoke Code and then performing operations on that local variable.

Here’s a short example:

vbCopy code

Dim localDtA As DataTable = dtA.Copy()
' Use localDtA for your operations

After processing, if needed, copy the changes back to dtA.

To achieve the updates you mentioned without using a loop through each row, you can leverage the Merge method in C# or a similar method in other programming languages if you are using a different one. Below is a sample C# code snippet to illustrate how you can achieve this using DataTables:

using System;
using System.Data;

class Program
{
    static void Main()
    {
        // Sample DataTables
        DataTable dtA = new DataTable();
        dtA.Columns.Add("ID", typeof(int));
        dtA.Columns.Add("Name", typeof(string));
        dtA.Columns.Add("Country", typeof(string));
        dtA.Columns.Add("Duplicate", typeof(string));

        DataTable dtB = new DataTable();
        dtB.Columns.Add("ID", typeof(int));
        dtB.Columns.Add("Name", typeof(string));
        dtB.Columns.Add("Country", typeof(string));

        // Sample data
        dtA.Rows.Add(1, "John", "USA", DBNull.Value);
        dtA.Rows.Add(2, "Peter", "UK", DBNull.Value);
        dtA.Rows.Add(3, "Mary", "JPN", "Duplicate");
        dtA.Rows.Add(4, "Paul", DBNull.Value, DBNull.Value);
        dtA.Rows.Add(5, "Logan", DBNull.Value, DBNull.Value);
        dtA.Rows.Add(6, "Mary", "JPN", "Duplicate");
        dtA.Rows.Add(7, "Stacy", DBNull.Value, DBNull.Value);

        dtB.Rows.Add(1, "John", "USA");
        dtB.Rows.Add(2, "Peter", "UK");
        dtB.Rows.Add(3, "Mary", "JPN");

        // Update values in column dtA from values in dtB if a unique key is found in dtB
        // This is equivalent to an "update" operation in SQL
        dtA.Merge(dtB, true, MissingSchemaAction.Add);

        // Update values in the "Duplicate" column in dtA if duplicate values of the unique key are found in dtA itself
        // Here, we use LINQ to find duplicate rows based on the "ID" column
        var duplicates = dtA.AsEnumerable()
                            .GroupBy(row => row.Field<int>("ID"))
                            .Where(grp => grp.Count() > 1)
                            .SelectMany(grp => grp.Skip(1));

        foreach (var duplicateRow in duplicates)
        {
            duplicateRow["Duplicate"] = "Duplicate";
        }

        // Display the updated DataTable dtA
        DisplayDataTable(dtA);
    }

    static void DisplayDataTable(DataTable dt)
    {
        Console.WriteLine("ID\tName\tCountry\tDuplicate");
        foreach (DataRow row in dt.Rows)
        {
            Console.WriteLine($"{row["ID"]}\t{row["Name"]}\t{row["Country"]}\t{row["Duplicate"]}");
        }
    }
}

This code uses the Merge method to update values from dtB to dtA based on the “ID” column and then uses LINQ to find and mark duplicate rows in dtA based on the “ID” column. Finally, it displays the updated DataTable dtA.

1 Like

Hi @aqiffm

Check the below zip file. Hope this meets your requirement

BlankProcess15.zip (46.8 KB)

Regards

can you provide version of this code? I received this error when run it

Invoke Code: Exception has been thrown by the target of an invocation for step 1 (I have ensure that the argument passed is correct)

Hi @aqiffm

Try this:

Hope this will work

Regards

Can we try to change name of columns for dtB from “ID” to “Number” and “Name” to “Given Name”.

Because I receive this error:
Exception has been thrown by the target of an invocation

Hi @aqiffm

Have you tried executing the attached xaml ?

Regards

tested it but its not what I wanted. Duplicate column in DT A should not been initialized at first. It should be gotten after comparing in DT A itself

@aqiffm

If possible could you share the excel file.

Regards

Input.xlsx (10.9 KB)

This is the sample input

Hi @aqiffm

Please check the below thread. I think your question is similar to this.

Regards