Comparing two Datatable with Several Condition using LINQ

I have a question, I have more than 5000 data that I took from the website and I have to filter, I am trying to use Linq, but I have a problem, here is an example of the table I have:

image

so I have to look at each column in datatable 1 and datatable 2, if there is a difference in the value of one of the columns, then that column will be retrieved.

I hope anyone can help me. thank you very much.

1 Like

Sure, I can help you with that. There are two main approaches to achieve this filtering in UiPath: using the Filter DataTable activity or leveraging LINQ with the DataTable.Select method.

1. Filter DataTable Activity:

This is a visual approach that allows you to configure the filtering conditions without writing code.

Here’s how you can use it:

  1. Add Filter DataTable Activity: Drag the activity into your workflow.
  2. Define Input:
    • DataTable: Select the DataTable containing the data from the website (Table 1 in your image).
  3. Define Conditions:
    • Click on the "" button next to the Filter Condition property.
    • In the Filter Builder window, define the condition. You can use logical operators (AND, OR) to combine multiple conditions across columns. In your case, the condition would be something like [Column 1] <> DataTable2.[Column 1] or [Column 2] <> DataTable2.[Column 2], etc., depending on which columns you want to compare. You can select the corresponding columns from the dropdown menus.
  4. Define Output:
    • FilteredDataTable: Create a new DataTable variable to store the filtered results.

2. LINQ with DataTable.Select Method:

This approach uses code for more flexibility in defining complex filtering logic.

Here’s how you can achieve it:

  1. Import DataTables: Assuming you have already read the data from the website into a DataTable named dt1 and the other DataTable is named dt2.

  2. Write LINQ Expression:
    Use the Select method with a Lambda expression to filter the data. Here’s an example:

    var filteredData = dt1.AsEnumerable()
        .Where(row1 => dt2.AsEnumerable().Any(row2 => 
            row1["Column1"].ToString() != row2["Column1"].ToString() ||
            row1["Column2"].ToString() != row2["Column2"].ToString() ||
            // Add more column comparisons as needed
        ))
        .CopyToDataTable();
    

    This code snippet iterates through each row in dt1 and checks if there’s any corresponding row in dt2 where any of the specified columns (Column1 and Column2 in this example) differ in value. If there’s a difference, the row is included in the filtered results stored in the filteredData DataTable variable.

Choosing the Right Approach:

  • If you’re comfortable with writing code and need more control over the filtering logic, LINQ with DataTable.Select might be preferable.
  • If you prefer a visual approach for configuring filters, the Filter DataTable activity is a good choice.

Both methods achieve the same result of filtering the data from Table 1 (website data) based on whether there are any differences in column values compared to the corresponding columns in Table 2. Choose the method that best suits your comfort level and coding expertise.

Hi @7d9d39b290c0bbfffea7c0d9e ,

Please elaborate more about the output table.

I will provide a linq .

Thanks

You need to create a custom code or you need to push this data to the database to validate the column data with multiple tables.

Below are the same sample of example you can try anything where feel more confidant

Here’s the C#, VB.NET, and SQL code for comparing two DataTables with several conditions using LINQ:

C#:

public static DataTable CompareDataTables(DataTable table1, DataTable table2, IEnumerable<string> compareColumns)
{
    var results = from row1 in table1.AsEnumerable()
                  join row2 in table2.AsEnumerable() on compareColumns.Select(col => row1[col]).SequenceEqual(compareColumns.Select(col => row2[col]))
                  select row1;

    DataTable resultTable = new DataTable();
    resultTable.Columns.AddRange(table1.Columns.Cast<DataColumn>().ToArray());

    foreach (DataRow row in results)
    {
        resultTable.Rows.Add(row.ItemArray);
    }

    return resultTable;
}

Explanation:

  1. This code defines a function CompareDataTables that takes three arguments:
    • table1: The first DataTable.
    • table2: The second DataTable.
    • compareColumns: An IEnumerable of strings representing the column names used for comparison.
  2. It uses a LINQ join to compare rows from both DataTables.
    • The join condition uses SequenceEqual to check if the values in the specified columns for both rows are equal.
  3. The code then creates a new DataTable (resultTable) with the same columns as table1.
  4. It iterates through the resulting rows and adds them to the resultTable.
  5. Finally, the function returns the resultTable containing rows from table1 that have matching rows in table2 based on the specified columns.

VB.NET:

Public Function CompareDataTables(table1 As DataTable, table2 As DataTable, compareColumns As IEnumerable(Of String)) As DataTable
    Dim results = From row1 In table1.AsEnumerable()
                 Join row2 In table2.AsEnumerable() On compareColumns.Select(Function(col) row1(col)).SequenceEqual(compareColumns.Select(Function(col) row2(col)))
                 Select row1

    Dim resultTable As New DataTable
    resultTable.Columns.AddRange(table1.Columns.Cast(Of DataColumn)().ToArray())

    For Each row In results
        resultTable.Rows.Add(row.ItemArray)
    Next

    Return resultTable
End Function

Explanation (VB.NET):

  • The code structure and logic are similar to C#.
  • The syntax for LINQ expressions and functions might differ slightly between languages.

SQL (Assuming tables named table1 and table2):

SELECT t1.*
FROM table1 AS t1
INNER JOIN table2 AS t2 ON 
  (t1.Column1 = t2.Column1 AND 
   t1.Column2 = t2.Column2 AND 
   ... /* Add additional comparison conditions for other columns */)
WHERE 
  (t1.Column1, t1.Column2, ...) IN (
    SELECT Column1, Column2, ... 
    FROM table2
    GROUP BY Column1, Column2, ...
    HAVING COUNT(*) > 1
  );

Explanation (SQL):

  1. This code uses an INNER JOIN to compare rows from both tables based on specified columns.
  2. You need to replace Column1, Column2, etc. with the actual column names used for comparison.
  3. The subquery within the WHERE clause identifies rows in table2 that appear more than once (duplicates).
  4. The main query then selects rows from table1 that have matching rows in table2 based on the comparison columns.

Note:

  • The SQL code assumes you want to identify rows in table1 that have corresponding duplicates in table2.
  • You can modify the conditions in the JOIN and WHERE clauses to achieve different comparison scenarios.