I’m trying to join two datatables which have the same exact columns, have matching rows, but have unique cell values.
Table 1:
ID | Name | Ticket
1 | John |
2 | Peter | 205
3 | Mary |
Table 2:
ID | Name | Ticket
1 | John | 301
2 | Peter |
3 | Mary | 564
When I attempted an inner join with matching ID, I got:
ID | Name | Ticket | ID_1 | Name_1 | Ticket_1
1 | John | 301 | 1 | John |
2 | Peter | | 2 | Peter | 205
3 | Mary | 564 | 3 | Mary |
When I attempted to merge the datatables, I just got additional rows.
Is there a way to actually join two data tables without duplicating columns, or duplicating rows?
The end result I’m looking for is:
ID | Name | Ticket
1 | John | 301
2 | Peter | 205
3 | Mary | 564
When the above Task using Join Datatables Activity satisfies the requirement, we use Filtering options/Filter Datatable Activity for removing the Columns that would generate as duplicate keys.
Another approach would be to use Linq Join, but in here we would be preparing the datatable by selecting which columns to be preserved for the Output. A Video Explanation is shown below :
If you do want to raise this as a Feature Request, you could also do this as a Feedback.
How do you use the Filter activity without removing data? It’s okay if I remove Column ID_1 but I can’t remove Ticket_1 without losing data. Plus, I want the ‘Ticket’ data to be on the Ticket column, not on ‘Ticket_1’
@sidb
With one of my work fellows @kumar.varun2 we developed an approach
we merged all dt into one
grouped on the ID
created from each column from the groupmembers a helper tuple
filtered out the empty value ones
driven by the data column schema we build the itemarray where missing ones were creared with empty vlaues
added the merged row to an result datatable
Varun will continue on this approach with details and samples
we were also able to extend this approach to merge multiple datatable and implemented some rules where to pick when more datatables has the value for a particular column.
The motivation about this was:
the case is not a merge only usage - as the rows are not merged within one row
it is not a side-by-side merge
the data joins were not appropriate enough and satisfying
we looked for an approach handling multi datatable merges
(
From row In dt.AsEnumerable()
Group row By k = row("ID").ToString.Trim
Into grp=Group
Let lt = grp.Select(Function(g) arrColSet.Select(Function(cn) Tuple.Create(cn, g(cn)))).SelectMany(Function(t) t)
Let ltf = lt.Where(Function(t) Not (IsNothing(t.Item2) OrElse String.IsNullOrEmpty(t.Item2.ToString.Trim)))
Let ra= arrColSet.Select(Function(c) ltf.Where(Function(t) c.Equals(t.Item1)).DefaultIfEmpty(New Tuple(Of String, Object)(c, Nothing)).First().Item2).ToArray
Select r = dtResult.Rows.Add(ra)
).CopyToDataTable