Match rows to column name & write to cell

Hi folks! I have a unique question when processing data - see attached workbook and image below: (NOTE: I added the values in dt2 other than column headings).
This solution is reading scraped data that an end-user has labeled and fed into a document intelligence tool.

The goal here is to match the key values in dt1 to the column headings in dt2 and write the corresponding value tied to the key. My current solution it to open a nested for each row loop, and if the key value in dt1 matches the dt2.columns.columnname in the current iteration - write to that cell.

The problem here is that I need to write to dt2 which is a blank datatable with only column headings. When I try to iterate down to the next row to print my matches, it throws an error.

How can I dynamically add data rows to this table? Please note that I already tried to ArrayRow property in the Add Data Row activity but I do NOT want to hard code the fields because they will vary based on the document being processed.

@joshgregory10 Will all the Columns in DT1 be present in DT2 ? If So, you can use Transpose Datatable Activity which might be a Lot easier to Shift the Structure and get the Output you need. Try it and let me know if you face any Difficulties.

Hi there! No, the columns in dt1 will not exist in dt2.

In dt1, the values were extrapolated to dt2 and formed the column names. I call this an ‘unpivot’. Once I did this, I need to loop through dt1 and any time the Key in dt1 matches the column name in dt2, write that value.

Each block of kvps in dt1 are extracted from a document intelligence tool so they need to be contained in one row within dt2 - they cannot be separated.

1 Like

Is there a way to bunch together the identifiers in dt2 into a DataRow and print that DataRow in dt2? Note that I need to be able to have a dynamic number of columns, I cannot hard code the fields.

@joshgregory10 Is it possible for you to send the input Excel Files and the Expected Output File. Will try to implement it if possible.

@joshgregory10
have a look on the yellow Block. currently your structure is varying. Can you just check at your end if this was a copy paste issue or if our data really comes in an unorderd structure.

In general you do sometig like a block parsing and later transpose it to this other form. Blockparsing can be done with markers. For an initial idea hava look to a not so far away case, as block parsing was used

Find below the updated table - I anticipated this issue would occur so I created a field called ‘Identifier’ in dt1 where it would link every kvp to a unique value.

Refer to dt2 - I completed two rows manually. Identifier field from dt1 will form ‘primary key’ of the row, and the remaining kvps will print if the column name matches. This is why I put emphasis on match the column name. The column name will always match the ‘Key’ field in dt1 since the values originated from there.

I was thinking tuples (identifier, key, value) but not sure how to implement this in dt2 to get the output I need without hard coding the name of the columns and predefining the fields in the dT. As mentioned, I need this to be dynamic since users will change the number of keys they extract from the document.

Yes - key/value pairs can be unordered (I am retrieving from Azure Cognitive Services via REST API so the response varies). However, if you see below, I append a document identifier to keep the kvps bunched together to form the row. Just need to make sure it is writing the key to the correct column in dt2.

I uploaded another screenshot…it will not allow me to upload CSV.

@joshgregory10
Let me summarize your requirements:

  • you will get a DataTable with Groups in it
  • each Group has values from a set in form of Key Value Pairs
  • per Group the KVPs can be less as set and in different ordered
  • in a second DataTable the Grouped KVPs should be transposed an to be in the right order

I took this as challenge, sure it is solvable with available activities from UiPath but I never approached this in a LINQ

Demo Table:
grafik
red, blue the colors, yellow marked the differences per group

Result:
grafik

Demo Flow

What it is doing:

  • create a distinct List of all Keys
  • used the distinctKeys List and create datacolumns in a second empty Datatable
  • order as retrieved

Logic:

  • Group the rows by ID (GroupBy)
  • Create from the Grouped Rows a Dictionary of Keys / Values
  • Create From the Key List a dictionary (Value is nothing)
  • Find out with An Except which Keys are Missing for the full DataRow Structure and create a Dictionary with Keys and Empty Values for missing items
  • Merge It with dict A (Now we have a dictionary with KVPs from Group with the values and Filled Up with KVP (empty Values) for the missing ones, but not in the right column order structure
  • Iterate over the Key List, Fetch the Values from unordered Dictionary and create an array from it
  • Use this array for adding a new datarow for this Group
  • copy it to datatable

Linq:
(From d In dtData1.AsEnumerable
Group d By k=d(0).toString Into grp=Group
Let dictA = grp.ToDictionary(Of String, Object)(Function (x) x(1).toString, Function (x) x(2))
Let dictK = Keys.ToDictionary(Of String, Object)(Function (x) x, Function (x) Nothing)
Let dictD = dictK.Keys.Except(dictA.Keys).ToDictionary(Of String, Object)(Function (x) x, Function (x) Nothing)
Let dictM = dictA.Union(dictD).ToDictionary(Of String, Object)(Function (x) x.Key, Function (x) x.Value)
Let arrRow = Keys.Select(Function (i) dictM(i)).toArray
Select newrow = dtData2.Rows.Add(arrRow)
Select newrow).CopyToDataTable

I would not suggest to rely on such a long statement as long you don’t feel that you can maintain it. But you can map some parts to essential UiPAth Activities and deconstruct the statement. Maybe this can help you.

Otherswise we are here for your Questions.

Demo XAML here:
joshgregory10.xaml (9.5 KB)

2 Likes

Hi there - thanks for this! Can we print the ‘ID’ field to the dt2 output as well in the first column? I need to join this to another dT. So the final output in dt2 will be the labels as column headings + the ID field.

1 Like

@joshgregory10
I will have a look on it later after my work. I do feel that we will find a way

Thanks @ppr - I tried to follow the LINQ query to pull it out myself but was unable. If you are using the identifier group the data, wouldn’t it just be a matter of adding that field to the final output in the newRow?

@joshgregory10
Due A Bug in UiPath Studio I lost much time and was blocked to rework on it. However the Statement requires a map to more essential Activities. Otherwise the ability of maintenance will be lost. The LINQ was good for Quick prototyping but doesnt have longer any reason to stay longer.

I will check on another machine about the studio Bug and will revert back to you

Great! I will look for your note about adding the ID to final output in dt2.

@joshgregory10
Find this updated version no handling the ID. The LINQ was except the GroupBy decomposed into essential UiPath Activities. With DataTable.NewRow we got an empty row, that was fillable by accessing with the Key Info from dt1.

Find Demo here
joshgregory10_G2.xaml (13.6 KB)

Thanks for this! I ran this but throwing an error - ‘Add Data Row: Cannot set column ‘ID’. The value violates the MaxLength limit of this column.’ in the last activity. Did you have the same?

@joshgregory10
Go to the add data row Activity and increse the value length according the expected length of your ID Strings plus some buffer

This was great and worked perfect!

One more issue I ran into. The overall project I am working on is extracting data from a JSON file originating from Microsoft Form Recognizer. In order to get the ‘Value’ field used in dt1, I am doing the following:

This loop is iterating through the JSON result page (resultPage.AnalyzeResults) and as shown in ‘Add Data Row’ activity, writing file.ToString (‘ID’), label.Key (‘Key’), label.Value.Text (‘Value’) to dt1.

The problem now is when Form Recognizer fails to find a ‘Value’ during extraction, it writes ‘Null’ in the value property within JSON file. This throws an error (Object reference not set to instance of object) because the value is null in Add Data Row.

**Question: How can I check if an object is null and write a string (ex. “Null”) so that my ‘Add Data Row’ activity will not error? I tried to convert label.Value.ToString.IsEmptyorNull but you cannot do that with objects… **

@joshgregory10
I am Not Sure if i got you. For Testing If a string is null or empty you can use String.isnullorempty(youtstringvar) and will Return true or false.

Can you please give a Feedback If my Demo Code integrated by you IS working as expected and also handle the ID now