DataRow type into Queues... Let's discuss!

Topic: Processing Transaction Items as DataRow types using Queues

(I don’t believe this has been talked about enough.)
Intro:
What is the most effective way to dispatch, retrieve, execute, and output results for process items when each item belongs to a source of DataTable type with various columns?

Prepare yourself for a longer-than-ideal read… :thinking::shushing_face:

Reasons:

  1. Effective processing of DataRows, kept as DataRows throughout
  2. DataRows can have many columns
  3. Reuse of the same code across multiple projects without modification

Initial Assumption:
The “easy-mode” method presumably would have been to pass the DataRow type as ItemInformation when using Add Queue Item. This would have kept the item as an object associated with its source DataTable or Array of DataRows, which could also have been included in the ItemInformation so the DataRow could be processed alongside its source (because if a process gets a queue item from a different source, the DataRow would not know which data it originally belonged to).

I use the words “would have been” and “could have been” because this “easy-mode” method is currently not possible: ItemInformation only works with JSON value types, such as Strings, Numbers, Dates, and Arrays.

Basically, it will return an error saying that the JSON object type could not be determined for a DataRow. See below for an example of the error message:

Add Queue Item - New - DataRow: Could not determine JSON object type for type System.Data.DataRow.

Working toward a solution (or is it a workaround?):
Therefore, we need to convert the value to a JSON object, which can be done with the Newtonsoft.Json namespace by using JsonConvert.SerializeObject() - the fully qualified call is Newtonsoft.Json.JsonConvert.SerializeObject(). Then, to convert it back, you can use JsonConvert.DeserializeObject(). See the below post by @megharajky for an example .xaml and further discussion:
How to add a System.Data.DataRow variable into a work queue? - #11 by megharajky

My own method was to take the transaction item within a For Each loop (as part of the dispatcher) and use .net to reshape it into a new DataTable. Here is my sample code:

Newtonsoft.Json.JsonConvert.SerializeObject(in_TransactionData.Where(Function(r) r Is item).Select(Function(r) CType(r, System.Data.DataRow)).CopyToDataTable)

This will essentially place the data in JSON format into the ItemInformation, so it keeps the DataTable structure as an array and looks like this:

[{"Column1":"Value1","Column2":"Value2","Column3":"Value3"}]

However, there are some problems to consider…

The Problems:

  1. How do we associate this data to its original source, if the source needs to be updated?
  2. Maybe we can add the filepath to the source and read it again for each item executed?
  3. Can we append the results of the execution to a new file and forget the original source?
  4. When and how do we deserialize the object only when the item actually contains a serialized DataTable?

These are some things that make this more challenging, because there are times when an associate will require that the source data itself be updated rather than written to a new file. Doing so means you would need to match the row item up with the corresponding row in the source data, and if the row can’t be found, append it to the data.

Here is an idea for some .net to match up and find the row item in the source data (joining with a delimiter, so adjacent column values can’t run together and cause false matches):

srcMatched = srcData.AsEnumerable.Where(Function(srcR) String.Join("|", srcR.ItemArray) = String.Join("|", in_TransactionItem.ItemArray)).ToArray

And, to update the status, you can do something like this:

If srcMatched.Count > 0 Then
    Assign: srcMatched(0)("Status") = "[" + Now.ToString + "] Complete"
Else
    Invoke Method: ImportRow on srcData, with in_TransactionItem as the parameter

So, if this is the solution, it can be done with some additional coding. I feel this is important in many cases, because it helps the user determine which items were executed and which were not.

If you choose to create a new results file, you won’t necessarily have all the source data, so the user won’t be able to determine if everything was executed without looking at the source data too.

I suppose it depends on the requirements of the process.

The Framework:
Let’s get back to how we should effectively manage queue items, specifically for DataRows…

You will be golden if you can dispatch the transaction items as part of the same Framework and job run. So, let’s assume you have this part working.

—The steps required for this (to the best of my knowledge):

  1. Initialize complete data to be added to the queue
  2. For DataRow items, serialize the DataRow/DataTable object as you add the queue item
  3. Get Transaction Item to execute
  4. Afterward, if needed, deserialize the object back to a DataRow (a sketch of steps 2 and 4 follows this list)
  5. Execute the Process for the item as normal
  6. Output updated results to existing source data or a separate results file
  7. (optional) Update Output for the item in the queue
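
Here is a rough sketch of steps 2 and 4 together. The variable names, the "Data" key, and the leading-"[" check are all my own assumptions - the check is just a cheap way to detect the serialized-array case, so plain Strings and Dates pass through untouched:

' Step 2 (dispatcher): wrap the DataRow in a one-row DataTable and serialize it
jsonValue = Newtonsoft.Json.JsonConvert.SerializeObject({row}.CopyToDataTable)

' Step 4 (performer): deserialize only when the value looks like a JSON array
itemValue = transactionItem.SpecificContent("Data").ToString
If itemValue.TrimStart.StartsWith("[") Then
    rowDt = Newtonsoft.Json.JsonConvert.DeserializeObject(Of System.Data.DataTable)(itemValue)
    row = rowDt.Rows(0)
End If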

End Goal:
Remember that you should be able to use various dataset types with your projects and not need to manipulate your code each time; this means less training, less documentation, and easier project deployment. If your project is processing an array of DateTimes or Strings, then it should be able to be executed in the same way as a project processing an array of DataRows, without changing the code. - I find using an Array of Object type useful for this; then, check the type of each item before adding it to the queue so the correct ItemInformation can be used (see the sketch below).
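
As a sketch of that type check (again, the variable names and the "Data" key are assumptions on my part):

For Each item As Object In in_TransactionData
    If TypeOf item Is System.Data.DataRow Then
        ' DataRow: wrap in a one-row DataTable and serialize to JSON
        infoValue = Newtonsoft.Json.JsonConvert.SerializeObject({CType(item, System.Data.DataRow)}.CopyToDataTable)
    Else
        ' Strings, Dates, and Numbers are JSON value types and can be stored directly
        infoValue = item.ToString
    End If
    ' Add Queue Item with ItemInformation("Data") = infoValue
Next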

Final Statements:
Posting this has helped with brainstorming some ideas, and hopefully, it will help others expand on what they are trying to do with Queues as well.

In the future, it would benefit RPA implementation if DataTable processing was more effectively integrated with Queues.

Discuss:
Please feel free to share other ideas that I or others have not considered. I originally posted this topic to find out if anyone has comments or reactions about how DataRows are being executed through Queues.

Thank you!

@ClaytonM

7 Likes

May I briefly summarize this idea:

  1. “Add Queue Item” should accept a DataRow as an input parameter… (not yet possible)
  2. However, you now have the Bulk Add Queue Items activity (its input is a DataTable).

As for retrieving and manipulating the DataRow, isn’t working with SpecificData in Get Transaction Item the same?

3 Likes

I wanted to add a DataRow as input for a Queue Item recently too. After seeing that it is not as simple as adding the DataRow to ItemInformation, I chose to replicate the DataRow in the performer process with ItemInformation added as regular parameters.

Thanks for the thoughts @ClaytonM! Interesting read.

I’d love to see DataRow as input parameter for “Add Queue Item” in the future :slight_smile:

Cheers, Lukas

3 Likes

That’s cool. I assume that’s in the 2019 version.

Can you rephrase this question? I didn’t understand exactly what you were asking. If that’s related to Bulk Add Queue Items, I couldn’t say, because I’m working in the 2018 Enterprise version and have not used it.

I suppose if you could get the row, you would use in_TransactionItem.SpecificContent("dataRow"), and you could use that just like a row, retrieving and manipulating each column as normal. You would also need the DataTable that the row belongs to.

But, then, how would you know that multiple Queue Items come from the same DataTable? This is important, because if you process, for example, 10 queue items, each of which is a DataRow, you would basically be updating each item individually without any communication between the items. The DataTable would only have its individual DataRow updated, rather than knowing that other DataRows have been updated throughout the process and maintaining a shared DataTable between the items. - Hopefully that makes sense.

Technically, you could maintain the DataTable if you are using one Robot, but ideally, you would use multiple Robots where the items are processed on different job runs.

I believe, currently, that I will need to read the existing output file to get the already updated items, then write the currently processed item to the file. That way, each Robot gets the up-to-date DataTable each time it is ready to update the file. And this assumes the Robots will not fight over updating the file at the same time.

Alternatively, I believe I could update the file without reading the updated DataTable first, by appending the row and then removing the duplicate row (rough sketch below).
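
A rough sketch of that append-then-dedupe idea, assuming the first column uniquely identifies a row (an assumption on my part), so keeping the last occurrence per key keeps the newly appended version:

' Group on the key column and keep the most recently appended row in each group
updatedDt = outputDt.AsEnumerable.GroupBy(Function(r) r(0).ToString).Select(Function(g) g.Last).CopyToDataTable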

Thanks.

1 Like

Hi there, below-average Joe when it comes to coding here.
I was looking for a solution for this when I came across this post.
But why not just output the DataTable to a string and use string manipulation (regex, for example) to determine the data row sets?
Not sure, though, how it will handle empty cells or how to keep the column names :slight_smile: .

Hi @MAmer35
Yeah, I suppose you can take various approaches.
A transaction will most likely be a single DataRow, so I’m assuming you would output the row to a string to be used as the Item Value of the queue item. You can do this with String.Join(",", row.ItemArray)

However, connecting this data set back to the original DataTable becomes a slight challenge. But if you are outputting each data set to a new spreadsheet rather than the original, then you won’t necessarily need to connect to the DataTable (since you can use Append Range or append text, as an example).

Then, additionally, since the data set is a string, you would need to rely on a delimited string to output back to a spreadsheet. And you would have to read it back if you need to update columns within that data set (because you would want a DataRow). Note: you can output a comma-delimited string to a CSV file, then read it back to get a DataTable.
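
For example, a minimal sketch of that CSV round trip (the file path is hypothetical, and this assumes no values contain commas):

' Append the delimited row to the CSV (column headers would need to be written once, up front)
System.IO.File.AppendAllText("C:\Temp\results.csv", String.Join(",", row.ItemArray) & Environment.NewLine)
' Later, Read CSV (or a .NET CSV parser) turns the file back into a DataTable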

I normally would not recommend string manipulation in this case, so that you don’t have to do many steps to get a data set you can easily process and update. Also, if one of the values has a comma in it, you would need quotes around the value so the CSV can be parsed into columns. I feel as though it might be complicating the solution.

Regards.

1 Like