— Topic: Processing Transaction Items as DataRow types using Queues —
(I don’t believe this has been talked about enough.)
What is the most effective way to dispatch, retrieve, execute, and output results for process items, when each item belongs to a source of DataTable type with various columns?
Prepare yourself for a longer-than-ideal read…
- Process items as DataRows, and keep them as DataRows throughout
- Support DataRows with many columns
- Reuse the code dynamically across multiple projects without changing it
The “easy-mode” method would have been to pass the DataRow itself as ItemInformation when using Add Queue Item. This would have kept the item associated with its source DataTable (or Array of DataRows) object, which could also have been included in the ItemInformation so the DataRow could be processed alongside its source. (Otherwise, if a process gets a queue item from a different source, the DataRow would not know which data it originally belonged to.)
I say “would have been” and “could have been” because this “easy-mode” method is currently not possible: ItemInformation only accepts JSON value types, such as Strings, Numbers, Dates, and Arrays.
Basically, it returns an error saying that the JSON object type could not be determined for a DataRow. Here is an example of the error message:
Add Queue Item - New - DataRow: Could not determine JSON object type for type System.Data.DataRow.
Working toward a solution (or is it a workaround?):
Therefore, we need to convert the value to a JSON object, which can be done under the Newtonsoft.Json namespace using JsonConvert.SerializeObject() (fully qualified: Newtonsoft.Json.JsonConvert.SerializeObject()). To convert it back, use JsonConvert.DeserializeObject(). See the post below by @megharajky for an example .xaml and further discussion:
How to add a System.Data.DataRow variable into a work queue?
My own method was to take the transaction item within a For Each loop (as part of the dispatcher) and use .NET to convert it to a new DataTable. Here is my sample code:
Newtonsoft.Json.JsonConvert.SerializeObject(in_TransactionData.Where(Function(r) r Is item).Select(Function(r) CType(r, System.Data.DataRow)).CopyToDataTable)
This essentially places the data in JSON format into the ItemInformation, so it keeps the DataTable structure as a JSON array of row objects.
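To illustrate what the serialized ItemInformation ends up containing, here is a minimal language-agnostic sketch in Python (the real flow uses Newtonsoft.Json in VB.NET; the column names below are invented for the example):

```python
import json

# Hypothetical DataRow represented as a dict; column names are made up.
row = {"Name": "ACME Corp", "Amount": 1250.75, "Status": ""}

# Serializing a one-row "table" mirrors CopyToDataTable + SerializeObject:
# the result is a JSON array containing one object per row.
item_information = json.dumps([row])
# item_information == '[{"Name": "ACME Corp", "Amount": 1250.75, "Status": ""}]'

# Deserializing (like JsonConvert.DeserializeObject) restores the
# array-of-rows structure so the item can be processed as a row again.
restored = json.loads(item_information)
```

The key point is that the queue stores a plain JSON string, and the array wrapper preserves the table shape so a single row round-trips cleanly.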
However, there are some problems to consider…
- How do we associate this data to its original source, if the source needs to be updated?
- Maybe we can add the filepath to the source and read it again for each item executed?
- Can we append the results of the execution to a new file and forget the original source?
- When and how do we deserialize the object, and only when the item actually is a serialized DataTable?
These things make this more challenging, because there are times when an associate will require that the source data be updated rather than written to a new file. Doing so means you need to match the row item to the corresponding row in the source data and, if no match is found, append it to the data.
Here is an idea for some .NET to match up and find the row item in the source data (joining with a delimiter such as "|" avoids false matches where adjacent column values happen to concatenate to the same string):
srcMatched = srcData.AsEnumerable.Where(Function(srcR) String.Join("|", srcR.ItemArray) = String.Join("|", in_TransactionItem.ItemArray)).ToArray
And, to update the status, you can do something like this:
If srcMatched.Count > 0 Then Assign srcMatched(0)("Status") = "["+Now.ToString+"] Complete" Else Invoke Method: ImportRow
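The match-then-update-or-append logic above can be sketched in Python (illustrative only: the column names and the "|" delimiter are assumptions, and the actual implementation uses the VB.NET expressions shown):

```python
# Source data and a processed transaction item, modeled as dicts.
src_data = [
    {"Name": "ACME Corp", "Amount": "1250.75", "Status": ""},
    {"Name": "Globex",    "Amount": "990.00",  "Status": ""},
]
item = {"Name": "Globex", "Amount": "990.00", "Status": ""}

def row_key(row):
    # Join values with a delimiter; joining with "" risks false matches
    # (e.g. ["ab", "c"] vs ["a", "bc"]), so "|" is used here.
    return "|".join(str(v) for v in row.values())

matched = [r for r in src_data if row_key(r) == row_key(item)]

if matched:
    matched[0]["Status"] = "Complete"  # update in place (like Assign)
else:
    src_data.append(dict(item))        # append (like Invoke Method: ImportRow)
```

Because `matched[0]` references the source row directly, the status update lands in the original data, which is what the "update the source rather than a new file" requirement calls for.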
So, if this is the solution, it can be done with some additional coding. And, I feel this is important to do in many cases, because it helps the user determine which items were executed and which items were not.
If you choose to create a new results file, you won’t necessarily have all the source data, so the user won’t be able to determine if everything was executed without looking at the source data too.
I suppose it depends on the requirements of the process.
Let’s get back to how we should effectively manage queue items, specifically for DataRows…
You will be golden if you can dispatch the transaction items as part of the same Framework and Job run. So, let’s assume you have this part working.
—The steps required for this (to my knowledge):
- Initialize complete data to be added to the queue
- For DataRow items, serialize the DataRow/DataTable object as you add the queue item
- Get Transaction Item to execute
- After, if needed, deserialize the object back to a DataRow
- Execute the Process for the item as normal
- Output updated results to existing source data or a separate results file
- (optional) Update Output for the item in the queue
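Put together, the dispatch/get/execute/output loop in the steps above might look like this sketch (the queue, the stubbed "process", and the data are stand-ins for the Orchestrator queue activities and your own workflow):

```python
import json
from collections import deque

# Stand-in for the Orchestrator queue; each entry is ItemInformation (JSON).
queue = deque()

# --- Dispatcher: serialize each DataRow-like item as it is added ---
source_rows = [{"Name": "ACME Corp", "Status": ""},
               {"Name": "Globex",    "Status": ""}]
for row in source_rows:
    queue.append(json.dumps([row]))      # Add Queue Item with serialized row

# --- Performer: get, deserialize, execute, output ---
results = []
while queue:
    item_information = queue.popleft()       # Get Transaction Item
    row = json.loads(item_information)[0]    # deserialize back to a row
    row["Status"] = "Complete"               # execute the process (stubbed)
    results.append(row)                      # output to source/results file
```

This keeps the serialization concern at the queue boundary only: everything between "deserialize" and "output" still works with a row, as the goals at the top of this post require.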
Remember that you should be able to use various dataset types with your projects without manipulating the code each time; this means less training and documentation and easier project deployment. If your project processes an array of DateTimes or Strings, it should execute the same way as a project processing an array of DataRows, without code changes. I find an Array of Object type useful for this: check the type of each item before adding it to the queue so the correct ItemInformation can be used.
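The "check the type before adding to the queue" idea could look like the following (a Python stand-in: the type names map onto the .NET types only loosely, and a dict plays the role of a DataRow):

```python
import json
from datetime import datetime

def to_item_information(item):
    # Strings and numbers are already valid JSON value types for the queue.
    if isinstance(item, (str, int, float)):
        return item
    # Dates serialize to a string representation.
    if isinstance(item, datetime):
        return item.isoformat()
    # A dict stands in for a DataRow here: serialize it before queueing.
    if isinstance(item, dict):
        return json.dumps([item])
    raise TypeError(f"Unsupported item type: {type(item).__name__}")

items = ["plain string", datetime(2020, 1, 1), {"Name": "ACME Corp"}]
queue_payloads = [to_item_information(i) for i in items]
```

With one dispatch path per type, the same dispatcher handles mixed sources, and only DataRow-like items pay the serialization cost.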
Posting this has helped with brainstorming some ideas, and hopefully, it will help others expand on what they are trying to do with Queues as well.
In the future, it would benefit RPA implementation if DataTable processing was more effectively integrated with Queues.
Please feel free to share other ideas that I or others have not considered. I originally posted this topic to find out whether anyone has comments or reactions on how DataRows are being executed through Queues.