Queues usage

Hi,

I have to implement a queue… I have the following questions -

Note that the bot is expected to run on 3 VMs to consume and process the queue items…

  1. My input is an Excel sheet, and I need to upload its contents to a queue

    • Is it required to create a separate bot to upload the queue items? (I don’t want to upload the file data manually through Orchestrator.)
  2. Now, assuming the queue is ready and processed on 3 VMs, how can the bots update the status in the input Excel file, which has a “Status” column?

  • If the file is kept in a shared location and two VM bots finish at the same time, how can the status be updated safely?

In summary, my requirement is:

  1. Upload the Excel data to a queue (the data doesn’t have any unique reference)
  2. Process the queue in multiple machines
  3. Update the status of each record back in the Excel file

How should I proceed? I have to use multiple VMs because the Excel file contains a huge amount of data.

Kindly share your thoughts. @Florent_Salendres

Regards,
Theertha K S

Well,

You can create two bots, one to upload the queue items and one to process them (or combine them into a single workflow, whichever suits your project needs).

For multiple machines, the machines and robots must first be provisioned in Orchestrator, and the deployed bots can then be scheduled.

You can update the status of each queue item and then download the data after all the processing is carried out.

Regards :slight_smile:

I need one merged output from bots running on multiple machines. Is that possible?

Hi.

This is a loaded question, but I will try to make some useful comments from my own experience. Also, don’t take any of these suggestions as the “perfect” method, because I believe the Queue feature is still somewhat new and open for evolution over time.

Setting up project
Ultimately, you want a job to run on only a certain number of Robots; otherwise, you would need to create an Environment to keep some Robots available for other processes that need to run.

— So you can have a single job start the other Robots using the Start Job activity. To do so, you will need some parameters first, such as the Process Name and Environment. I suggest that, as part of Initialization, you get the Process Name from the project.json, but also provide the flexibility to set it in a config file (it should probably go into the config dictionary). For the Environment, I believe it is better to use an Argument and set it in Orchestrator on the Process when you provision it, rather than changing the config file each time the Environment changes (which causes complications).

Note: in_ProcessEnvironment and in_NumberOfRobots are arguments set on the Process within Orchestrator; by default in Studio, in_NumberOfRobots is 1, so the fan-out only applies when running from Orchestrator.

The above image shows a condition where the robot count determines whether another job is started: each new job keeps starting another job until the target number of robots is reached. If the count exceeds the number of robots, the process ends.

This would be placed after the items are dispatched toward the end of the Initialization phase.
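The fan-out logic above can be sketched as follows. This is a rough Python illustration of the Start Job condition, not the actual workflow; the function names are mine, not UiPath activities:

```python
def should_start_another_job(jobs_running: int, number_of_robots: int) -> bool:
    """Mirror of the in_NumberOfRobots condition: start another job
    only while the target robot count has not been reached."""
    return jobs_running < number_of_robots

def fan_out(number_of_robots: int) -> int:
    """Simulate the chain of jobs: the first (manually triggered) job
    starts a second one, which starts a third, until the target is hit."""
    jobs_running = 1  # the first, manually triggered job
    while should_start_another_job(jobs_running, number_of_robots):
        jobs_running += 1  # stand-in for the Start Job activity
    return jobs_running
```

In the actual workflow, the equivalent condition wraps a Start Job activity near the end of Initialization.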

Dispatching the queue:
I am a firm believer that this should not be a separate job, and should instead be included as part of your project. The Dispatcher can reside in the Initialization phase after the settings are set. It should also be invoked during the part where you initialize your data set / TransactionData that will be processed (i.e., a Read Range or creating an Array).
Notice I placed this before the multiple Robots are started, so the items will still be dispatched, even if it ends the process.


This shows that I invoke a Dispatcher at the end of setting my data set. I also use a Boolean to provide more flexibility, so if a developer chooses not to use the Queue, that option is available.

The dispatcher is the more challenging part, because your Transaction Item could be one of various data types, and the ItemInformation for each data type could be different.


For these reasons, I am using a Switch so the ItemInformation can be added based on the data type.
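As a sketch of that Switch (a Python stand-in; the field names here are hypothetical, not taken from the original workflow):

```python
from typing import Any

def item_information(transaction: Any) -> dict:
    """Choose the queue item's ItemInformation fields based on the
    transaction's data type, mirroring the Switch described above."""
    if isinstance(transaction, dict):     # a DataRow-like record
        return {key: str(value) for key, value in transaction.items()}
    if isinstance(transaction, str):      # e.g. a file path
        return {"FilePath": transaction}
    return {"Value": str(transaction)}    # generic fallback
```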

At the beginning of this dispatcher, I also get a value to prioritize the items. That value increments per item (for example, from 0 to 9) and lets me space out the items so that certain things, like files, can be updated one Robot at a time. More on that later in this post.

Essentially, you only want the item dispatched if it doesn’t already exist.


This shows how I first set the desired reference value. In my case, I am using an array of column names, because this is for a DataRow item. Then, if the reference does not already exist, add the queue item. This depends on whether you need unique items, which I recommend so you don’t clutter the queue with many duplicated items.
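A minimal sketch of that dedupe check, assuming the reference is built from an array of key columns (the names and data shapes are illustrative):

```python
def make_reference(row: dict, key_columns: list) -> str:
    """Build the queue reference by joining the chosen key columns."""
    return "|".join(str(row[column]) for column in key_columns)

def dispatch(rows, key_columns, existing_references, queue):
    """Add a queue item only if its reference is not already present
    (stand-ins for Get Queue Items and Add Queue Item)."""
    for row in rows:
        reference = make_reference(row, key_columns)
        if reference not in existing_references:
            queue.append({"Reference": reference, "ItemInformation": row})
            existing_references.add(reference)
```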

Choosing a good Reference value:
Choosing a specific reference value for your item is important. For example, if you use only an invoice number, there could be multiple invoices with the same number that are actually different transactions. So, perhaps an invoice date and customer ID should be added to the reference.

And, what if in the future, you want to process the items again for whatever reason? Then, you would want a current date or month value on the reference, depending on your requirements.

In other cases, specifically for multiple Adhoc users, you may want to include the username of the person who last saved an input file, so the transactions will be processed per user even if they have the same reference.

So, that was just something else I added to think about.
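To make that concrete, here is a rough sketch of composing a richer reference; every field name here is hypothetical:

```python
def build_reference(invoice_no, customer_id, invoice_date,
                    run_month=None, username=None):
    """Join several fields so that invoices sharing a number remain
    distinct; optionally include the run month (to allow reprocessing
    later) and the submitting user (for ad-hoc inputs)."""
    parts = [invoice_no, customer_id, invoice_date]
    if run_month:
        parts.append(run_month)
    if username:
        parts.append(username)
    return "|".join(parts)
```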

Outputting results from multiple robots:
Here lies the most difficult part, and one I have not completely solved just yet, but am close.

There have been various ideas presented throughout the forums.
For example,

— You could store an individual file per Robot, so they don’t fight over each other, then merge the files on the last transaction somehow. However, this method is not ideal, mostly because a user can’t see the results in real time, and if the process fails, they won’t have any results.

— Or, you could try what I’m trying to do: use a Retry Scope to check whether the file is accessible, then change it to Read-Only (this takes about 0.2–0.3 seconds, so it is fast). Evidently, however, it is not fast enough, because Robots hit this part at the same time and get through. This is why I have the ItemPriority value mentioned earlier: right after this check, I place a Delay using that value to space the Robots out. Lastly, before the file is updated, you will want to run that Retry Scope a second time to catch the newly spaced-out Robots and put them on hold until the file becomes accessible.

Here is that snippet:


Then, like I said, there is a second Retry Scope before the file is updated; the only difference is that Read-Only should be set to False at the start, to prevent it from getting stuck if Read-Only is already on.
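In plain code, the retry-until-accessible idea looks roughly like this (a Python sketch; note that a plain open() only reflects another process’s sharing lock on Windows, so a portable version would need an explicit lock file):

```python
import time

def wait_for_file(path, item_priority, retries=10, interval=0.5):
    """Stagger robots by their item priority, then retry until the
    shared file can be opened for writing (a stand-in for the
    Retry Scope + Read-Only check described above)."""
    time.sleep(item_priority * interval)  # space the robots out
    for _ in range(retries):
        try:
            with open(path, "r+b"):      # fails while the file is locked
                return True
        except (PermissionError, FileNotFoundError):
            time.sleep(interval)
    return False
```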

However, as mentioned, this is still not working reliably under stress testing, with the file occasionally (but not often) being inaccessible. Note: I also have it save to a copy of the file if the original is in use, so no errors occur.

So the trick is getting this part working properly. Maybe I’m missing something.


Anyway, those are my insights, and I most likely left some parts out. I can’t provide any .xaml files, since they are internal to my company at this time.

You can also check out my topic posted here for some additional thoughts:

Regards.

Clayton


I wanted to add that the first part, about starting the job from the project until the number of robots is reached, was intended as a way to manage ad-hoc/manual triggers. You don’t necessarily need it for scheduled jobs, but it also spaces out the job starts so they are less likely to fight over each other.

This is something I’ve thought about but not actually done anything with yet, as we only have one production robot on our enterprise license. However…

Could you use Set Transaction Status to update the output value of each queue item your process handles, and then have a Start Job in the “End Process” part of the ReFramework start a workflow that loops through the queue items, reads the output (or exception) values, and adds them to your spreadsheet?
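That approach could look roughly like this in code (a sketch, assuming the queue item output carries a Status value and the rows are keyed the same way as the queue references):

```python
def merge_statuses(rows, key_columns, queue_items):
    """Map each finished queue item back to its source row by
    reference and fill in the Status column (a stand-in for reading
    queue item output/exception values in a post-processing job)."""
    status_by_reference = {
        item["Reference"]: item.get("Status", "Failed")
        for item in queue_items
    }
    for row in rows:
        reference = "|".join(str(row[column]) for column in key_columns)
        row["Status"] = status_by_reference.get(reference, "Not processed")
    return rows
```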