I was asked this question in an interview, and I want to know the best possible answers.

Say you have Process A and Process B. Process A processes a varying number of records per day: it may be 500, 1000, or 2000, so there is no fixed daily volume. For each record it generates an output, and that output is written back to the same Excel file the records were read from, in a column called Output.

Process B then runs the second leg of the process on all these records, and it has a cutoff time by which it must finish. Its only input is the Output column written by Process A; it reads no other columns.

These are the run requirements, and the solution is now in production. After two or three months of production monitoring, the user reports that Process B is missing some records: not all of them are processed within the cutoff time. You are asked to optimize the processes so that Process B handles every record before the cutoff.

So the question is: what would you look for in the existing code, and what would need to change so that Process B is optimized?
How can we do this?

Hey @gopu_rpa

Let’s break down the requirements:

Process A:

  1. Processes a varying volume of records per day, possibly thousands. Is it using a queue? If not, add the input to Queue_ProcessA so that each transaction can be tracked and monitored.
  2. Generates the output and stores it in the Excel file. Also add each output to a queue (why not Queue_ProcessB?) so that processing is closer to real time and Process B can be started by a queue-based trigger.

Process B:

Queue-based trigger for Process B: once an item is loaded into Queue_ProcessB, Process B starts and completes it in near real time, with no need to wait for Process A to finish the whole batch.

So both processes are decoupled and can meet the cutoff time (the SLA for each transaction).

We have also implemented this using a shared drive and queues: place the output file on the shared drive, read the file, and add only the new items to the queue by enforcing uniqueness on the queue.
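To illustrate the "add only the new items" idea, here is a minimal Python sketch. It is not Orchestrator code: `UniqueQueue`, `load_new_rows`, and the `RecordId` column are hypothetical names used only for illustration, and the in-memory queue is a toy stand-in for a real queue with uniqueness enforced.

```python
from collections import deque


class UniqueQueue:
    """Toy stand-in for a work queue that rejects duplicate references."""

    def __init__(self):
        self._items = deque()
        self._seen = set()

    def add(self, reference, payload):
        """Add an item; return False if the reference was already queued."""
        if reference in self._seen:
            return False
        self._seen.add(reference)
        self._items.append((reference, payload))
        return True

    def take(self):
        """Return the oldest queued item, or None if the queue is empty."""
        return self._items.popleft() if self._items else None


def load_new_rows(rows, queue):
    """Queue every row that has an Output value and was not queued before.

    'RecordId' is an assumed unique key per record (hypothetical column).
    Returns the number of newly added items.
    """
    added = 0
    for row in rows:
        if row.get("Output") and queue.add(row["RecordId"], row["Output"]):
            added += 1
    return added


# The same file is read twice while Process A is still filling it in.
queue = UniqueQueue()
first_read = [{"RecordId": "R1", "Output": "ok"}, {"RecordId": "R2", "Output": ""}]
second_read = [{"RecordId": "R1", "Output": "ok"}, {"RecordId": "R2", "Output": "done"}]
print(load_new_rows(first_read, queue))
print(load_new_rows(second_read, queue))
```

Because duplicates are rejected by reference, the file can be re-read as often as needed without Process B ever seeing the same record twice.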

Let me know if it helps or if you have any questions.

Hi @gopu_rpa

You can use Excel File and Read Range to read only the Output column and avoid full-sheet operations, move all heavy logic outside loops, avoid repeatedly opening and closing Excel, and make sure Process A closes the workbook cleanly without leaving locks.
Review Process B for unnecessary lookups or delays, and apply Filter Data Table before looping to reduce the record volume. Remove excess formatting to reduce Excel load time, and when volumes grow, move the data to Data Service or a database for stable throughput. You can also use LINQ for fast in-memory filtering and transformation, and SQL queries if the data is stored in a database, to get the best performance and meet the cutoff window.
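As a rough illustration of "filter before looping" and "move heavy logic outside loops", here is a small Python sketch. It is not UiPath or LINQ code; `process_outputs`, `reference_table`, and the `code`/`label` fields are hypothetical names, and the rows are assumed to have already been read from the sheet.

```python
def process_outputs(rows, reference_table):
    """Resolve each non-empty Output value against a reference table.

    rows: list of dicts as read from the sheet (only 'Output' is used).
    reference_table: list of dicts with assumed 'code' and 'label' keys.
    """
    # Build the lookup once, outside the loop, instead of searching
    # the reference table again for every record.
    lookup = {entry["code"]: entry["label"] for entry in reference_table}

    # Keep only the Output column and drop blank rows before looping,
    # so the loop touches as few records as possible.
    outputs = [row["Output"] for row in rows if row.get("Output")]

    results = []
    for code in outputs:
        results.append((code, lookup.get(code, "UNKNOWN")))
    return results


rows = [
    {"Id": 1, "Output": "A1"},
    {"Id": 2, "Output": ""},      # Process A has not written this one yet
    {"Id": 3, "Output": "B2"},
]
reference_table = [
    {"code": "A1", "label": "Approve"},
    {"code": "B2", "label": "Reject"},
]
print(process_outputs(rows, reference_table))
```

The dictionary lookup is constant time per record, whereas scanning the reference table inside the loop grows with the size of that table for every record.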

Happy Automation

@gopu_rpa

One big bottleneck I see is that Process A completes all records before Process B starts.

Instead, change the model a little: add a queue trigger to Process B, and since only the output data is needed, add that data to the queue at the end of each Process A transaction. Process B then starts processing while Process A is still working: while Process A handles item 2, Process B processes item 1, and so on. This overlap reduces the total time.

For this you need at least two licenses.
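To get a feel for how much the overlap saves, here is a rough Python estimate. It assumes one robot for A and one for B, a fixed time per record for each, and no queue or trigger overhead; the function names and the sample timings are made up for illustration.

```python
def sequential_seconds(records, seconds_a, seconds_b):
    """B starts only after A has finished every record."""
    return records * seconds_a + records * seconds_b


def pipelined_seconds(records, seconds_a, seconds_b):
    """B picks up each record as soon as A has queued it.

    The first record must pass through A, the last one through B,
    and in between the slower of the two steps sets the pace.
    """
    if records == 0:
        return 0
    return seconds_a + (records - 1) * max(seconds_a, seconds_b) + seconds_b


# Assumed sample numbers: 1000 records, 30 s per record in A, 18 s in B.
print(sequential_seconds(1000, 30, 18))
print(pipelined_seconds(1000, 30, 18))
```

With these assumed timings the end-to-end time drops from 48000 to 30018 seconds, because B finishes shortly after A instead of starting from scratch once A is done.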

Other optimizations would be to check for static delays, any extra logic that could run in parallel, etc.

cheers

Ultimately, this is not a single-step process but one process in two automated main steps, A and B. From a result point of view it is still A + B = satisfied customer.

Ultimately it is about capacity. The total capacity time needed is (time for A + time for B) per record, multiplied by the number of records.

If part B of the process does not fit the window it is supposed to be executed in, there are a few possible causes:

  • A takes too long, so B starts too late to finish
  • the total volume for B cannot be executed within the time frame at all.

If you can start A sooner (I did not read a time constraint for it here), you give B more time, which helps in the first scenario.
If the second scenario applies, there is only one (global) solution: increase the capacity. Either shorten the robot's run time (can it be optimized to run smoother or faster?), or add platform capacity, so that in effect B is run by multiple robots. The latter is obviously more expensive because of license consumption.

If none of the above is possible for you, what remains is to manage expectations. You can calculate (by looking at time (A + B) * number of records) how many records you can safely complete within the time window, and communicate that as the upper limit of your solution. Want more, pay more.
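The capacity calculation can be sketched in a few lines of Python. This is a back-of-the-envelope model only: it assumes a fixed time per record and robots that share the work evenly, and the function names and sample figures are hypothetical.

```python
import math


def max_records(window_seconds, seconds_a, seconds_b, robots=1):
    """Upper limit of records when each robot runs A then B per record."""
    return robots * (window_seconds // (seconds_a + seconds_b))


def robots_needed(records, window_seconds, seconds_per_record):
    """Robots required to finish one step for all records in the window."""
    per_robot = window_seconds // seconds_per_record
    if per_robot == 0:
        raise ValueError("a single record does not fit in the window")
    return math.ceil(records / per_robot)


# Assumed sample numbers: 8-hour window, 30 s per record in A, 18 s in B.
window = 8 * 60 * 60
print(max_records(window, 30, 18))      # records one robot can complete
print(robots_needed(2000, window, 18))  # robots for B alone at 2000 records
```

The first number is the upper limit you can communicate to the user; the second shows how the limit moves once B is spread over more robots.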