Is premature error handling really a problem?


#1

Hi all,

This post is more about methodology and how to approach a serious RPA automation project.
@badita recently shared an interesting forum post discussing the problems with premature optimization in software development projects. RPA development is similar in nature, but we don’t really have that much optimization we can do, and what there is usually means optimizing the business process itself. Still, there is a parallel I would like to draw.

I think a similar problem we are facing in RPA development is premature error handling. I have come across a lot of cases where micro error handling (for example, around a single click) caused more problems than it actually fixed. I think people are currently focusing a bit too much on making 100% error-proof workflows, and in the end they just waste development time or cause more problems.

Don’t get me wrong, error handling is important, but are we going overboard with this?

Really would like to hear what you think.


#2

Generally, I don’t think you need to handle errors any lower than Workflow (or Component) level. As you say, you can end up with too much error handling, which makes workflows confusing and can actually cause more problems than it solves.

What I would add is that on occasion you may wish to use a Try Catch for one specific activity if for some reason it only works intermittently. However, the Retry Scope activity may resolve a lot of these issues and would be preferable to a Try Catch.

The system exception messages are usually pretty good, and within the limits of a single workflow the problem should be easy to resolve. However, it’s important that workflows are not too big, otherwise the messages become less helpful.


#3

Agreed. It all comes down to properly analysing the design - as long as the actions taken in response to a failure are the same, it can be handled by one TryCatch. There is not much point in catching an exception if all you do is rethrow it higher without adding anything to it or handling it.
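As a rough analogy in Python (UiPath workflows are not written in Python, and `StepError`, `run_step` and the two form functions are hypothetical names made up for this sketch), one handler can cover a whole group of actions that share the same reaction, and it earns its place by adding context before rethrowing:

```python
class StepError(Exception):
    """Hypothetical exception that records which process step failed."""

    def __init__(self, step, cause):
        super().__init__(f"step '{step}' failed: {cause}")
        self.step = step


def run_step(step, actions):
    """Run every action of one step under a single handler."""
    try:
        return [action() for action in actions]
    except Exception as exc:
        # Rethrow with the step name attached; a bare `raise` would add nothing.
        raise StepError(step, exc) from exc


def open_form():
    return "form opened"


def fill_form():
    raise ValueError("field 'amount' not found")


try:
    run_step("submit invoice", [open_form, fill_form])
except StepError as err:
    message = str(err)
    print(message)  # step 'submit invoice' failed: field 'amount' not found
```

The original `ValueError` stays attached as the cause, so the root cause is not hidden by the wrapper.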

It might be that the general perception of exceptions and errors is still that they’re bad and letting them bubble up is even worse (hint: it’s not, at least not in and of itself). There are even situations where you WANT the robot to crash/fail gracefully (think network outage - I still sometimes get a confused look after saying “if this happens, robot will do this, this and then crash”).
I’ve also seen situations where there was so much try catching that the root cause of the issue was a pain to find. It’s even more important with workflows, as heavy nesting of activities makes them really, really hard to read (and with TC, you can only see one of Try, Catch or Finally at a time, which makes it even worse).

I’d say that TryCatch is “by default” recommended on:

  • Process step boundaries
  • Activities that are known to throw errors even though nothing is wrong (think OCR -> “Scrape returned empty text”)
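
The second bullet could look roughly like this in Python. This is only a sketch: `EmptyTextError` and `scrape` are hypothetical stand-ins for an OCR activity, not a real library API. The point is that only the known "nothing is wrong" error is caught, and everything else still bubbles up:

```python
class EmptyTextError(Exception):
    """Hypothetical error raised by a scraper when no text is found."""


def scrape(region):
    """Stand-in for an OCR call; raises on blank regions."""
    text = region.strip()
    if not text:
        raise EmptyTextError("Scrape returned empty text")
    return text


def scrape_or_empty(region):
    """Treat 'no text' as a normal result; any other error propagates."""
    try:
        return scrape(region)
    except EmptyTextError:
        return ""


print(repr(scrape_or_empty("   ")))          # ''
print(repr(scrape_or_empty(" Total: 42 ")))  # 'Total: 42'
```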

Everything else is case-by-case, usually because there is an identified (and that’s the key) need to handle this particular activity failing and do something about it at this particular point in the process. And that’s actually not as often as one might think.

It might also be that the robots are still developed in a waterfall model, where you try to anticipate everything that could go wrong during processing and design for that. I’m finding iteration-based design much more efficient, especially since it usually doesn’t take long to build each part and run it through for testing.

It would be even better if there were a Compensate part in the Retry (that is, actions that revert the state of the application so that another try is possible).
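
To make the idea concrete, here is a minimal Python sketch of what such a retry could do. It is not an existing UiPath feature; `retry_with_compensate` and the form/dialog functions are hypothetical names for illustration only:

```python
def retry_with_compensate(action, compensate, attempts=3):
    """Call action up to `attempts` times.

    After each failure that will be followed by another try,
    compensate() is called to revert the application state.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                compensate()  # clean up before the next try
    raise last_error


# Simulated application state: a leftover dialog blocks the form.
state = {"stale_dialog": True, "compensations": 0}


def open_form():
    if state["stale_dialog"]:
        raise RuntimeError("a leftover dialog blocks the form")
    return "form opened"


def close_stale_dialog():
    state["stale_dialog"] = False
    state["compensations"] += 1


result = retry_with_compensate(open_form, close_stale_dialog)
print(result, state["compensations"])  # form opened 1
```

Without the compensate step, every retry in this simulation would hit the same leftover dialog and fail the same way.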

Sidenote:
A thing that gets much less of the spotlight is that with each retry, the dev is extending the time one transaction can take. That’s one thing that still makes me use standard looping sometimes instead of Retry - I want to know clearly which try works.
A real-life example: a form often fails to open in the target app, so we added retrying (3 tries). Analysing the runs afterwards showed that the last retry never worked (literally never), but was still consuming time. With Retry, we could not have made this decision (or at least the evidence would have been much harder to find), and we would still be wasting time on something that doesn’t give any benefit.
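
A minimal Python sketch of that kind of plain loop, recording which attempt succeeded. The names `run_with_attempt_log` and `make_flaky` are hypothetical, and the failure pattern is simulated, not real data from the project:

```python
from collections import Counter


def run_with_attempt_log(action, attempts, stats):
    """Plain loop that counts which attempt number succeeded (0 = none did)."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            result = action()
        except Exception as exc:
            last_error = exc
            continue
        stats[attempt] += 1
        return result
    stats[0] += 1
    raise last_error


def make_flaky(failures_before_success):
    """Build a simulated action that fails a fixed number of times first."""
    remaining = {"n": failures_before_success}

    def action():
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise RuntimeError("form did not open")
        return "form opened"

    return action


stats = Counter()
for failures in [0, 1, 0, 1, 0]:
    run_with_attempt_log(make_flaky(failures), attempts=3, stats=stats)

print(dict(stats))  # {1: 3, 2: 2}
```

In this simulated run the third attempt never shows up in the counts, which is the kind of evidence that would justify dropping it.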