Issue investigation in Orchestrator

TL;DR: for issue investigation, the following Orchestrator improvements would be awesome:

  1. Redesigned jobs and logs overview screens with less clutter and more meaningful information
  2. Filter jobs and logs on a meaningful time window (e.g. yesterday between 10:00 and 10:30)
  3. Job search to also consider the relevant exception and log information and for example summarize it in the results
  4. Show the actual times on the jobs screen instead of “a day ago”
  5. Remember how many results the user wants to see per screen type (e.g. 25 jobs and 100 logs)
  6. Have a “job” screen that shows additional information about the job like: logs, exception messages, related queue items, related assets
  7. Link back from a log item found via the logs screen to the originating job

When investigating issues with bot processes, Orchestrator is a vital part as it contains a lot of information that lets you figure out:

  1. What the actual problem was: i.e. the exception in both the jobs and queues
  2. What happened up until that moment: the job logs and queue item status
  3. The context in which it happened: queue item data, assets

However, getting to these pieces of information is very inefficient, making issue investigation a lot more cumbersome than it needs to be.

When an issue occurs we often get an email, either from the business or from the robot on the back of an exception. This contains information like:

  1. The process that failed
  2. The time it occurred
  3. The exception message
  4. A screenshot of the current state
  5. Some unique identifier of the item it was processing

So the first port of call would be to go to the jobs overview and find the issue. But when there are multiple issues, getting to the right one means hovering over the individual jobs’ start/end times to find the one closest to when the issue occurred, or clicking the info button of each job and scrolling up to see whether it matches your exception message. It would be a lot easier if you could see the actual time (instead of “2 hours ago”) and the exception message, as well as being able to search on information in the exception message. What would be really neat is if the search would also go through the related logs and show jobs that match. The unique identifier and the (unique) screenshot filename are, in our case at least, in the logs, so perfect to search for.
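Until such search exists, one workaround is to query the Orchestrator OData API directly for faulted jobs in the reported time window instead of hovering over each job. A minimal sketch in Python, assuming the `/odata/Jobs` endpoint with `StartTime` and `State` fields (verify the field names against your Orchestrator version, and add your tenant authentication before calling it):

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

def faulted_jobs_filter(start: datetime, end: datetime) -> str:
    """Build the OData $filter for jobs that started in [start, end)
    and ended in a Faulted state. Field names (StartTime, State) are
    assumptions based on the public OData conventions."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (f"StartTime ge {start.strftime(fmt)} "
            f"and StartTime lt {end.strftime(fmt)} "
            f"and State eq 'Faulted'")

# “Between 10:00 and 10:30” as an absolute UTC window:
flt = faulted_jobs_filter(
    datetime(2020, 5, 4, 10, 0, tzinfo=timezone.utc),
    datetime(2020, 5, 4, 10, 30, tzinfo=timezone.utc),
)
# The hostname is a placeholder; newest faulted job first.
url = "https://orchestrator.example.com/odata/Jobs?" + urlencode(
    {"$filter": flt, "$orderby": "StartTime desc"}
)
```

This only narrows the candidate jobs by time and state; matching the exception message still means reading each result, which is exactly what built-in search over exception and log text would fix.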

You could of course also use the “logs” screen to search for those directly, but in that case you lose the relation to the job, so you get far too much totally unrelated log information. Assuming you did find the relevant job and are looking through the related logs, the default of 10 log lines is useless as you’re looking through all the steps and not just the last couple (and yes, you can change it to 50, but even if that’s enough you need to do it every time). Searching doesn’t help either, because that gives you a single line and again you miss what happened around it.
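What log search should really give you is grep-style context: the matching line plus the lines around it. A small sketch of that behaviour over an exported list of log lines (plain Python, no Orchestrator API assumed) to illustrate what the Logs screen could show:

```python
def search_with_context(lines, needle, context=3):
    """Return every line containing `needle`, plus `context` lines
    before and after each hit, preserving the original order and
    de-duplicating overlapping windows."""
    hits = [i for i, line in enumerate(lines) if needle in line]
    keep = set()
    for i in hits:
        keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))
    return [lines[i] for i in sorted(keep)]
```

For example, searching exported job logs for the screenshot filename would then return the failing step together with the steps just before it, instead of one isolated line.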

Additionally, figuring out which queue items and/or assets were used for a particular job is highly dependent on the job and often requires digging through the logs. So finding the “data” that led up to the issue is also not as easy as it could be.
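The unique identifier from the business email can at least be turned into a direct queue item lookup via the OData API. A hedged sketch, assuming a `/odata/QueueItems` endpoint with a `Reference` field holding that identifier (both names should be verified on your Orchestrator version):

```python
def queue_item_filter(reference: str) -> str:
    """OData $filter that matches queue items by their unique reference.
    Single quotes are doubled per OData string-literal escaping."""
    escaped = reference.replace("'", "''")
    return f"Reference eq '{escaped}'"

# Would be appended to .../odata/QueueItems?$filter=...
flt = queue_item_filter("INV-0042")
```

This covers queue item data; which assets a job read is still only discoverable from the logs, which is why a job screen that lists related queue items and assets would help so much.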

So while these screens provide useful information, it would be nice to have some further improvements that consider how you would use them (at least from an investigation point of view) rather than what data could be displayed.


Thank you @mschuurman for the detailed feedback! We deeply appreciate the diligence and effort you put in to help us improve the product! The product and design teams will take it in and work on addressing it. FYI @iamwiliamb