Troubleshooting AS 23.4 Cluster After Ungraceful Shutdown


Scenario: An Automation Suite 23.4 cluster had an ungraceful shutdown.

Symptoms:

  1. Pods and applications report Helm pull errors stating that images in the registry are not found.

Excerpt below:

“rpc error: code = Unknown desc = ‘helm pull oci:///registry……/helm/alerts/version 2023.4.2 --destination /tmp/…’ failed exit status 1: Error: registry…… not found”

See attached sample screenshots:

The registry hostname is still resolvable at this point, which suggests the registry content is missing rather than a name-resolution problem.
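The two observations above (the hostname resolves, yet the pull reports "not found") can be checked with a pair of small shell helpers. This is a minimal sketch under assumptions: getent is available on the node, the registry hostname is supplied by the caller (it is elided in the excerpt and is not filled in here), and the error text has been saved to a file or is piped in from wherever it was observed. The function names are illustrative and not part of Automation Suite.

```shell
# Illustrative helpers; the function names are hypothetical.

# Return 0 if the given hostname resolves through the system resolver.
registry_resolves() {
  local host="$1"
  getent hosts "$host" >/dev/null 2>&1
}

# Count input lines that carry the Helm pull "not found" signature.
# Reads stdin, so it works on a saved log or on piped command output.
count_helm_pull_errors() {
  grep -c -E 'helm pull .*not found' || true
}

# Example usage (hostname and log file are placeholders):
#   registry_resolves "<registry-hostname>" && echo "registry resolves"
#   count_helm_pull_errors < saved-errors.log
```

A non-zero count together with a successful resolution matches the symptom described here.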

  • Resolution: re-run the registry upload command used for new installations:

./configureUiPathAS.sh registry upload --offline-bundle /uipath/tmp/as.tar.gz --offline-tmp-folder /uipath/tmp
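Before re-running the upload, it may help to confirm that the offline bundle and the temporary folder named in the command are actually present after the shutdown. The following is a hedged sketch; preflight_upload is a hypothetical helper, and it only checks file readability and folder writability, not bundle integrity or free disk space.

```shell
# Hypothetical pre-flight check; not part of configureUiPathAS.sh.
preflight_upload() {
  local bundle="$1" tmp_dir="$2"
  # The bundle must exist and be readable by the current user.
  if [ ! -r "$bundle" ]; then
    echo "offline bundle not readable: $bundle" >&2
    return 1
  fi
  # The temporary folder must exist and be writable.
  if [ ! -d "$tmp_dir" ] || [ ! -w "$tmp_dir" ]; then
    echo "temporary folder missing or not writable: $tmp_dir" >&2
    return 1
  fi
  echo "ok"
}

# Example usage with the paths from the command above:
#   preflight_upload /uipath/tmp/as.tar.gz /uipath/tmp
```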

  2. Redis secret not found

The second issue encountered post-restart was related to Redis.

  • Multiple failed pods across different applications showed identical events: volumes could not be attached or mounted, which led to timeouts, and the redb-redis-cluster-db secret was not found.

  • Resolution: delete Redis and resync it via ArgoCD. This is a known issue, documented as Redis Probe Failure. The deletion commands are:

kubectl delete redb -n redis-system redis-cluster-db --force --grace-period=0 &

kubectl delete rec -n redis-system redis-cluster --force --grace-period=0 &

kubectl patch redb -n redis-system redis-cluster-db --type=json -p '[{"op":"remove","path":"/metadata/finalizers","value":"finalizer.redisenterprisedatabases.app.redislabs.com"}]'

kubectl patch rec redis-cluster -n redis-system --type=json -p '[{"op":"remove","path":"/metadata/finalizers","value":"redbfinalizer.redisenterpriseclusters.app.redislabs.com"}]'

kubectl delete job redis-cluster-db-job -n redis-system
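The symptom check and the wait for deletion can be sketched as shell helpers around standard kubectl calls. This is a sketch under assumptions, not a documented procedure: the function names are hypothetical, the redb-redis-cluster-db secret is assumed to live in the redis-system namespace, and the resync itself is still performed in ArgoCD afterwards.

```shell
# Illustrative helpers; names are hypothetical. Requires kubectl access to the cluster.

# Return 0 if the named secret exists in the namespace.
secret_exists() {
  local ns="$1" name="$2"
  kubectl get secret "$name" -n "$ns" >/dev/null 2>&1
}

# Print pods whose STATUS column is neither Running nor Completed.
# With -A and --no-headers the columns are NAMESPACE NAME READY STATUS RESTARTS AGE.
# Note: a pod can show Running while some of its containers are not ready.
list_unhealthy_pods() {
  kubectl get pods -A --no-headers | awk '$4 != "Running" && $4 != "Completed"'
}

# Poll until the resource can no longer be fetched, or the attempt budget is used up.
# Caveat: any kubectl failure (for example an unreachable API server) counts as "gone".
wait_until_gone() {
  local kind="$1" name="$2" ns="$3" attempts="${4:-30}" interval="${5:-5}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if ! kubectl get "$kind" "$name" -n "$ns" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}

# Example usage:
#   secret_exists redis-system redb-redis-cluster-db || echo "secret missing"
#   list_unhealthy_pods
#   wait_until_gone redb redis-cluster-db redis-system
#   wait_until_gone rec redis-cluster redis-system
```

Waiting for both resources to disappear before resyncing avoids ArgoCD recreating them while the old objects are still terminating.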