Hello, I have also submitted this via regular support (Case # 01155757). As usual, I am checking who is faster: support or the community.
After an unexpected power-down, the AI Center pods fail to fully start. The event log shows issues with Ceph connectivity. The Ceph status cannot be verified: the ceph command in the rook-ceph-tools pod does not return any result (it hangs and never completes). Multiple pods report failures when mounting PVCs that have already been created.
kubectl describe pod -n kurl registry-6fffbb9895-26rnt
Events:
Type Reason Age From Message
Warning FailedMount 31m (x17 over 3h32m) kubelet Unable to attach or mount volumes: unmounted volumes=[registry-data], unattached volumes=[registry-pki registry-htpasswd default-token-td5bl registry-data registry-config]: timed out waiting for the condition
Warning FailedMount 19m (x18 over 171m) kubelet Unable to attach or mount volumes: unmounted volumes=[registry-data], unattached volumes=[registry-htpasswd default-token-td5bl registry-data registry-config registry-pki]: timed out waiting for the condition
Warning FailedMount 10m (x17 over 3h18m) kubelet Unable to attach or mount volumes: unmounted volumes=[registry-data], unattached volumes=[default-token-td5bl registry-data registry-config registry-pki registry-htpasswd]: timed out waiting for the condition
Warning FailedMount 5m7s (x103 over 3h32m) kubelet MountVolume.MountDevice failed for volume "pvc-b5548c8d-3da7-401a-b824-48d9547069b5" : rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0009-rook-ceph-0000000000000002-3e9fcdea-f230-11eb-a611-8e83aec171b4 already exists
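For anyone trying to reproduce the checks, here is a minimal sketch of two helpers, not a verified procedure. The first pulls the PV name out of a FailedMount message like the one above. The second runs ceph status in the toolbox under a time limit, so a hung command returns instead of blocking the shell. The rook-ceph namespace and the rook-ceph-tools deployment name are assumptions (the usual Rook defaults), and both function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only. Assumptions: namespace "rook-ceph" and deployment
# "rook-ceph-tools" (Rook defaults); helper names are hypothetical.

# Extract the PV name (pvc-<uuid>) from a kubelet FailedMount event message.
# Prints nothing if the message contains no such name.
pv_from_event() {
  printf '%s\n' "$1" \
    | grep -oE 'pvc-[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' \
    | head -n 1
}

# Run "ceph status" in the toolbox with a hard time limit (default 30 s),
# so a command that hangs is terminated by timeout(1) instead of blocking.
ceph_status_with_timeout() {
  local seconds="${1:-30}"
  timeout "$seconds" kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph status
}
```

With the message stored in a variable msg, kubectl get pv "$(pv_from_event "$msg")" would show the volume behind the failing mount. A non-zero exit from ceph_status_with_timeout 30 (124 when the limit is hit) suggests the cluster is not answering at all.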