Handling Rook-Ceph Clock Skew
Issue Description: How to handle Rook-Ceph clock skew.
Root Cause: Ceph monitors rely on closely synchronized clocks. Clock skew between cluster nodes can degrade rook-ceph and cause a range of issues, so any skew on the cluster needs to be corrected.
Diagnosing/Resolving
- There is typically no clear symptom of this issue. To check for clock skew, run the following command on one of the cluster nodes (to enable kubectl first, see the documentation: Accessing Automation Suite - Enable kubectl).
- kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status
- If skew is present, the output includes a warning that clock skew was detected (for example, a HEALTH_WARN line reading "clock skew detected on mon.<id>").
- On each node, the Linux date command shows that node's current time. Running it on several nodes at about the same moment gives a rough gauge of any skew.
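The check above can be scripted. The following is a minimal sketch, not part of the product tooling: the function name has_clock_skew is hypothetical, and it assumes the warning text contains either "clock skew" or the Ceph health code MON_CLOCK_SKEW.

```shell
#!/usr/bin/env bash
# has_clock_skew (hypothetical helper): reads `ceph status` or
# `ceph health detail` text on stdin and succeeds (exit 0) when the text
# mentions a monitor clock skew warning.
has_clock_skew() {
  grep -qiE 'clock skew|MON_CLOCK_SKEW'
}

# Example usage (not executed here): pipe the ceph status command shown
# above into the helper.
#   <ceph status command> | has_clock_skew && echo "clock skew reported"
```

Because the helper only reads text on stdin, it can also be run against saved output from an earlier status check.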
- On Red Hat, the time service is typically chronyd. When this issue occurs, check the following:
- Check that the time service is running: systemctl status chronyd
- If it is not running, start it: systemctl restart chronyd
- Check the configuration to make sure a time source is configured: view /etc/chrony.conf
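The configuration check can be sketched as a small helper. This is an illustration under assumptions: has_time_source is a hypothetical name, /etc/chrony.conf is the usual configuration path on Red Hat systems but may differ on yours, and only server, pool, and peer directives are considered (other source types such as refclock are not).

```shell
#!/usr/bin/env bash
# has_time_source FILE (hypothetical helper): succeeds when FILE contains at
# least one uncommented server, pool, or peer directive.
has_time_source() {
  grep -qE '^[[:space:]]*(server|pool|peer)[[:space:]]+[^[:space:]]+' "$1"
}

# Example usage (not executed here):
#   has_time_source /etc/chrony.conf || echo "no time source configured"
#   chronyc sources    # lists the sources chronyd is currently using
#   chronyc tracking   # shows the current offset of the system clock
```

Once a source is configured and chronyd is running, chronyc tracking is a convenient way to confirm that the node is actually synchronizing.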
- As a workaround, the following can be done:
- Elect a node to have the 'correct' time. Let's call this masterNode.
- On each of the other nodes, run the following command (setting the clock requires root privileges): date --set="$(ssh <user>@<IP of masterNode> 'date -u')"
- Run on every node except masterNode, this command sets the node's time to that of masterNode.
- This is only a workaround. When this issue occurs, contact your system administrator and ask them to fix the underlying time synchronization. This is part of general Linux administration.
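Before or after applying the workaround, it can help to measure how far a node is from masterNode. The following is a rough sketch: abs_skew is a hypothetical helper, and the comparison works in whole seconds only, so it will not reveal sub-second skew and it includes the ssh round-trip delay.

```shell
#!/usr/bin/env bash
# abs_skew A B (hypothetical helper): prints the absolute difference, in
# seconds, between two epoch timestamps.
abs_skew() {
  local d=$(( $1 - $2 ))
  echo "${d#-}"   # strip a leading minus sign, if any
}

# Example usage (not executed here; <user> and <IP of masterNode> are the
# same placeholders used in the workaround above):
#   remote=$(ssh <user>@<IP of masterNode> 'date -u +%s')
#   here=$(date -u +%s)
#   echo "skew: $(abs_skew "$here" "$remote") second(s)"
```

A result of 0 only means the clocks agree to within about a second; the ceph status output remains the authoritative check.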