Bug #3633
closed
mon: clock drift errors not reported by ceph status
Added by Corin Langosch over 11 years ago.
Updated over 11 years ago.
Description
Using argonaut 0.48.2. Today all ceph commands were randomly slow, so I checked all hosts: all monitors (3) and OSDs (17) were up and running. 'ceph status' and 'ceph -w' were not reporting any error. Digging a little deeper, I found that one monitor's log complained about too large a clock drift, which was caused by a crashed NTP server. After fixing this, everything worked fine again. I'd like to suggest emitting a warning when running 'ceph status' or 'ceph -w' in case of any clock drift errors. Returning HEALTH_OK is a bit misleading when in fact the cluster is not 100% working and randomly hangs for several seconds.
- Assignee set to Joao Eduardo Luis
- Priority changed from Normal to High
- Backport set to Bobtail
- Subject changed from clock drift errors not reported by ceph status to mon: clock drift errors not reported by ceph status
- Category set to Monitor
- Source changed from Development to Community (user)
- Status changed from New to In Progress
I'm looking into an adequate way to make 'ceph -s' return a warning when the clocks have drifted.
However, 'ceph -w' should have shown clock drifting warnings. Have you disabled 'clog_to_monitors'?
Here's my config: http://pastie.org/5554031
I'm pretty sure there was no warning when I ran 'ceph -w', because I was really puzzled at first about why the cluster randomly hangs and checked quite a lot of things. But I cannot say for sure now, as I didn't save the output :(.
To monitor ceph's status I have a cronjob which runs 'ceph health detail | grep HEALTH_OK > /dev/null || ceph health detail' every few minutes, so when ceph is not healthy I get an email alert. The result should not be HEALTH_OK if there's any warning/error (clock drift included).
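For reference, the gating logic of that cronjob can be sketched as a small function (a hedged illustration, not part of ceph itself; the mailing side is left to cron, and the function just decides from the 'ceph health detail' output whether to alert):

```python
# Hedged sketch of the cronjob's decision logic: alert on anything
# other than a leading HEALTH_OK in the `ceph health detail` output.
def should_alert(health_output):
    """Return True if the health output warrants an email alert."""
    stripped = health_output.strip()
    first = stripped.splitlines()[0] if stripped else ""
    # Anything that is not HEALTH_OK (HEALTH_WARN, HEALTH_ERR, empty) alerts.
    return not first.startswith("HEALTH_OK")

print(should_alert("HEALTH_OK"))  # False: cluster healthy, no mail
print(should_alert("HEALTH_WARN clock skew detected"))  # True: send mail
```

With the fix discussed in this ticket, a clock skew would flip the status to HEALTH_WARN and this kind of check would catch it.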
'HEALTH_OK' and 'HEALTH_WARN' are assessed in a way that makes it non-trivial to extend the existing mechanism to cover the clock drift messages. Still looking into a couple of options though.
Regarding "ceph -w" not showing the warning messages: that is easily explained by the exponential backoff applied to the drift warnings. You'd see those warnings for (say) the first couple of rounds, and then the frequency of the warnings would decrease. This would no doubt make noticing the warning really difficult, which makes warning the user in some other way (e.g., via "ceph -s") really important.
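To illustrate why the backoff hides the warnings (a toy sketch; the interval-doubling scheme and the 30-second starting interval are assumptions for illustration, not the monitor's actual parameters):

```python
# Hedged illustration: with an interval that doubles after each emission,
# backoff-limited warnings quickly become rare in the log stream.
def warning_times(n, first_interval=30.0):
    """Times (in seconds) at which the first n warnings would be emitted."""
    t, interval, times = 0.0, first_interval, []
    for _ in range(n):
        times.append(t)
        t += interval
        interval *= 2  # exponential backoff

    return times

print(warning_times(5))  # [0.0, 30.0, 90.0, 210.0, 450.0]
```

After only five emissions the gap is already several minutes, so someone attaching 'ceph -w' mid-incident could easily see nothing at all.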
- Status changed from In Progress to 4
wip-3633 now has a couple of patches that introduce a mechanism to keep track of clock skews on the monitors.
If severe, the clock skews will be reported by 'ceph health' and 'ceph status' with a HEALTH_WARN. 'ceph health detail' will also report which nodes are suffering from clock skews. With the latest patches, which are yet to be reviewed before assessing whether they should go upstream, one will also be able to pass '--format json' to both commands and obtain detailed information on skews regardless of severity.
These patches also allow us to keep track of the latency between the monitors.
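A consumer of that JSON output might extract the per-monitor skews like this (a hedged sketch: the exact JSON layout produced with these patches is an assumption here; a "timechecks" section with per-monitor "skew" and "latency" fields is assumed, and the 0.05 s threshold is arbitrary):

```python
import json

def report_skews(status_json, max_skew=0.05):
    """Return (mon_name, skew) pairs whose absolute skew exceeds max_skew seconds.

    Assumes a hypothetical "timechecks" section in the `ceph status
    --format json` output; adjust the keys to the real layout.
    """
    status = json.loads(status_json)
    mons = status.get("timechecks", {}).get("mons", [])
    return [(m["name"], m["skew"]) for m in mons if abs(m["skew"]) > max_skew]

# Sample (fabricated) output: mon "b" is 1.7 s off, mon "a" is in sync.
sample = '''{"timechecks": {"round_status": "finished",
  "mons": [{"name": "a", "skew": 0.0, "latency": 0.001},
           {"name": "b", "skew": 1.7, "latency": 0.002}]}}'''
print(report_skews(sample))  # only mon "b" exceeds the threshold
```

Something along these lines could feed the same cron-based alerting described above, without waiting for the skew to become severe enough for HEALTH_WARN.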
Reading the patch, it looks like only the clocks of the mons are checked. So the clocks of the OSDs are not important to ceph?
The objective here was to make sure that clock skews on the monitors were detected and reported, as said skews might affect the monitor's behavior.
Clocks are important for the OSDs as well; OSDs rely on clocks, for instance, to check whether other OSDs have failed. But that was outside the scope of what we aimed for with these patches.
I'll look into whether having the monitors report clock skews on ceph components other than the monitors themselves is something we want, and open a different issue if it turns out to be the case.
The OSD clocks are actually fairly unimportant. Everything they use that requires precise timing should be based entirely on local clocks (if there's evidence that is not the case, we have a bug!). If using authentication, they do need to be reasonably close to the monitors, as the auth keys rotate on an hourly basis (with a bit of overlap for the previous, current, and next keys).
- Priority changed from High to Normal
- Status changed from 4 to Resolved