Fix #15419: ceph-{mds,mon,osd,radosgw} systemd unit files need "wants=time-sync.target" - devops - Ceph

closed

Added by Nathan Cutler about 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Normal
Assignee: Nathan Cutler
Source: Community (dev)
Backport: jewel

Description

When an entire cluster is started at once, a MON or OSD sometimes starts before ntp (or systemd-timesyncd or chrony) has had a chance to synchronize the clock. When this happens to a MON, the cluster comes up in HEALTH_WARN due to clock skew. Joao added some code to the MON in #14175 to make the MON cluster recover from this more quickly, but the quickest fix is to restart the offending MONs.

I have been spinning up clusters in Amazon Web Services (AWS) and have found that this race between ntpd.service and the Ceph services is not limited to ceph-mon. If an OSD starts before the clock is synced, the cluster starts in HEALTH_WARN and all the PGs in which the offending OSD participates get stuck in the "Peering" state. This clears when the OSD is restarted.

The suggested fix is to add:

Wants=time-sync.target
After=time-sync.target

to the ceph-{mds,mon,osd,radosgw} systemd unit files. This will ensure that the ntpd/chrony/systemd-timesyncd service is started before the respective Ceph daemon starts.
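To check whether a given unit file already carries both directives, something like the following could be used. This is only a rough sketch: the function name and the sample unit text are made up for illustration, and the parsing is deliberately simplified (it ignores line continuations, drop-in overrides, and empty assignments, all of which systemd itself handles).

```python
def orders_after_time_sync(unit_text, target="time-sync.target"):
    """Return True if the [Unit] section has both Wants= and After= for target.

    Hypothetical helper for illustration; not part of Ceph or systemd.
    """
    section = None
    found = {"Wants": False, "After": False}
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # Track which section we are in, e.g. [Unit] or [Service].
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section != "Unit" or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Both directives take a space-separated list of unit names.
        if key in found and target in value.split():
            found[key] = True
    return all(found.values())


# Made-up unit text, only to exercise the helper.
sample = """\
[Unit]
Description=Example daemon
After=network-online.target time-sync.target
Wants=network-online.target time-sync.target

[Service]
ExecStart=/bin/true
"""

print(orders_after_time_sync(sample))
```

A unit that has only After=time-sync.target (without Wants=) is reported as incomplete, since the suggested fix adds both lines.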

Related issues 1 (0 open — 1 closed)

Updated by Nathan Cutler about 8 years ago

Updated by Nathan Cutler about 8 years ago

Updated by Sage Weil about 8 years ago

Updated by Nathan Cutler about 8 years ago

Updated by Fabian Grünbichler almost 8 years ago

Updated by Nathan Cutler almost 8 years ago

Updated by Nathan Cutler almost 8 years ago

Updated by Fabian Grünbichler almost 8 years ago

Updated by Loïc Dachary almost 8 years ago