Bug #8801
closed
Ceph monitors do not start after server restart
Description
We have two separate Ceph installations with five servers each.
Sometimes, when a server is restarted, the Ceph monitor on it does not
start automatically. The affected monitor differs each time, and the
failure does not always occur.
We've upgraded from Cuttlefish to Dumpling and later to Emperor and it
seems that the issue is not totally resolved. Currently we are using
Emperor.
We've tried several ways [1] [2] [3] to bring the monitor back up.
Usually only recreating the monitor [3] helped.
In Emperor, the monitor can usually be started or restarted manually with:
sudo initctl start ceph-mon cluster=ceph id=_ceph_server_hostname
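A minimal sketch of a post-boot check built around the command above. It assumes Upstart's usual `initctl status` output, where a running job is reported with `start/running`. The helper name `mon_is_running` and the use of the short hostname as the monitor id are assumptions for illustration, not something taken from this report.

```shell
#!/bin/sh
# Hypothetical helper: reads `initctl status` output on stdin and
# succeeds only if the job is reported as start/running.
mon_is_running() {
    grep -q 'start/running'
}

# Example use (commented out; needs Upstart and root, id is assumed
# to be the short hostname):
# id=$(hostname -s)
# initctl status ceph-mon cluster=ceph id="$id" 2>/dev/null | mon_is_running ||
#     sudo initctl start ceph-mon cluster=ceph id="$id"
```

This only works around the symptom after boot; it does not explain why the job was not started in the first place.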
OS: Ubuntu 12.04
Ceph version: cuttlefish, dumpling, emperor
Kernel version: 3.5.x
[1]
sudo initctl start ceph-mon cluster=cluster_name id=nowhere-cmp-05
[2]
sudo restart ceph-mon-all
sudo initctl restart ceph-mon cluster=cluster_name id=nowhere-cmp-05
[3]
sudo initctl stop ceph-mon cluster=cluster_name id=nowhere-cmp-04
sudo ceph mon remove nowhere-cmp-04
sudo mv /var/lib/ceph/mon/ceph-nowhere-cmp-04/ /var/lib/ceph/mon/ceph-nowhere-cmp-04.bak
sudo mkdir /var/lib/ceph/mon/ceph-nowhere-cmp-04
sudo ceph auth get mon. -o /tmp/auth
sudo ceph mon getmap -o /tmp/map
sudo ceph-mon -i nowhere-cmp-04 --mkfs --monmap /tmp/map --keyring /tmp/auth
sudo ceph mon add nowhere-cmp-04 10.16.0.107:6789
sudo ceph-mon -i nowhere-cmp-04 --public-addr 10.16.0.107:6789
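The steps in [3] can be sketched as a dry-run helper that only prints the commands for a given monitor, so they can be reviewed before anything is run. The function name `print_mon_recreate_steps` and its three parameters are hypothetical. The sketch assumes the data directory layout `/var/lib/ceph/mon/<cluster>-<id>` seen in the listing above and that the `ceph` commands target the default cluster.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the monitor re-creation steps
# for cluster $1, monitor id $2 and public address $3.
print_mon_recreate_steps() {
    cluster=$1
    id=$2
    addr=$3
    # Assumed data directory layout: /var/lib/ceph/mon/<cluster>-<id>
    dir="/var/lib/ceph/mon/${cluster}-${id}"
    cat <<EOF
sudo initctl stop ceph-mon cluster=${cluster} id=${id}
sudo ceph mon remove ${id}
sudo mv ${dir} ${dir}.bak
sudo mkdir ${dir}
sudo ceph auth get mon. -o /tmp/auth
sudo ceph mon getmap -o /tmp/map
sudo ceph-mon -i ${id} --mkfs --monmap /tmp/map --keyring /tmp/auth
sudo ceph mon add ${id} ${addr}
sudo ceph-mon -i ${id} --public-addr ${addr}
EOF
}

# Example: print_mon_recreate_steps ceph nowhere-cmp-04 10.16.0.107:6789
```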
Updated by Joao Eduardo Luis almost 10 years ago
- Assignee set to Joao Eduardo Luis
Can you provide logs for the monitor that doesn't start? Ideally with 'debug mon = 10'.
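A small sketch for confirming that the requested setting is actually present before rebooting. It assumes the configuration lives in /etc/ceph/ceph.conf. The helper name `has_debug_mon` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if config file $1 sets "debug mon"
# (Ceph treats spaces and underscores in option names alike).
has_debug_mon() {
    grep -Eq '^[[:space:]]*debug[ _]mon[[:space:]]*=' "$1"
}

# Expected fragment in the config file (path is an assumption):
#   [mon]
#       debug mon = 10
# Example: has_debug_mon /etc/ceph/ceph.conf && echo "debug mon is set"
```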
Updated by AltScale Inc almost 10 years ago
- File ceph-mon-issues.log added
We were able to reproduce the issue with the monitors by restarting the physical server. The Ceph configuration had debug set as stated in the documentation:
[mon]
mon debug = 10
A log of the monitor is attached. The machine was restarted at 14:24 and was operational again at 14:25 or 14:26. The monitor didn't start by itself. There are no log entries before it was started manually with:
sudo initctl start ceph-mon cluster=ceph id=vn-cmp-04
Restarting the monitor just shuts it down:
sudo restart ceph-mon-all
Updated by Sage Weil almost 10 years ago
- Status changed from New to Can't reproduce
- Source changed from other to Community (user)
From the logs, the ceph-mon process was never started. I would look in your /var/log/upstart logs.
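A minimal sketch for locating the relevant Upstart logs. It assumes the job log file names start with `ceph-mon` and that rotated copies keep that prefix. The helper name `list_mon_upstart_logs` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list Upstart log files for ceph-mon jobs in
# directory $1 (default /var/log/upstart), newest first.
# Prints nothing and returns non-zero if no such file exists.
list_mon_upstart_logs() {
    dir=${1:-/var/log/upstart}
    ls -t "$dir"/ceph-mon*.log* 2>/dev/null
}

# Example (commented out; reading the logs usually needs root):
# sudo tail -n 50 "$(list_mon_upstart_logs | head -n 1)"
```

If no file is listed after a reboot where the monitor failed to come up, that would be consistent with the job never having been started at all.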