Bug #5343
mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Status: Closed
Description
A user on ceph-users shared a log showing a most interesting behavior in OSDMonitor::update_from_paxos(): the same incremental is applied over and over again. From the log, this is happening on a single-monitor setup.
The monitor starts nicely, creating the OSDMonitor's pending value for epoch 2810.
Eventually, MDSMonitor::fail_mds_gid() triggers a cross-proposal to the OSDMonitor. A full map for epoch 2809 is encoded, along with the 2810 incremental:
2013-06-13 11:36:42.114193 7f5894e25700 10 mon.a@0(leader).mds e136 preprocess_query mdsbeacon(45703/a up:boot seq 1 v0) v2 from mds.? 10.255.0.25:6800/10106
2013-06-13 11:36:42.114205 7f5894e25700 12 mon.a@0(leader).mds e136 preprocess_beacon mdsbeacon(45703/a up:boot seq 1 v0) v2 from mds.? 10.255.0.25:6800/10106 compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding}
2013-06-13 11:36:42.114219 7f5894e25700 7 mon.a@0(leader).mds e136 prepare_update mdsbeacon(45703/a up:boot seq 1 v0) v2
2013-06-13 11:36:42.114222 7f5894e25700 12 mon.a@0(leader).mds e136 prepare_beacon mdsbeacon(45703/a up:boot seq 1 v0) v2 from mds.? 10.255.0.25:6800/10106
2013-06-13 11:36:42.114229 7f5894e25700 10 mon.a@0(leader).mds e136 fail_mds_gid 45404 mds.a rank 0
2013-06-13 11:36:42.114234 7f5894e25700 10 mon.a@0(leader).osd e2809 blacklist 10.255.0.25:6800/11482 until 2013-06-13 12:00:42.114233
2013-06-13 11:36:42.114467 7f5894e25700 10 mon.a@0(leader).osd e2809 min_last_epoch_clean 2796
2013-06-13 11:36:42.114590 7f5894e25700 10 mon.a@0(leader).osd e2809 encode_full osdmap e 2809
2013-06-13 11:36:42.114734 7f5894e25700 10 mon.a@0(leader).osd e2809 min_last_epoch_clean 2796
2013-06-13 11:36:42.114822 7f5894e25700 10 mon.a@0(leader).osd e2809 encode_pending e 2810
Since this is a single-monitor setup, the proposal is swiftly committed and OSDMonitor::update_from_paxos() is called. We are then caught in a loop that goes on forever:
2013-06-13 11:36:42.452660 7f5894e25700 15 mon.a@0(leader).osd e2809 update_from_paxos paxos e 2810, my e 2809
2013-06-13 11:36:42.452723 7f5894e25700 7 mon.a@0(leader).osd e2809 update_from_paxos applying incremental 2810
2013-06-13 11:36:42.452783 7f5894e25700 1 mon.a@0(leader).osd e2809 e2809: 9 osds: 9 up, 9 in
2013-06-13 11:36:42.452808 7f5894e25700 7 mon.a@0(leader).osd e2809 update_from_paxos applying incremental 2810
2013-06-13 11:36:42.452849 7f5894e25700 1 mon.a@0(leader).osd e2809 e2809: 9 osds: 9 up, 9 in
2013-06-13 11:36:42.452872 7f5894e25700 7 mon.a@0(leader).osd e2809 update_from_paxos applying incremental 2810
2013-06-13 11:36:42.452913 7f5894e25700 1 mon.a@0(leader).osd e2809 e2809: 9 osds: 9 up, 9 in
From the code we can deduce that this happens only if the in-memory osdmap epoch is not increased when the incremental is applied. This also prevents the monitor from making progress: despite the on-disk osdmap:last_committed being set to 2810, and despite the incremental being applied to the in-memory map over and over, the OSDMonitor's epoch never reaches 2810, as the log makes clear. Restarting the monitor doesn't help either.
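To illustrate the failure mode, here is a minimal model of the catch-up loop described above. This is not Ceph's actual code; OSDMap, apply_incremental_buggy, and update_from_paxos_model are hypothetical stand-ins. It shows how a "apply incrementals until the in-memory epoch reaches last_committed" loop spins forever if applying an incremental fails to bump the epoch, and why a progress check is needed to detect the stuck state instead of looping:

```cpp
#include <cassert>

// Hypothetical minimal stand-in for the in-memory OSD map.
struct OSDMap {
  unsigned epoch = 2809;  // matches the "my e 2809" state in the log
};

// Models the suspected bug: the incremental is "applied" but the
// map's epoch is never advanced to the incremental's epoch.
void apply_incremental_buggy(OSDMap& m, unsigned inc_epoch) {
  (void)inc_epoch;
  // BUG modeled here: m.epoch should become inc_epoch, but does not.
}

// Model of the update_from_paxos() catch-up loop: keep applying
// incrementals until the in-memory map reaches last_committed.
// Without the progress check below, the buggy apply above would make
// this loop run forever, exactly as seen in the log. Returns the
// number of apply attempts made before catching up or giving up.
unsigned update_from_paxos_model(OSDMap& m, unsigned last_committed) {
  unsigned iterations = 0;
  while (m.epoch < last_committed) {
    unsigned before = m.epoch;
    apply_incremental_buggy(m, m.epoch + 1);
    ++iterations;
    if (m.epoch == before)
      break;  // no progress made: bail out instead of spinning forever
  }
  return iterations;
}
```

With the buggy apply, a single iteration is attempted, no progress is made, and the guard exits; removing the guard reproduces the endless "applying incremental 2810" behavior from the log.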
After being restarted, this is what happens:
2013-06-13 16:30:49.291580 7ffe4b2e3780 1 mon.a@-1(probing).osd e2809 e2809: 9 osds: 9 up, 9 in
2013-06-13 16:30:49.291602 7ffe4b2e3780 7 mon.a@-1(probing).osd e2809 update_from_paxos applying incremental 2810
[restart happens here]
2013-06-13 16:43:30.656501 7f544eae5780 0 ceph version 0.61.3 (92b1e398576d55df8e5888dd1a9545ed3fd99532), process ceph-mon, pid 19554
2013-06-13 16:43:30.848864 7f544eae5780 1 mon.a@-1(probing) e12 preinit fsid d459cc6e-61b4-41e5-8987-242e12fd760e
2013-06-13 16:43:30.875786 7f544eae5780 1 mon.a@-1(probing).osd e2809 e2809: 9 osds: 9 up, 9 in
2013-06-13 16:43:30.875858 7f544eae5780 1 mon.a@-1(probing).osd e2809 e2809: 9 osds: 9 up, 9 in
[continues as such till the end of the log]