Project

General

Profile

Bug #13166

Updated by Greg Farnum over 8 years ago

ubuntu-2015-09-17_16:55:52-fs-greg-fs-testing---basic-multi/1061724

<pre>2015-09-17T17:43:43.959 INFO:tasks.ceph.mds.a.mira072.stderr:2015-09-17 17:43:43.946718 7fbbd146a700 -1 mds.0.journaler(ro) try_read_entry: decode error from _is_readable
2015-09-17T17:43:43.959 INFO:tasks.ceph.mds.a.mira072.stderr:2015-09-17 17:43:43.947729 7fbbd146a700 -1 log_channel(cluster) log [ERR] : Error loading MDS rank 0: (22) Invalid argument</pre>

Results of journal inspection:
<pre>Overall journal integrity: DAMAGED
Corrupt regions:
0x6b37fc-795ab1
</pre>

If we look at the MDS log, I see:
<pre>2015-09-17 17:40:57.876236 7fbfb374f700 17:42:40.643893 7f9f5f996700 5 mds.0.log _submit_thread 7026684~2786 : EUpdate unlink_local [metablob 100000000d9, 4 dirs]</pre>
But that's not followed by any flushes. On the standby:
<pre>2015-09-17 17:42:41.239035 7f8eb5b0b700
10 mds.b-s-a handle_mds_map: handling map as rank 0 mds.0.log replay start, from 6987500 to 7026684
...
2015-09-17 17:40:57.876251 7fbfb374f700 17:42:57.541965 7f8eb4907700 1 mds.0.0 handle_mds_map i am now mds.4109.0replaying mds.0.0 5 mds.0.log _submit_thread 7026684~5927 : EUpdate unlink_local [metablob 1, 7 dirs]
...
2015-09-17 17:40:57.876254 7fbfb374f700 17:42:57.542924 7f8eba314700 1 mds.0.0 handle_mds_map state change up:boot -- 10.214.132.17:6812/28582 --> up:standby-replay 10.214.134.104:6800/17380 -- osd_op(mds.0.0:180 200.00000001 [write 2832380~5947] 1.6e5f474 ondisk+write+known_if_redirected e9) v6 -- ?+0 0x7f8ec5ee9080 con 0x7f8ec547b5a0
2015-09-17 17:40:57.876261 7fbfb374f700 10 mds.beacon.b-s-a set_want_state: up:standby -> up:standby-replay ...
2015-09-17 17:40:57.876265 7fbfb374f700 17:42:57.544264 7f8eb3e03700 1 mds.0.0 replay_start</pre>

-- 10.214.132.17:6812/28582 <== osd.3 10.214.134.104:6800/17380 117 ==== osd_op_reply(181 200.00000000 [writefull 0~90] v7'105 uv105 ondisk = 0) v6 ==== 179+0+0 (675457143 0 0) 0x7f8ec5f7eec0 con 0x7f8ec547b5a0
</pre>
I can't immediately see any difference between this op and the log offset where the journal is apparently formatted properly (I checked the OSD logs briefly as well):
<pre>2015-09-17 17:41:23.463233 7fbfb374f700 17:43:19.332094 7f8eb4907700 5 mds.b-s-a handle_mds_map epoch 8 from mon.2 mds.0.log _submit_thread 7953073~1936 : EUpdate scatter_writebehind [metablob 100, 3 dirs]
...
2015-09-17 17:41:23.463265 7fbfb374f700 10 mds.b-s-a my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table} 17:43:19.332127 7f8eb4907700 1 -- 10.214.132.17:6812/28582 --> 10.214.134.104:6800/17380 -- osd_op(mds.0.0:526 200.00000001 [write 3758769~1956 [fadvise_dontneed]] 1.6e5f474 ondisk+write+known_if_redirected e9) v6 -- ?+0 0x7f8ec5a69600 con 0x7f8ec547b5a0
...
2015-09-17 17:41:23.463274 7fbfb374f700 10 mds.b-s-a 17:43:19.413072 7f8eb3e03700 mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag 1 -- 10.214.132.17:6812/28582 <== osd.3 10.214.134.104:6800/17380 436 ==== osd_op_reply(526 200.00000001 [write 3758769~1956 [fadvise_dontneed]] v9'927 uv927 ondisk = 0) v6 ==== 179+0+0 (1812754033 0 0) 0x7f8ec5f7eec0 con 0x7f8ec547b5a0</pre>

...and the answer appears to be that RADOS
is stored in omap,8=no anchor table} misbehaving:
2015-09-17 17:41:23.463280 7fbfb374f700 10 mds.b-s-a <pre>2015-09-17 17:43:19.332772 7f8eb3e03700 peer mds gid 4116 removed from map 1 -- 10.214.132.17:6812/28582 <== osd.3 10.214.134.104:6800/17380 426 ==== osd_op_reply(525 200.00000001 [write 3756813~1956 [fadvise_dontneed]] v7'449 uv449 ondisk = 0) v6 ==== 179+0+0 (1646513375 0 0) 0x7f8ec5f7eec0 con 0x7f8ec547b5a0
2015-09-17 17:41:23.463284 7fbfb374f700 17:43:19.335850 7f8eb3e03700 1 -- 10.214.132.17:6812/28040 mark_down 10.214.134.104:6812/17763 -- pipe dne 10.214.132.17:6812/28582 <== osd.3 10.214.134.104:6800/17380 427 ==== osd_op_reply(527 200.00000001 [write 3760725~1956 [fadvise_dontneed]] v7'450 uv450 ondisk = 0) v6 ==== 179+0+0 (4283220460 0 0) 0x7f8ec5f7eec0 con 0x7f8ec547b5a0
2015-09-17 17:41:23.463293 7fbfb374f700 10 mds.b-s-a map says i am 10.214.132.17:6812/28040 mds.0.2 state up:replay 17:43:19.336116 7f8eb3e03700 1 -- 10.214.132.17:6812/28582 <== osd.3 10.214.134.104:6800/17380 428 ==== osd_op_reply(528 200.00000001 [write 3762681~1423 [fadvise_dontneed]] v7'451 uv451 ondisk = 0) v6 ==== 179+0+0 (1622616166 0 0) 0x7f8ec5f7eec0 con 0x7f8ec547b5a0
2015-09-17 17:41:23.463300 7fbfb374f700 10 mds.b-s-a handle_mds_map: handling map as rank 17:43:19.336477 7f8eb3e03700 1 -- 10.214.132.17:6812/28582 <== osd.3 10.214.134.104:6800/17380 429 ==== osd_op_reply(529 200.00000001 [write 3764104~1713 [fadvise_dontneed]] v7'452 uv452 ondisk = 0) v6 ==== 179+0+0 (3302607164 0 0) 0x7f8ec5f7eec0 con 0x7f8ec547b5a0
2015-09-17 17:41:23.463303 7fbfb374f700 17:43:19.336819 7f8eb3e03700 1 mds.0.0 handle_mds_map i am now mds.0.0 -- 10.214.132.17:6812/28582 <== osd.3 10.214.134.104:6800/17380 430 ==== osd_op_reply(530 200.00000001 [write 3765817~1223 [fadvise_dontneed]] v7'453 uv453 ondisk = 0) v6 ==== 179+0+0 (4109213907 0 0) 0x7f8ec5f7eec0 con 0x7f8ec547b5a0
2015-09-17 17:41:23.463306 7fbfb374f700 17:43:19.413072 7f8eb3e03700 1 mds.0.0 handle_mds_map state change up:standby-replay --> up:replay -- 10.214.132.17:6812/28582 <== osd.3 10.214.134.104:6800/17380 436 ==== osd_op_reply(526 200.00000001 [write 3758769~1956 [fadvise_dontneed]] v9'927 uv927 ondisk = 0) v6 ==== 179+0+0 (1812754033 0 0) 0x7f8ec5f7eec0 con 0x7f8ec547b5a0
2015-09-17 17:41:23.463316 7fbfb374f700 10 mds.beacon.b-s-a set_want_state: up:standby-replay -> up:replay
2015-09-17 17:41:23.463324 7fbfb374f700 10 mds.0.0 Monitor activated us! Deactivating replay loop</pre>
17:43:19.417026 7f8eb3e03700 1 -- 10.214.132.17:6812/28582 <== osd.3 10.214.134.104:6800/17380 437 ==== osd_op_reply(551 200.00000001 [write 3767040~15631 [fadvise_dontneed]] v7'468 uv468 ondisk = 0) v6 ==== 179+0+0 (348239654 0 0) 0x7f8ec5f7eec0 con 0x7f8ec547b5a0</pre>

I think we must have broken this handling when splitting up MDS into Rank and Daemon. I've got Sam looking at it.

Back