Subtask #2615
Updated by Joao Eduardo Luis almost 12 years ago
MDSMap info, dumped from MDSMap::get_health() just before the assert is triggered:
<pre>
epoch 51
flags 0
created 2012-06-14 08:42:54.627948
modified 2012-06-19 16:22:40.450568
tableserver 0
root 0
session_timeout 60
session_autoclose 300
last_failure 0
last_failure_osd_epoch 16
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object}
max_mds 3
in 0,1,2
up {0=4803,1=4397,2=4701}
failed
stopped
data_pools [0]
metadata_pool 1
mds_info.size() 2
5408: 127.0.0.1:6800/4834 'a' mds.-1.0 up:standby seq 7335
5414: 127.0.0.1:6801/5009 'b' mds.-1.0 up:standby seq 158
</pre>
The assert:
<pre>
mds/MDSMap.cc: In function 'void MDSMap::get_health(std::list<std::pair<health_status_t, std::basic_string<char> > >&, std::list<std::pair<health_status_t, std::basic_string<char> > >*) const' thread 7f9271482700 time 2012-06-19 09:09:56.399299
mds/MDSMap.cc: 254: FAILED assert(m != m_end)
ceph version c36f301faf59ce560059a8039eaad58f083f53e (commit:8c36f301faf59ce560059a8039eaad58f083f53e)
1: (MDSMap::get_health(std::list<std::pair<health_status_t, std::string>, std::allocator<std::pair<health_status_t, std::string> > >&, std::list<std::pair<health_status_t, std::string>, std::allocator<std::pair<health_status_t, std::string> > >*) const+0x141f) [0x5912bf]
2: (Monitor::get_health(std::string&, ceph::buffer::list*)+0x76) [0x4818b6]
3: (Monitor::handle_command(MMonCommand*)+0x936) [0x4831d6]
4: (Monitor::_ms_dispatch(Message*)+0x106b) [0x49156b]
5: (Monitor::ms_dispatch(Message*)+0x32) [0x4a03c2]
6: (SimpleMessenger::dispatch_entry()+0x863) [0x5f6dc3]
7: (SimpleMessenger::DispatchThread::entry()+0xd) [0x5c900d]
8: (()+0x7e9a) [0x7f9276558e9a]
9: (clone()+0x6d) [0x7f9274f7e4bd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2012-06-19 09:09:56.400566 7f9271482700 -1 mds/MDSMap.cc: In function 'void MDSMap::get_health(std::list<std::pair<health_status_t, std::basic_string<char> > >&, std::list<std::pair<health_status_t, std::basic_string<char> > >*) const' thread 7f9271482700 time 2012-06-19 09:09:56.399299
mds/MDSMap.cc: 254: FAILED assert(m != m_end)
</pre>
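The problem appears to be that the 'mds_info' map contains only two MDSs, with gids 5408 and 5414 (both up:standby), while the 'up' map lists three MDSs (gids 4803, 4397 and 4701). None of the gids in 'up' match those in 'mds_info', so looking up an 'up' gid in 'mds_info' finds nothing, which is presumably what trips assert(m != m_end).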
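To make the mismatch concrete, here is a minimal sketch of the consistency check, using the values from the dump above. It is not the actual MDSMap code: `MdsInfo`, `dangling_up_gids`, `assert_up_consistent` and `dangling_from_dump` are hypothetical names introduced for illustration, and the map types are simplified assumptions about how 'up' (rank to gid) and 'mds_info' (gid to info) are stored.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the per-daemon info
// (not the real MDSMap type).
struct MdsInfo {
  std::string name;
  std::string state;
};

// Return every gid referenced by 'up' (rank -> gid) that has no
// entry in 'mds_info' (gid -> info), in rank order.
std::vector<uint64_t> dangling_up_gids(
    const std::map<int32_t, uint64_t>& up,
    const std::map<uint64_t, MdsInfo>& mds_info) {
  std::vector<uint64_t> missing;
  for (const auto& rank_gid : up) {
    if (mds_info.find(rank_gid.second) == mds_info.end())
      missing.push_back(rank_gid.second);
  }
  return missing;
}

// Analogue of the failing assert: every 'up' gid must be in 'mds_info'.
void assert_up_consistent(const std::map<int32_t, uint64_t>& up,
                          const std::map<uint64_t, MdsInfo>& mds_info) {
  assert(dangling_up_gids(up, mds_info).empty());
}

// Rebuild both maps from the dump in this comment and report the
// gids in 'up' that 'mds_info' does not know about.
std::vector<uint64_t> dangling_from_dump() {
  std::map<int32_t, uint64_t> up = {{0, 4803}, {1, 4397}, {2, 4701}};
  std::map<uint64_t, MdsInfo> mds_info = {
      {5408, {"a", "up:standby"}},
      {5414, {"b", "up:standby"}},
  };
  return dangling_up_gids(up, mds_info);
}
```

With the dumped values, `dangling_from_dump()` reports all three 'up' gids as missing, so `assert_up_consistent` would abort on this map in the same way the monitor does.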