Subtask #2615
Feature #2611: mon: Single-Paxos
mon: Single-Paxos: MDSMap::get_health() asserting
% Done: 0%
Source: Development
Description
MDSMap info, dumped from MDSMap::get_health() just before the assert is triggered:
epoch 51
flags 0
created 2012-06-14 08:42:54.627948
modified 2012-06-19 16:22:40.450568
tableserver 0
root 0
session_timeout 60
session_autoclose 300
last_failure 0
last_failure_osd_epoch 16
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object}
max_mds 3
in 0,1,2
up {0=4803,1=4397,2=4701}
failed
stopped
data_pools [0]
metadata_pool 1
mds_info.size() 2
5408: 127.0.0.1:6800/4834 'a' mds.-1.0 up:standby seq 7335
5414: 127.0.0.1:6801/5009 'b' mds.-1.0 up:standby seq 158
The assert:
mds/MDSMap.cc: In function 'void MDSMap::get_health(std::list<std::pair<health_status_t, std::basic_string<char> > >&, std::list<std::pair<health_status_t, std::basic_string<char> > >*) const' thread 7f9271482700 time 2012-06-19 09:09:56.399299
mds/MDSMap.cc: 254: FAILED assert(m != m_end)
ceph version c36f301faf59ce560059a8039eaad58f083f53e (commit:8c36f301faf59ce560059a8039eaad58f083f53e)
1: (MDSMap::get_health(std::list<std::pair<health_status_t, std::string>, std::allocator<std::pair<health_status_t, std::string> > >&, std::list<std::pair<health_status_t, std::string>, std::allocator<std::pair<health_status_t, std::string> > >*) const+0x141f) [0x5912bf]
2: (Monitor::get_health(std::string&, ceph::buffer::list*)+0x76) [0x4818b6]
3: (Monitor::handle_command(MMonCommand*)+0x936) [0x4831d6]
4: (Monitor::_ms_dispatch(Message*)+0x106b) [0x49156b]
5: (Monitor::ms_dispatch(Message*)+0x32) [0x4a03c2]
6: (SimpleMessenger::dispatch_entry()+0x863) [0x5f6dc3]
7: (SimpleMessenger::DispatchThread::entry()+0xd) [0x5c900d]
8: (()+0x7e9a) [0x7f9276558e9a]
9: (clone()+0x6d) [0x7f9274f7e4bd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2012-06-19 09:09:56.400566 7f9271482700 -1 mds/MDSMap.cc: In function 'void MDSMap::get_health(std::list<std::pair<health_status_t, std::basic_string<char> > >&, std::list<std::pair<health_status_t, std::basic_string<char> > >*) const' thread 7f9271482700 time 2012-06-19 09:09:56.399299
mds/MDSMap.cc: 254: FAILED assert(m != m_end)
The problem appears to be that the 'mds_info' map contains only two MDSs, with gids 5408 and 5414 (both up:standby), while the 'up' map lists three MDSs (gids 4803, 4397 and 4701). None of the gids in 'up' matches an entry in 'mds_info', which is presumably why assert(m != m_end) fires.
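To make the mismatch concrete, here is a minimal sketch of the consistency check that the dump fails. It is not the code from mds/MDSMap.cc. MiniMDSMap, dangling_up_gids(), name_for_rank() and map_from_dump() are hypothetical names, the two maps are simplified stand-ins for the 'up' and 'mds_info' members, and the assert in name_for_rank() is only assumed to resemble the failing one.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the two MDSMap members in the dump.
struct MiniMDSMap {
  std::map<int32_t, uint64_t> up;            // rank -> gid
  std::map<uint64_t, std::string> mds_info;  // gid  -> daemon name
};

// Return every gid referenced by 'up' that has no entry in 'mds_info'.
// An empty result means the two maps agree.
std::vector<uint64_t> dangling_up_gids(const MiniMDSMap& m) {
  std::vector<uint64_t> missing;
  for (const auto& entry : m.up) {
    if (m.mds_info.find(entry.second) == m.mds_info.end())
      missing.push_back(entry.second);
  }
  return missing;
}

// Assumed shape of the failing lookup: resolve a rank to its info entry
// and assert that the entry exists. Aborts on an inconsistent map.
const std::string& name_for_rank(const MiniMDSMap& m, int32_t rank) {
  auto u = m.up.find(rank);
  assert(u != m.up.end());
  auto i = m.mds_info.find(u->second);
  assert(i != m.mds_info.end());  // would fail for the map in this report
  return i->second;
}

// Rebuild the relevant parts of the map from the dump in this report.
MiniMDSMap map_from_dump() {
  MiniMDSMap m;
  m.up[0] = 4803;
  m.up[1] = 4397;
  m.up[2] = 4701;
  m.mds_info[5408] = "a";
  m.mds_info[5414] = "b";
  return m;
}
```

With the values from the dump, dangling_up_gids(map_from_dump()) returns all three gids from 'up', so calling name_for_rank() for any rank would hit the assert.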
History
#1 Updated by Joao Eduardo Luis over 11 years ago
- Description updated (diff)
This issue stopped appearing after we changed the criteria for proposing queued proposals and restarted testing with a fresh store.
My suspicion is that we did not have a valid map at the time MDSMap::get_health() was called. If it shows up again, this issue will be updated.
#2 Updated by Joao Eduardo Luis over 11 years ago
- Status changed from In Progress to Closed