Bug #367
OSD crash: CrushWrapper::do_rule
Status: Closed
Description
I tried to create a crush rule where a pool would run on just one OSD, for some performance testing.
Loading the crushmap went fine, but creating a pool with that crush rule resulted in 5 of the 12 OSDs crashing.
The crushmap:
domain kvmimg {
	id -2		# do not change unnecessarily
	alg straw
	hash 0	# rjenkins1
	item device0 weight 1.000
}
..
..
..
rule kvm {
	ruleset 4
	type replicated
	min_size 1
	max_size 10
	step take kvmimg
	step choose firstn 0 type device
	step emit
}
On 4 of the OSDs the backtrace was almost the same:
Core was generated by `/usr/bin/cosd -i 11 -c /etc/ceph/ceph.conf'.
Program terminated with signal 6, Aborted.
#0  0x00007f5edf7e1a75 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00007f5edf7e1a75 in raise () from /lib/libc.so.6
#1  0x00007f5edf7e55c0 in abort () from /lib/libc.so.6
#2  0x00007f5edf7da941 in __assert_fail () from /lib/libc.so.6
#3  0x0000000000573b5c in crush_do_rule ()
#4  0x000000000050fbf5 in CrushWrapper::do_rule(int, int, std::vector<int, std::allocator<int> >&, int, int, std::vector<unsigned int, std::allocator<unsigned int> >&) ()
#5  0x0000000000512eab in OSDMap::pg_to_osds(pg_t, std::vector<int, std::allocator<int> >&) ()
#6  0x00000000004df562 in OSD::advance_map(ObjectStore::Transaction&) ()
#7  0x00000000004e3836 in OSD::handle_osd_map(MOSDMap*) ()
#8  0x00000000004ef978 in OSD::_dispatch(Message*) ()
#9  0x00000000004f03f9 in OSD::ms_dispatch(Message*) ()
#10 0x0000000000462449 in SimpleMessenger::dispatch_entry() ()
#11 0x000000000045930c in SimpleMessenger::DispatchThread::entry() ()
#12 0x000000000046d30a in Thread::_entry_func(void*) ()
#13 0x00007f5ee08dc9ca in start_thread () from /lib/libpthread.so.0
#14 0x00007f5edf8946fd in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
(gdb)
While one OSD had the following backtrace:
Core was generated by `/usr/bin/cosd -i 7 -c /etc/ceph/ceph.conf'.
Program terminated with signal 6, Aborted.
#0  0x00007fbcd7411a75 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00007fbcd7411a75 in raise () from /lib/libc.so.6
#1  0x00007fbcd74155c0 in abort () from /lib/libc.so.6
#2  0x00007fbcd7cc68e5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#3  0x00007fbcd7cc4d16 in ?? () from /usr/lib/libstdc++.so.6
#4  0x00007fbcd7cc4d43 in std::terminate() () from /usr/lib/libstdc++.so.6
#5  0x00007fbcd7cc4e3e in __cxa_throw () from /usr/lib/libstdc++.so.6
#6  0x0000000000544a79 in decode(std::string&, ceph::buffer::list::iterator&) ()
#7  0x00000000005311f8 in PG::read_log(ObjectStore*) ()
#8  0x00000000005331d6 in PG::read_state(ObjectStore*) ()
#9  0x00000000004e6675 in OSD::load_pgs() ()
#10 0x00000000004e7058 in OSD::init() ()
#11 0x00000000004580e2 in main ()
(gdb)
When going through the current/meta data I found that the OSD maps all contain this wrong crush rule, which keeps crashing the OSDs. I'm going to try to inject all the OSD maps (on all the OSDs, since my current OSDs are behind on the epoch) and then start them again.
But as said before ( #228 ), a new crushmap should not crash the cluster.
I've uploaded all the core files and logfiles to logger.ceph.widodh.nl in the directory /srv/ceph/issues/osd_crash_19_aug
Updated by Wido den Hollander over 13 years ago
I tried to fix the crushmaps on all the OSDs with:
find /srv/ceph/*/current/meta -type f -name 'osdmap*' -ctime -2 -exec osdmaptool --import-crush /root/crushmap {} \;
This worked and the initial crash went away, but it changed to:
Core was generated by `/usr/bin/cosd -i 11 -c /etc/ceph/ceph.conf'.
Program terminated with signal 6, Aborted.
#0  0x00007f76e7bf8a75 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64	../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
	in ../nptl/sysdeps/unix/sysv/linux/raise.c
(gdb) bt
#0  0x00007f76e7bf8a75 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f76e7bfc5c0 in *__GI_abort () at abort.c:92
#2  0x00007f76e84ae8e5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#3  0x00007f76e84acd16 in ?? () from /usr/lib/libstdc++.so.6
#4  0x00007f76e84acd43 in std::terminate() () from /usr/lib/libstdc++.so.6
#5  0x00007f76e84ace3e in __cxa_throw () from /usr/lib/libstdc++.so.6
#6  0x00000000005c0308 in ceph::__ceph_assert_fail (assertion=0x5ee8b9 "inc.epoch == epoch+1", file=<value optimized out>, line=496, func=<value optimized out>) at common/assert.cc:30
#7  0x0000000000518dee in OSDMap::apply_incremental(OSDMap::Incremental&) ()
#8  0x00000000004e32f9 in OSD::handle_osd_map (this=0x19cc000, m=<value optimized out>) at osd/OSD.cc:2373
#9  0x00000000004ef978 in OSD::_dispatch (this=0x19cc000, m=0x334e800) at osd/OSD.cc:2022
#10 0x00000000004f03f9 in OSD::ms_dispatch (this=0x19cc000, m=0x334e800) at osd/OSD.cc:1902
#11 0x0000000000462449 in Messenger::ms_deliver_dispatch (this=0x19ca000) at msg/Messenger.h:97
#12 SimpleMessenger::dispatch_entry (this=0x19ca000) at msg/SimpleMessenger.cc:342
#13 0x000000000045930c in SimpleMessenger::DispatchThread::entry (this=0x19ca488) at msg/SimpleMessenger.h:540
#14 0x000000000046d30a in Thread::_entry_func (arg=0x781f) at ./common/Thread.h:39
#15 0x00007f76e8cf39ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#16 0x00007f76e7cab6fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#17 0x0000000000000000 in ?? ()
(gdb)
The log shows:
10.08.20_13:02:50.979736 7f76e0f88710 filestore(/srv/ceph/osd.11) _flush_op_queue waiting for apply finisher
10.08.20_13:02:50.979758 7f76e0f88710 filestore(/srv/ceph/osd.11) flush complete
10.08.20_13:02:50.979771 7f76e0f88710 osd11 6585 cur 6579 < newest 6985
10.08.20_13:02:50.979823 7f76e0f88710 filestore(/srv/ceph/osd.11) stat /srv/ceph/osd.11/current/meta/inc_osdmap.6580_0 = 0
10.08.20_13:02:50.979837 7f76e0f88710 osd11 6585 handle_osd_map decoding inc map epoch 6580
10.08.20_13:02:50.979849 7f76e0f88710 osd11 6585 using my locally stored inc map
10.08.20_13:02:50.979876 7f76e0f88710 filestore(/srv/ceph/osd.11) read /srv/ceph/osd.11/current/meta/inc_osdmap.6580_0 0~0
10.08.20_13:02:50.979905 7f76e0f88710 filestore(/srv/ceph/osd.11) read /srv/ceph/osd.11/current/meta/inc_osdmap.6580_0 0~931 = 931
osd/OSDMap.h: In function 'int OSDMap::apply_incremental(OSDMap::Incremental&)':
osd/OSDMap.h:496: FAILED assert(inc.epoch == epoch+1)
 1: (OSD::handle_osd_map(MOSDMap*)+0x18f9) [0x4e32f9]
 2: (OSD::_dispatch(Message*)+0x258) [0x4ef978]
 3: (OSD::ms_dispatch(Message*)+0x39) [0x4f03f9]
 4: (SimpleMessenger::dispatch_entry()+0x749) [0x462449]
 5: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x45930c]
 6: (Thread::_entry_func(void*)+0xa) [0x46d30a]
 7: (()+0x69ca) [0x7f76e8cf39ca]
 8: (clone()+0x6d) [0x7f76e7cab6fd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
I tried to figure out what kind of file inc_osdmap.6580_0 is. It seems to be a diff between two OSD maps, but I can't figure out how to modify it.
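For context, here is a loose Python model of why this particular assertion fires (a sketch with hypothetical names; the real code is C++ in osd/OSDMap.h). Each inc_osdmap.<epoch> is a delta that only applies to the map exactly one epoch behind it, so incrementals form a strict chain, and replaying a stored incremental against a map at the wrong epoch trips the `assert(inc.epoch == epoch+1)` seen in the backtrace:

```python
class OSDMapModel:
    """Hypothetical stand-in for OSDMap; only tracks the epoch chain."""

    def __init__(self, epoch):
        self.epoch = epoch

    def apply_incremental(self, inc_epoch):
        # Mirrors the failed check at osd/OSDMap.h:496:
        # FAILED assert(inc.epoch == epoch+1)
        assert inc_epoch == self.epoch + 1, "inc.epoch == epoch+1"
        self.epoch = inc_epoch


m = OSDMapModel(epoch=6579)
m.apply_incremental(6580)      # fine: 6580 == 6579 + 1
try:
    m.apply_incremental(6585)  # a gap in the chain trips the assertion
except AssertionError as e:
    print("FAILED assert:", e)
```

This is only a model of the invariant, not of what apply_incremental actually does to the map contents, but it suggests why hand-editing the stored full maps without keeping the incrementals consistent leads straight to this abort.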
Updated by Sage Weil over 13 years ago
Wido den Hollander wrote:
I tried to fix the crushmaps on all the OSD's with:
This won't really work... even if you did manage to rewrite history on the OSDs like this, you'd probably hit some other assertion because the state doesn't match the history.
The osdmap* objects are full osdmaps (see mondata/osdmap_full/*), and the inc_ ones are the incrementals/diffs (see mondata/osdmap/*). You'll need to restore those to match the monitor again...
Updated by Sage Weil over 13 years ago
Found the problem, though: 6589 is the epoch in which the kvm pool was added:
root@logger:~# ceph osd getmap 6589 -o om
10.08.20_18:33:00.227641 mon <- [osd,getmap,6589]
10.08.20_18:33:00.228389 mon1 -> 'got osdmap epoch 6589' (0)
10.08.20_18:33:00.228917 wrote 7031 byte payload to om
root@logger:~# osdmaptool om --test-map-pg 0.1
osdmaptool: osdmap file 'om'
 parsed '0.1' -> 0.1
0.1 raw [9,11] up [] acting []
root@logger:~# osdmaptool om --test-map-pg 0.1p0
osdmaptool: osdmap file 'om'
 parsed '0.1p0' -> 0.1p0
osdmaptool: crush/mapper.c:520: crush_do_rule: Assertion `!(force_context[force_pos] != w[0])' failed.
Aborted
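One plausible reading of the failing assertion, sketched in Python (this is a guess at the failure mode, not CRUSH's actual algorithm, which lives in crush/mapper.c and is far more involved): a localized PG like 0.1p0 forces a specific OSD into the mapping, and the old mapper asserted when the forced item did not match what the rule's take bucket produced, instead of failing gracefully:

```python
def do_rule(bucket_items, force=None):
    """Hypothetical, drastically simplified stand-in for crush_do_rule.

    bucket_items: the devices reachable from the rule's 'step take' bucket.
    force: an OSD id a localized pg (e.g. 0.1p0) insists on placing first.
    """
    choice = bucket_items[0]  # a straw bucket with one item always picks it
    if force is not None:
        # Roughly the old behavior: assert(!(force_context[force_pos] != w[0]))
        # i.e. abort the whole OSD if the forced item isn't the rule's pick.
        assert force == choice, "force_context[force_pos] != w[0]"
    return [choice]
```

Under this model, a rule whose take bucket cannot produce the forced OSD turns a bad mapping into a process abort, which would match both the osdmaptool crash above and the original cluster-wide OSD crashes.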
Looking at it now.
Updated by Sage Weil over 13 years ago
Ok, a workaround is pushed to the testing branch, ec80b10f28ac887e82c3bf4a136dfaa6e6bab65e. Merge that in, restore the original versions of the osdmaps on the OSDs, and you should be all set!