Bug #215
closed
osd crash: FAILED assert(seq >= last_committed_seq)
Description
This is ceph unstable, commit c626ac384678661b765c1ae1dee8db48b2c70993.
#0 0x00007f41b1a18a75 in raise () from /lib/libc.so.6
#1 0x00007f41b1a1c5c0 in abort () from /lib/libc.so.6
#2 0x00007f41b22cd8e5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#3 0x00007f41b22cbd16 in ?? () from /usr/lib/libstdc++.so.6
#4 0x00007f41b22cbd43 in std::terminate() () from /usr/lib/libstdc++.so.6
#5 0x00007f41b22cbe3e in __cxa_throw () from /usr/lib/libstdc++.so.6
#6 0x00000000005b39f8 in ceph::__ceph_assert_fail (assertion=0x5ec3b2 "seq >= last_committed_seq", file=<value optimized out>, line=711, func=<value optimized out>) at common/assert.cc:30
#7 0x00000000005649e1 in FileJournal::committed_thru (this=0x1116310, seq=0) at os/FileJournal.cc:711
#8 0x000000000055d265 in JournalingObjectStore::commit_finish (this=0x1125740) at os/JournalingObjectStore.cc:186
#9 0x00000000005543f3 in FileStore::sync_entry (this=0x1125740) at os/FileStore.cc:1714
#10 0x00000000004ef93d in FileStore::SyncThread::entry() ()
#11 0x0000000000469a4a in Thread::_entry_func (arg=0x6315) at ./common/Thread.h:39
#12 0x00007f41b28ab9ca in start_thread () from /lib/libpthread.so.0
#13 0x00007f41b1acb6cd in clone () from /lib/libc.so.6
#14 0x0000000000000000 in ?? ()
The full log of that osd is attached.
I kept all logs, core, ... just ask!
Updated by Sage Weil almost 14 years ago
- Status changed from New to 7
- Target version set to v0.21
this should be fixed by bf3d52a4b725a0f2d3db39ea9ad5b412171ea0ad... can you please confirm?
thanks!
Updated by ar Fred almost 14 years ago
I got a crash after restarting that osd, with the same stack trace (ignoring the line number differences caused by your recent commit):
#0 0x00007fbc5be06a75 in raise () from /lib/libc.so.6
#1 0x00007fbc5be0a5c0 in abort () from /lib/libc.so.6
#2 0x00007fbc5c6bb8e5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#3 0x00007fbc5c6b9d16 in ?? () from /usr/lib/libstdc++.so.6
#4 0x00007fbc5c6b9d43 in std::terminate() () from /usr/lib/libstdc++.so.6
#5 0x00007fbc5c6b9e3e in __cxa_throw () from /usr/lib/libstdc++.so.6
#6 0x00000000005b3a08 in ceph::__ceph_assert_fail (assertion=0x5ec3d2 "seq >= last_committed_seq", file=<value optimized out>, line=711, func=<value optimized out>) at common/assert.cc:30
#7 0x00000000005649f1 in FileJournal::committed_thru (this=0x1def3c0, seq=0) at os/FileJournal.cc:711
#8 0x000000000055d265 in JournalingObjectStore::commit_finish (this=0x1df5740) at os/JournalingObjectStore.cc:189
#9 0x00000000005543f3 in FileStore::sync_entry (this=0x1df5740) at os/FileStore.cc:1714
#10 0x00000000004ef93d in FileStore::SyncThread::entry() ()
#11 0x0000000000469a4a in Thread::_entry_func (arg=0x6c0b) at ./common/Thread.h:39
#12 0x00007fbc5cc999ca in start_thread () from /lib/libpthread.so.0
#13 0x00007fbc5beb96cd in clone () from /lib/libc.so.6
#14 0x0000000000000000 in ?? ()
Should your patch repair the state that is already damaged, or only prevent its cause? That is, should I expect my osd to restart, or should I reformat?
Just in case, in frame 7:
(gdb) p seq
$1 = 0
(gdb) p last_committed_seq
$2 = 1975
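For anyone following along, the failed check can be modeled in a few lines. This is only a toy sketch, not Ceph's code: the class name `JournalModel` is hypothetical, and the only things taken from the trace are the assertion condition `seq >= last_committed_seq` and the two values printed by gdb. That the method also records the new sequence number is an assumption.

```python
class JournalModel:
    """Toy model of a journal tracking the last committed sequence number.

    Hypothetical illustration only; it mirrors the assertion condition
    seen in the stack trace, not Ceph's actual FileJournal.
    """

    def __init__(self, last_committed_seq=0):
        self.last_committed_seq = last_committed_seq

    def committed_thru(self, seq):
        # Same condition as the failed assertion: the committed
        # sequence number must never move backwards.
        if seq < self.last_committed_seq:
            raise AssertionError("FAILED assert(seq >= last_committed_seq)")
        # Assumption: the model remembers the newest committed sequence.
        self.last_committed_seq = seq


# Values from the gdb session: seq = 0, last_committed_seq = 1975.
journal = JournalModel(last_committed_seq=1975)
try:
    journal.committed_thru(0)
    failed = False
except AssertionError:
    failed = True
```

With these values the check fails on every start, which would be consistent with the crash recurring after a restart.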
Updated by Sage Weil almost 14 years ago
Oh.. it may have written the bad (0) value to current/commit_op_seq. Can you confirm that file has 0 in it? If so, then you either need to re-mkfs, or probably
$ echo 1975 > current/commit_op_seq
will do the trick.
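A slightly more cautious variant of the same fix, as a sketch only: it reads the file first, keeps a backup copy, and rewrites the value only if the stored number is lower than the expected one. The helper name `repair_commit_op_seq` and the `.bak` suffix are hypothetical, and the sketch assumes the file holds a decimal number on its first line (which is what the `echo` above would write) and that the osd is stopped while the file is edited.

```python
import shutil
import tempfile
from pathlib import Path


def repair_commit_op_seq(path, expected_seq):
    """Rewrite `path` with expected_seq if its first line is a lower number.

    Hypothetical helper. Returns True if the file was rewritten.
    Raises ValueError if the first line is not a decimal number.
    """
    path = Path(path)
    lines = path.read_text().splitlines()
    current = int(lines[0]) if lines else 0
    if current >= expected_seq:
        return False  # nothing to repair
    # Keep the original content next to the file before overwriting it.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(f"{expected_seq}\n")
    return True


# Demonstration on a throwaway directory, not on a real osd data dir.
with tempfile.TemporaryDirectory() as tmp:
    seq_file = Path(tmp) / "commit_op_seq"
    seq_file.write_text("0\n")
    rewritten = repair_commit_op_seq(seq_file, 1975)
    repaired_text = seq_file.read_text()
    backup_text = (Path(tmp) / "commit_op_seq.bak").read_text()
```

On a real osd the path would be `current/commit_op_seq` inside the osd data directory, and the expected value would be the `last_committed_seq` reported by gdb.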
Updated by ar Fred almost 14 years ago
- File commit_op_seq added
I'm attaching the commit_op_seq file because its content is not what I was expecting: it does have a 0 in it, but it also holds a second line.
The echo trick worked: this osd has now joined the others, and all the PGs are active+clean.
Thanks