Bug #14256
closed
mds: objecter assert on shutdown
Description
http://pulpito.ceph.com/gregf-2015-12-21_23:08:59-fs-master---basic-smithi/1782/
Only saw this once so far and it might have a cause elsewhere, but I didn't see any similar reports so logging this for reference.
2015-12-22T21:54:33.134 INFO:tasks.ceph.mds.a-s.smithi015.stderr:osdc/Objecter.cc: In function 'void Objecter::shutdown()' thread e46e700 time 2015-12-23 00:54:33.102493
2015-12-22T21:54:33.134 INFO:tasks.ceph.mds.a-s.smithi015.stderr:osdc/Objecter.cc: 477: FAILED assert(tick_event == 0)
2015-12-22T21:54:33.156 INFO:tasks.ceph.mds.a-s.smithi015.stderr: ceph version 10.0.1-721-g09a3f69 (09a3f69e6f6a42d3a517b843d98706e8850edfac)
2015-12-22T21:54:33.156 INFO:tasks.ceph.mds.a-s.smithi015.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x6d2775]
2015-12-22T21:54:33.156 INFO:tasks.ceph.mds.a-s.smithi015.stderr: 2: (()+0x481d0a) [0x589d0a]
2015-12-22T21:54:33.157 INFO:tasks.ceph.mds.a-s.smithi015.stderr: 3: (MDSRankDispatcher::shutdown()+0x261) [0x3036e1]
2015-12-22T21:54:33.157 INFO:tasks.ceph.mds.a-s.smithi015.stderr: 4: (MDSDaemon::suicide()+0x22f) [0x2edb6f]
2015-12-22T21:54:33.157 INFO:tasks.ceph.mds.a-s.smithi015.stderr: 5: (MDSDaemon::handle_signal(int)+0x8b) [0x2edd0b]
2015-12-22T21:54:33.157 INFO:tasks.ceph.mds.a-s.smithi015.stderr: 6: (SignalHandler::entry()+0x127) [0x5de4d7]
2015-12-22T21:54:33.158 INFO:tasks.ceph.mds.a-s.smithi015.stderr: 7: (()+0x7df5) [0x539fdf5]
2015-12-22T21:54:33.158 INFO:tasks.ceph.mds.a-s.smithi015.stderr: 8: (clone()+0x6d) [0x66f01ad]
2015-12-22T21:54:33.158 INFO:tasks.ceph.mds.a-s.smithi015.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2015-12-22T21:54:33.190 INFO:tasks.ceph.mds.a-s.smithi015.stderr:2015-12-23 00:54:33.145574 e46e700 -1 osdc/Objecter.cc: In function 'void Objecter::shutdown()' thread e46e700 time 2015-12-23 00:54:33.102493
Files
Updated by John Spray over 8 years ago
Hmm, looks like Objecter is making the assumption that if tick_event is set then it must also be in ceph_timer::events. That's not the case, because if we happen to enter shutdown just as the event is getting called, it will have been removed from events (done before the callback), but tick_event will still be set (it's cleared or reset during the callback).
Introduced by:
commit ecf2bebe99b43735b930406cb1fedf51283a62f0
Author: Adam C. Emerson <aemerson@redhat.com>
Date:   Mon Sep 14 12:19:58 2015 -0400

    time: Update OSDC for C++11 Time

    Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
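For reference, the race described in this comment can be modelled without threads. The sketch below is not the Ceph code: ToyTimer, ToyObjecter, begin_dispatch and shutdown_conditional_clear are hypothetical names, and the "clear tick_event only if the cancel succeeded" shape of the old shutdown path is an assumption inferred from the description. It only shows why the asserted condition fails once the timer has already dequeued the event.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical stand-in for the timer: a set of pending event ids.
struct ToyTimer {
  std::set<std::uint64_t> events;
  std::uint64_t next_id = 1;

  std::uint64_t add_event() {
    std::uint64_t id = next_id++;
    events.insert(id);
    return id;
  }

  // True only if the event was still queued.
  bool cancel_event(std::uint64_t id) { return events.erase(id) > 0; }

  // Models the timer thread picking up an event: the id leaves the
  // queue before the callback body runs.
  void begin_dispatch(std::uint64_t id) { events.erase(id); }
};

// Hypothetical stand-in for the Objecter's tick bookkeeping.
struct ToyObjecter {
  ToyTimer timer;
  std::uint64_t tick_event = 0;

  void start() { tick_event = timer.add_event(); }

  // Assumed pre-fix pattern: clear tick_event only when the cancel succeeds.
  // Returns whether tick_event == 0 holds afterwards (the asserted condition).
  bool shutdown_conditional_clear() {
    if (tick_event != 0 && timer.cancel_event(tick_event)) {
      tick_event = 0;
    }
    return tick_event == 0;
  }
};

// Shutdown with the tick still queued: the cancel succeeds.
bool quiet_shutdown_holds() {
  ToyObjecter o;
  o.start();
  return o.shutdown_conditional_clear();
}

// Shutdown after the timer has dequeued the tick, but before the
// callback has cleared or reset tick_event: the cancel fails.
bool racing_shutdown_holds() {
  ToyObjecter o;
  o.start();
  o.timer.begin_dispatch(o.tick_event);
  return o.shutdown_conditional_clear();
}

void run_demo() {
  assert(quiet_shutdown_holds());
  assert(!racing_shutdown_holds());
}
```

Calling run_demo() from a main() exercises both paths. Only the second one leaves tick_event set, which matches the window described above.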
Updated by Adam Emerson over 8 years ago
- Category changed from 47 to 46
- Assignee set to Adam Emerson
Updated by Adam Emerson over 8 years ago
- File bugfix.patch added
- Status changed from In Progress to 7
This patch should fix it. I'll run make check and push.
Updated by Adam Emerson over 8 years ago
- Status changed from 7 to 15
Pushed upstream as:
commit b308791290b48d1142eb8c222086ffe7509e3449
Author: Adam C. Emerson <aemerson@redhat.com>
Date:   Thu Jan 7 14:15:34 2016 -0500

    osdc: Fix race condition with tick_event and shutdown

    - Clear the tick_event whether it was in the timer queue or not.
    - Make sure we don't schedule a new tick_event if someone calls shutdown while tick() is running
    - Make tick_event atomic so we can check it without a lock/while only holding a read lock

    Fixes #14256
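The three points in the commit message can be illustrated with a small single-threaded model. This is a sketch under assumptions, not the actual patch: ToyTimer, ToyObjecter, begin_dispatch and the initialized flag used as the "shutdown has run" signal are hypothetical here, and the real code also relies on the Objecter's locking, which this model leaves out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical stand-in for the timer: a set of pending event ids.
struct ToyTimer {
  std::set<std::uint64_t> events;
  std::uint64_t next_id = 1;

  std::uint64_t add_event() {
    std::uint64_t id = next_id++;
    events.insert(id);
    return id;
  }

  bool cancel_event(std::uint64_t id) { return events.erase(id) > 0; }

  // Models the timer dequeuing an event before running its callback.
  void begin_dispatch(std::uint64_t id) { events.erase(id); }
};

struct ToyObjecter {
  ToyTimer timer;
  std::atomic<std::uint64_t> tick_event{0};  // atomic, so it can be read without a lock
  std::atomic<bool> initialized{false};      // assumed "still running" flag

  void start() {
    initialized = true;
    tick_event = timer.add_event();
  }

  // Clear tick_event whether or not the event was still in the queue.
  void shutdown() {
    initialized = false;
    std::uint64_t id = tick_event.exchange(0);
    if (id != 0) {
      timer.cancel_event(id);  // result deliberately ignored
    }
  }

  // Callback body: do not schedule a new tick once shutdown has run.
  void tick() {
    if (!initialized) {
      return;
    }
    tick_event = timer.add_event();
  }
};

// Shutdown lands between the dequeue and the callback body.
bool racing_shutdown_is_clean() {
  ToyObjecter o;
  o.start();
  o.timer.begin_dispatch(o.tick_event.load());
  o.shutdown();
  o.tick();
  return o.tick_event.load() == 0 && o.timer.events.empty();
}

// Without a shutdown, tick() still reschedules itself.
bool normal_tick_reschedules() {
  ToyObjecter o;
  o.start();
  std::uint64_t first = o.tick_event.load();
  o.timer.begin_dispatch(first);
  o.tick();
  std::uint64_t now = o.tick_event.load();
  return now != 0 && now != first && o.timer.events.count(now) == 1;
}
```

In this model the check in tick() and the later store are not one atomic step, so it shows only the ordering. In a threaded setting that pair would still need the lock mentioned in the commit message.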
Updated by John Spray over 8 years ago
Link to the pull request for convenience:
https://github.com/ceph/ceph/pull/7151
Updated by John Spray over 8 years ago
- Status changed from 15 to Fix Under Review
I don't know if we ever used "Pending upstream" before. Usually, when a PR is outstanding, we use "Needs review".
Updated by Sage Weil over 8 years ago
- Status changed from Fix Under Review to Resolved
Updated by Greg Farnum almost 8 years ago
- Category changed from 46 to Correctness/Safety
- Component(FS) MDS, osdc added