Bug #12253
Sometimes mds dump_ops_in_flight will crash mds
Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (dev)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Sometimes there are no suitable ops in the MDS. If we then run dump_ops_in_flight through the admin socket, the MDS crashes on an assertion failure in MDRequestImpl::_dump (mds/Mutation.cc:352). See the backtrace from the core dump below.
#10 0x00000000008c755a in ceph::__ceph_assert_fail (assertion=0x53c46b0 "\260\306\356\004", file=<value optimized out>, line=77856768, func=0xaf4ca0 "virtual void MDRequestImpl::_dump(utime_t, ceph::Formatter*) const") at common/assert.cc:77
        tss = <incomplete type>
        buf = "mds/Mutation.cc: In function 'virtual void MDRequestImpl::_dump(utime_t, ceph::Formatter*) const' thread 7f2692a40700 time 2015-07-02 18:35:49.820326\nmds/Mutation.cc: 352: FAILED assert(internal_op !="...
        bt = 0x4af16c0
        oss = <incomplete type>
#11 0x0000000000650d51 in MDRequestImpl::_dump (this=0x616cb00, now=<value optimized out>, f=0x4af2d80) at mds/Mutation.cc:352
        __PRETTY_FUNCTION__ = "virtual void MDRequestImpl::_dump(utime_t, ceph::Formatter*) const"
#12 0x000000000081c21b in TrackedOp::dump (this=0x616cd58, now=..., f=0x4af2d80) at common/TrackedOp.cc:331
        name = <incomplete type>
#13 0x000000000081c63f in OpTracker::dump_ops_in_flight (this=0x4b00590, f=0x4af2d80) at common/TrackedOp.cc:109
        p = {cur = 0x616cd60}
        sdata = 0x4a502a0
        locker = {mutex = @0x4a502a0}
        i = <value optimized out>
        total_ops_in_flight = <value optimized out>
        now = {tv = {tv_sec = 1435833349, tv_nsec = 820179660}}
        __PRETTY_FUNCTION__ = "void OpTracker::dump_ops_in_flight(ceph::Formatter*)"
#14 0x00000000005b87f3 in MDS::asok_command (this=0x4b00000, command="dump_ops_in_flight", cmdmap=std::map with 1 elements = {...}, format="json-pretty", ss=...) at mds/MDS.cc:245
        f = 0x4af2d80
        __func__ = "asok_command"
        __PRETTY_FUNCTION__ = "bool MDS::asok_command(std::string, cmdmap_t&, std::string, std::ostream&)"
#15 0x00000000005d3ea0 in MDSSocketHook::call (this=0x49f01b0, command="dump_ops_in_flight", cmdmap=std::map with 1 elements = {...}, format=<value optimized out>, out=...) at mds/MDS.cc:213
        ss = <incomplete type>
        r = <value optimized out>
#16 0x00000000008b5bfa in AdminSocket::do_accept (this=0x4a80000) at common/admin_socket.cc:362
        args = ""
        success = <value optimized out>
        len = <value optimized out>
        ret = <value optimized out>
        cmdvec = std::vector of length 1, capacity 1 = {"{\"prefix\": \"dump_ops_in_flight\"}"}
        errss = <incomplete type>
        match = "dump_ops_in_flight"
        format = "json-pretty"
        p = {first = "dump_ops_in_flight", second = }
        out = {_buffers = empty std::list, _len = 0, _memcopy_count = 0, append_buffer = {_raw = 0x0, _off = 0, _len = 0}, last_p = {bl = 0x7f2692a3f490, ls = 0x7f2692a3f490, off = 0, p = {_raw = , _off = 0, _len = 0}, p_off = 0}, static CLAIM_DEFAULT = 0, static CLAIM_ALLOW_NONSHAREABLE = 1}
        connection_fd = 21
        c = "dump_ops_in_flight"
        cmdmap = std::map with 1 elements = { ["prefix"] = {which_ = 0, storage_ = {<boost::detail::aligned_storage::aligned_storage_imp<24ul, 8ul>> = {data_ = { buf = "\250B\263\n\000\000\000\000\200\005\237\004\000\000\000\000\000\026\f\006\000\000\000", align_ = {<No data fields>}}}, static size = <optimized out>, static alignment = <optimized out>}} }
        address = {sun_family = 1, sun_path = "\000\000\000\000\000\000\003", '\000' <repeats 15 times>"\260, \317q\226&\177\000\000\005", '\000' <repeats 23 times>, "0v\274\224&\177\000\000n\022\277\000\000\000\000\000\000\000\250\004\000\000\000\000\300\374\243\222&\177\000\000\220\244\250\004\000\000\000\000\000\000\244\004\000\000\000\000u6r\226&\177"}
        address_length = 2
        cmd = "{\"prefix\": \"dump_ops_in_flight\"}\000ons\"}\000\000\260\367\243\222&\177\000\000\022\000\000\000\000\000\000\000\070~\221\226&\177\000\000\064\000\206\356\000\000\000\000\320\367\243\222&\177\000\000\022\000\000\000\000\000\000\000\070~\221\226&\177\000\000\346\037\250\201\000\000\000\000\207\213q\226&\177\000\000\000\000\000\000\000\000\000\000\177\240\006\002\000\000\000\000&\000\000\000&\177\000\000\310\064J\226&\177\000\000\000\000\000\000\000\000\000\000P\371\243\222&\177\000\000\230\067J\226&\177\000\000\030xJ\226&\177\000\000\000\000\000\000\000\000\000\000`\326\222\226&\177\000\000`\326\222\226&\177", '\000' <repeats 18 times>, "`\326\222\226&\177\000\000`\326\222\226&\177", '\000' <repeats 18 times>, "`\326\222\226&\177\000\000\327\021K\226&\177"...
        pos = <value optimized out>
        rval = false
#17 0x00000000008b6e20 in AdminSocket::entry (this=0x4a80000) at common/admin_socket.cc:252
        fds = {{fd = 7, events = 129, revents = 1}, {fd = 5, events = 129, revents = 0}}
        ret = <value optimized out>
#18 0x00007f2695d157f1 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#19 0x00007f2694ca3ccd in clone () from /lib64/libc.so.6
No symbol table info available.
Such strict checking is not necessary in an admin socket dump command. It would be better to return a hint than to crash the MDS.
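To illustrate the idea, here is a minimal sketch of a dump helper that falls back to a hint string instead of asserting. It is not the actual Ceph code or the fix that was merged: `TrackedRequest`, its fields, `describe_for_dump`, and the request kinds it distinguishes are all hypothetical names invented for this example.

```cpp
#include <string>

// Hypothetical, simplified stand-in for a tracked MDS request.
// The fields are illustrative and do not mirror the real MDRequestImpl.
struct TrackedRequest {
  bool is_client_request = false;
  bool is_slave_request = false;
  bool has_internal_op = false;  // false when no internal op is set
  int internal_op = 0;           // only meaningful if has_internal_op is true
};

// Builds the description a dump command would print for one request.
// A request that matches none of the known kinds yields a hint string
// rather than tripping an assertion, so a diagnostic command cannot
// bring the daemon down.
std::string describe_for_dump(const TrackedRequest& req) {
  if (req.is_client_request) {
    return "client_request";
  }
  if (req.is_slave_request) {
    return "slave_request";
  }
  if (req.has_internal_op) {
    return "internal op " + std::to_string(req.internal_op);
  }
  // Previously the strict path would assert here; return a hint instead.
  return "unknown request (no client, slave, or internal op set)";
}
```

With this shape, a default-constructed `TrackedRequest` is reported as an unknown request in the dump output, and the process keeps running.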