Bug #12253

closed

Sometimes mds dump_ops_in_flight will crash mds

Added by Zhi Zhang almost 9 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (dev)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Sometimes there are no suitable ops in the MDS. If we then run dump_ops_in_flight, the MDS crashes due to an assertion failure in MDRequestImpl::_dump. See the backtrace from the core dump below.

#10 0x00000000008c755a in ceph::__ceph_assert_fail (assertion=0x53c46b0 "\260\306\356\004", file=<value optimized out>, line=77856768, 
    func=0xaf4ca0 "virtual void MDRequestImpl::_dump(utime_t, ceph::Formatter*) const") at common/assert.cc:77
        tss = <incomplete type>
        buf = "mds/Mutation.cc: In function 'virtual void MDRequestImpl::_dump(utime_t, ceph::Formatter*) const' thread 7f2692a40700 time 2015-07-02 18:35:49.820326\nmds/Mutation.cc: 352: FAILED assert(internal_op !="...
        bt = 0x4af16c0
        oss = <incomplete type>
#11 0x0000000000650d51 in MDRequestImpl::_dump (this=0x616cb00, now=<value optimized out>, f=0x4af2d80) at mds/Mutation.cc:352
        __PRETTY_FUNCTION__ = "virtual void MDRequestImpl::_dump(utime_t, ceph::Formatter*) const" 
#12 0x000000000081c21b in TrackedOp::dump (this=0x616cd58, now=..., f=0x4af2d80) at common/TrackedOp.cc:331
        name = <incomplete type>
#13 0x000000000081c63f in OpTracker::dump_ops_in_flight (this=0x4b00590, f=0x4af2d80) at common/TrackedOp.cc:109
        p = {cur = 0x616cd60}
        sdata = 0x4a502a0
        locker = {mutex = @0x4a502a0}
        i = <value optimized out>
        total_ops_in_flight = <value optimized out>
        now = {tv = {tv_sec = 1435833349, tv_nsec = 820179660}}
        __PRETTY_FUNCTION__ = "void OpTracker::dump_ops_in_flight(ceph::Formatter*)" 
#14 0x00000000005b87f3 in MDS::asok_command (this=0x4b00000, command="dump_ops_in_flight", cmdmap=std::map with 1 elements = {...}, format="json-pretty", ss=...) at mds/MDS.cc:245
        f = 0x4af2d80
        __func__ = "asok_command" 
        __PRETTY_FUNCTION__ = "bool MDS::asok_command(std::string, cmdmap_t&, std::string, std::ostream&)" 
#15 0x00000000005d3ea0 in MDSSocketHook::call (this=0x49f01b0, command="dump_ops_in_flight", cmdmap=std::map with 1 elements = {...}, format=<value optimized out>, out=...)
    at mds/MDS.cc:213
        ss = <incomplete type>
        r = <value optimized out>
#16 0x00000000008b5bfa in AdminSocket::do_accept (this=0x4a80000) at common/admin_socket.cc:362
        args = "" 
        success = <value optimized out>
        len = <value optimized out>
        ret = <value optimized out>
        cmdvec = std::vector of length 1, capacity 1 = {"{\"prefix\": \"dump_ops_in_flight\"}"}
        errss = <incomplete type>
        match = "dump_ops_in_flight" 
        format = "json-pretty" 
        p = {first = "dump_ops_in_flight", second = }
        out = {_buffers = empty std::list, _len = 0, _memcopy_count = 0, append_buffer = {_raw = 0x0, _off = 0, _len = 0}, last_p = {bl = 0x7f2692a3f490, ls = 0x7f2692a3f490, off = 0, p = 
    {_raw = , _off = 0, _len = 0}, p_off = 0}, static CLAIM_DEFAULT = 0, static CLAIM_ALLOW_NONSHAREABLE = 1}
        connection_fd = 21
        c = "dump_ops_in_flight" 
        cmdmap = std::map with 1 elements = {
          ["prefix"] = {which_ = 0, storage_ = {<boost::detail::aligned_storage::aligned_storage_imp<24ul, 8ul>> = {data_ = {
                  buf = "\250B\263\n\000\000\000\000\200\005\237\004\000\000\000\000\000\026\f\006\000\000\000", align_ = {<No data fields>}}}, static size = <optimized out>, 
              static alignment = <optimized out>}}
        }
        address = {sun_family = 1, 
          sun_path = "\000\000\000\000\000\000\003", '\000' <repeats 15 times>"\260, \317q\226&\177\000\000\005", '\000' <repeats 23 times>, "0v\274\224&\177\000\000n\022\277\000\000\000\000\000\000\000\250\004\000\000\000\000\300\374\243\222&\177\000\000\220\244\250\004\000\000\000\000\000\000\244\004\000\000\000\000u6r\226&\177"}
        address_length = 2
        cmd = "{\"prefix\": \"dump_ops_in_flight\"}\000ons\"}\000\000\260\367\243\222&\177\000\000\022\000\000\000\000\000\000\000\070~\221\226&\177\000\000\064\000\206\356\000\000\000\000\320\367\243\222&\177\000\000\022\000\000\000\000\000\000\000\070~\221\226&\177\000\000\346\037\250\201\000\000\000\000\207\213q\226&\177\000\000\000\000\000\000\000\000\000\000\177\240\006\002\000\000\000\000&\000\000\000&\177\000\000\310\064J\226&\177\000\000\000\000\000\000\000\000\000\000P\371\243\222&\177\000\000\230\067J\226&\177\000\000\030xJ\226&\177\000\000\000\000\000\000\000\000\000\000`\326\222\226&\177\000\000`\326\222\226&\177", '\000' <repeats 18 times>, "`\326\222\226&\177\000\000`\326\222\226&\177", '\000' <repeats 18 times>, "`\326\222\226&\177\000\000\327\021K\226&\177"...
        pos = <value optimized out>
        rval = false
#17 0x00000000008b6e20 in AdminSocket::entry (this=0x4a80000) at common/admin_socket.cc:252
        fds = {{fd = 7, events = 129, revents = 1}, {fd = 5, events = 129, revents = 0}}
        ret = <value optimized out>
#18 0x00007f2695d157f1 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#19 0x00007f2694ca3ccd in clone () from /lib64/libc.so.6
No symbol table info available.

Such strict checking is not necessary in an admin socket dump command. It would be better to return a hint than to crash the MDS.
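
To illustrate the suggestion, here is a minimal, self-contained sketch contrasting a strict dump with a lenient one. It is not the actual Ceph code: `SketchRequest`, `kNoInternalOp`, `dump_strict`, `dump_lenient`, and the output strings are all hypothetical names invented for this example. Because the assertion text in the backtrace is truncated, the sentinel value used here is an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical sentinel meaning "no internal op set" (an assumption; the
// real assertion text is truncated in the backtrace above).
constexpr int kNoInternalOp = -1;

// Hypothetical stand-in for a tracked request; not the real MDRequestImpl.
struct SketchRequest {
  bool has_client_request = false;
  int internal_op = kNoInternalOp;
};

// Strict variant: aborts the whole process when neither kind of op is
// present, which is the failure mode described in this report.
std::string dump_strict(const SketchRequest& r) {
  if (r.has_client_request) return "op_type=client_request";
  assert(r.internal_op != kNoInternalOp);
  return "op_type=internal_op internal_op=" + std::to_string(r.internal_op);
}

// Lenient variant: a read-only dump command reports a hint instead of
// asserting when there is nothing suitable to describe.
std::string dump_lenient(const SketchRequest& r) {
  if (r.has_client_request) return "op_type=client_request";
  if (r.internal_op != kNoInternalOp)
    return "op_type=internal_op internal_op=" + std::to_string(r.internal_op);
  return "op_type=no_available_op_found";
}
```

With a default-constructed `SketchRequest`, `dump_lenient` returns the hint string, while `dump_strict` would hit the assertion in a build where assertions are enabled.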
