Project

General

Profile

Bug #18209

src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())

Added by Zheng Yan almost 2 years ago. Updated 4 months ago.

Status:
Verified
Priority:
High
Assignee:
-
Category:
-
Target version:
-
Start date:
12/09/2016
Due date:
% Done:

0%

Source:
Tags:
Backport:
luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:

Description

(gdb) bt
#0  0x00007f774361db09 in raise () from /lib64/libpthread.so.0
#1  0x0000555b38c9ce5f in reraise_fatal (signum=6) at /home/zhyan/Ceph/ceph/src/global/signal_handler.cc:72
#2  handle_fatal_signal (signum=6) at /home/zhyan/Ceph/ceph/src/global/signal_handler.cc:134
#3  <signal handler called>
#4  0x00007f77423b2765 in raise () from /lib64/libc.so.6
#5  0x00007f77423b436a in abort () from /lib64/libc.so.6
#6  0x0000555b38d5e4ba in ceph::__ceph_assert_fail (assertion=assertion@entry=0x555b3903b410 "num_unsent <= log_queue.size()", 
    file=file@entry=0x555b3903b3e0 "/home/zhyan/Ceph/ceph/src/common/LogClient.cc", line=line@entry=310, 
    func=func@entry=0x555b3903ba60 <LogClient::_get_mon_log_message()::__PRETTY_FUNCTION__> "Message* LogClient::_get_mon_log_message()") at /home/zhyan/Ceph/ceph/src/common/assert.cc:78
#7  0x0000555b38d13c4d in LogClient::_get_mon_log_message (this=this@entry=0x555b42e92720) at /home/zhyan/Ceph/ceph/src/common/LogClient.cc:310
#8  0x0000555b38d13e30 in LogClient::get_mon_log_message (this=0x555b42e92720) at /home/zhyan/Ceph/ceph/src/common/LogClient.cc:276
#9  0x0000555b38d6598d in MonClient::send_log (this=this@entry=0x7ffcad85daf0) at /home/zhyan/Ceph/ceph/src/mon/MonClient.cc:308
#10 0x0000555b38d71538 in MonClient::tick (this=0x7ffcad85daf0) at /home/zhyan/Ceph/ceph/src/mon/MonClient.cc:750
#11 0x0000555b3898d1c1 in Context::complete (this=0x555b42cf8210, r=<optimized out>) at /home/zhyan/Ceph/ceph/src/include/Context.h:70
#12 0x0000555b38d5b057 in SafeTimer::timer_thread (this=0x7ffcad85dcd0) at /home/zhyan/Ceph/ceph/src/common/Timer.cc:105
#13 0x0000555b38d5c55d in SafeTimerThread::entry (this=<optimized out>) at /home/zhyan/Ceph/ceph/src/common/Timer.cc:38
#14 0x00007f77436145ca in start_thread () from /lib64/libpthread.so.0
#15 0x00007f77424810ed in clone () from /lib64/libc.so.6
(gdb) f 7
#7  0x0000555b38d13c4d in LogClient::_get_mon_log_message (this=this@entry=0x555b42e92720) at /home/zhyan/Ceph/ceph/src/common/LogClient.cc:310
310      assert(num_unsent <= log_queue.size());
(gdb) p num_unsent
$1 = 13
(gdb) p log_queue
$2 = std::deque with 12 elements = {{who = {name = {_type = 2 '\002', _num = 1, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, 
        static NEW = -1}, addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_LEGACY, type = 1, nonce = 3669258852, u = {sa = {sa_family = 2, sa_data = "\032\221\n\000\000\n\000\000\000\000\000\000\000"}, 
          sin = {sin_family = 2, sin_port = 37146, sin_addr = {s_addr = 167772170}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 37146, sin6_flowinfo = 167772170, 
            sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, stamp = {tv = {tv_sec = 1481259256, 
        tv_nsec = 314821762}}, seq = 2496, prio = CLOG_WARN, msg = "246 slow requests, 5 included below; oldest blocked for > 643.410362 secs", channel = "cluster"}, {who = {name = {_type = 2 '\002', 
        _num = 1, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_LEGACY, 
        type = 1, nonce = 3669258852, u = {sa = {sa_family = 2, sa_data = "\032\221\n\000\000\n\000\000\000\000\000\000\000"}, sin = {sin_family = 2, sin_port = 37146, sin_addr = {s_addr = 167772170}, 
            sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 37146, sin6_flowinfo = 167772170, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, 
                __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, stamp = {tv = {tv_sec = 1481259256, tv_nsec = 314825775}}, seq = 2497, prio = CLOG_WARN, 
    msg = "slow request 126.027886 seconds old, received at 2016-12-09 12:52:10.286878: client_request(client.3364353:1854793 create #2000000c950/modules.order 2016-12-09 12:52:01.302771 caller_uid=0, caller_gid"..., channel = "cluster"}, {who = {name = {_type = 2 '\002', _num = 1, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, 
      addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_LEGACY, type = 1, nonce = 3669258852, u = {sa = {sa_family = 2, sa_data = "\032\221\n\000\000\n\000\000\000\000\000\000\000"}, sin = {
            sin_family = 2, sin_port = 37146, sin_addr = {s_addr = 167772170}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 37146, sin6_flowinfo = 167772170, 
            sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, stamp = {tv = {tv_sec = 1481259256, 
        tv_nsec = 314828401}}, seq = 2498, prio = CLOG_WARN, 
    msg = "slow request 39.843430 seconds old, received at 2016-12-09 12:53:36.471333: client_request(client.3364353:1854841 create #2000000c7d6/intel_renderstate_gen6.o 2016-12-09 12:53:34.667104 caller_uid=0, "..., channel = "cluster"}, {who = {name = {_type = 2 '\002', _num = 1, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, 
      addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_LEGACY, type = 1, nonce = 3669258852, u = {sa = {sa_family = 2, sa_data = "\032\221\n\000\000\n\000\000\000\000\000\000\000"}, sin = {
            sin_family = 2, sin_port = 37146, sin_addr = {s_addr = 167772170}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 37146, sin6_flowinfo = 167772170, 
            sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, stamp = {tv = {tv_sec = 1481259256, 
        tv_nsec = 314830635}}, seq = 2499, prio = CLOG_WARN, 
    msg = "slow request 32.464533 seconds old, received at 2016-12-09 12:53:43.850230: client_request(client.3364356:507234 unlink #500000057ce/fdc7 2016-12-09 12:53:39.162758 caller_uid=0, caller_gid=0{}) curre"..., channel = "cluster"}, {who = {name = {_type = 2 '\002', _num = 1, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, 
      addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_LEGACY, type = 1, nonce = 3669258852, u = {sa = {sa_family = 2, sa_data = "\032\221\n\000\000\n\000\000\000\000\000\000\000"}, sin = {
            sin_family = 2, sin_port = 37146, sin_addr = {s_addr = 167772170}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 37146, sin6_flowinfo = 167772170, 
            sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, stamp = {tv = {tv_sec = 1481259256, 
        tv_nsec = 314833095}}, seq = 2500, prio = CLOG_WARN, 
    msg = "slow request 127.091682 seconds old, received at 2016-12-09 12:52:09.223081: client_request(client.3364353:1854760 create #2000000c922/built-in.o 2016-12-09 12:52:00.149964 caller_uid=0, caller_gid=0{"..., channel = "cluster"}, {who = {name = {_type = 2 '\002', _num = 1, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, 
      addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_LEGACY, type = 1, nonce = 3669258852, u = {sa = {sa_family = 2, sa_data = "\032\221\n\000\000\n\000\000\000\000\000\000\000"}, sin = {
            sin_family = 2, sin_port = 37146, sin_addr = {s_addr = 167772170}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 37146, sin6_flowinfo = 167772170, 
            sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, stamp = {tv = {tv_sec = 1481259256, 
        tv_nsec = 314834577}}, seq = 2501, prio = CLOG_WARN, 
    msg = "slow request 39.841822 seconds old, received at 2016-12-09 12:53:36.472941: client_request(client.3364353:1854842 create #2000000c798/ip6t_ipv6header.o 2016-12-09 12:53:34.669271 caller_uid=0, caller_"..., channel = "cluster"}, {who = {name = {_type = 2 '\002', _num = 1, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, 
      addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_LEGACY, type = 1, nonce = 3669258852, u = {sa = {sa_family = 2, sa_data = "\032\221\n\000\000\n\000\000\000\000\000\000\000"}, sin = {
            sin_family = 2, sin_port = 37146, sin_addr = {s_addr = 167772170}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 37146, sin6_flowinfo = 167772170, 
            sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, stamp = {tv = {tv_sec = 1481259261, 
        tv_nsec = 314911970}}, seq = 2502, prio = CLOG_WARN, msg = "248 slow requests, 5 included below; oldest blocked for > 648.410423 secs", channel = "cluster"}, {who = {name = {_type = 2 '\002', 
        _num = 1, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_LEGACY, 
        type = 1, nonce = 3669258852, u = {sa = {sa_family = 2, sa_data = "\032\221\n\000\000\n\000\000\000\000\000\000\000"}, sin = {sin_family = 2, sin_port = 37146, sin_addr = {s_addr = 167772170}, 
            sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 37146, sin6_flowinfo = 167772170, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, 
                __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, stamp = {tv = {tv_sec = 1481259261, tv_nsec = 314917425}}, seq = 2503, prio = CLOG_WARN, 
    msg = "slow request 38.625573 seconds old, received at 2016-12-09 12:53:42.689252: client_request(client.3364353:1855287 create #2000000c7d6/.intel_renderstate_gen7.o.d 2016-12-09 12:53:41.404088 caller_uid="..., channel = "cluster"}, {who = {name = {_type = 2 '\002', _num = 1, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, 
      addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_LEGACY, type = 1, nonce = 3669258852, u = {sa = {sa_family = 2, sa_data = "\032\221\n\000\000\n\000\000\000\000\000\000\000"}, sin = {
            sin_family = 2, sin_port = 37146, sin_addr = {s_addr = 167772170}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 37146, sin6_flowinfo = 167772170, 
            sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, stamp = {tv = {tv_sec = 1481259261, 
---Type <return> to continue, or q <return> to quit---
        tv_nsec = 314920064}}, seq = 2504, prio = CLOG_WARN, 
    msg = "slow request 132.070803 seconds old, received at 2016-12-09 12:52:09.244021: client_request(client.3364353:1854761 create #2000000c922/.built-in.o.cmd 2016-12-09 12:52:00.172717 caller_uid=0, caller_g"..., channel = "cluster"}, {who = {name = {_type = 2 '\002', _num = 1, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, 
      addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_LEGACY, type = 1, nonce = 3669258852, u = {sa = {sa_family = 2, sa_data = "\032\221\n\000\000\n\000\000\000\000\000\000\000"}, sin = {
            sin_family = 2, sin_port = 37146, sin_addr = {s_addr = 167772170}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 37146, sin6_flowinfo = 167772170, 
            sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, stamp = {tv = {tv_sec = 1481259261, 
        tv_nsec = 314922847}}, seq = 2505, prio = CLOG_WARN, 
    msg = "slow request 130.850489 seconds old, received at 2016-12-09 12:52:10.464335: client_request(client.3364353:1854798 create #2000000c95a/built-in.o 2016-12-09 12:52:01.494544 caller_uid=0, caller_gid=0{"..., channel = "cluster"}, {who = {name = {_type = 2 '\002', _num = 1, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, 
      addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_LEGACY, type = 1, nonce = 3669258852, u = {sa = {sa_family = 2, sa_data = "\032\221\n\000\000\n\000\000\000\000\000\000\000"}, sin = {
            sin_family = 2, sin_port = 37146, sin_addr = {s_addr = 167772170}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 37146, sin6_flowinfo = 167772170, 
            sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, stamp = {tv = {tv_sec = 1481259261, 
        tv_nsec = 314925368}}, seq = 2506, prio = CLOG_WARN, 
    msg = "slow request 37.101413 seconds old, received at 2016-12-09 12:53:44.213412: client_request(client.3364353:1855288 create #2000000c7d6/.intel_renderstate_gen7.o.tmp 2016-12-09 12:53:43.055288 caller_ui"..., channel = "cluster"}, {who = {name = {_type = 2 '\002', _num = 1, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, 
      addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_LEGACY, type = 1, nonce = 3669258852, u = {sa = {sa_family = 2, sa_data = "\032\221\n\000\000\n\000\000\000\000\000\000\000"}, sin = {
            sin_family = 2, sin_port = 37146, sin_addr = {s_addr = 167772170}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 37146, sin6_flowinfo = 167772170, 
            sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, stamp = {tv = {tv_sec = 1481259261, 
        tv_nsec = 314927524}}, seq = 2507, prio = CLOG_WARN, msg = "slow request 300.300233 seconds old, received at 2016-12-09 12:49:21.014591: rejoin:client.3364356:507138 currently initiated", 
    channel = "cluster"}}
(gdb) 

Related issues

Duplicated by Ceph - Bug #19427: common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size()) Duplicate 03/30/2017
Copied to RADOS - Backport #20965: luminous: src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size()) Resolved

History

#1 Updated by John Spray almost 2 years ago

  • Priority changed from Normal to High
  • Target version set to v12.0.0

#2 Updated by Patrick Donnelly over 1 year ago

  • Status changed from New to Need More Info

Zheng, what's the source for this bug? Any updates?

#3 Updated by Sage Weil over 1 year ago

/a/sage-2017-08-04_13:49:55-rbd:singleton-bluestore-wip-sage-testing2-20170803b-distro-basic-mira/1482623

 -5954> 2017-08-04 15:15:50.995347 7f72acfbc700  1 -- 172.21.6.134:6800/16814 <== mon.1 172.21.6.134:6789/0 65 ==== log(last 2023) v1 ==== 28+0+0 (1623768890 0 0) 0x7f72cb0e4800 con 0x7f72c63df800
 -5953> 2017-08-04 15:15:50.995369 7f72b1f09700  0 log_channel(cluster) log [WRN] : 564 slow requests, 5 included below; oldest blocked for > 82.661719 secs
 -5952> 2017-08-04 15:15:50.995373 7f72acfbc700 10 log_client handle_log_ack log(last 2023) v1
 -5951> 2017-08-04 15:15:50.995378 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:38.228455 osd.4 osd.4 172.21.6.134:6800/16814 1970 : cluster [WRN] 704 slow requests, 5 included below; oldest blocked for > 72.273067 secs
 -5950> 2017-08-04 15:15:50.995394 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:38.228459 osd.4 osd.4 172.21.6.134:6800/16814 1971 : cluster [WRN] slow request 47.706173 seconds old, received at 2017-08-04 15:14:50.522153: osd_op(client.4235.0:28942 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5949> 2017-08-04 15:15:50.995368 7f72a16e8700 10 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] get_object_context: 0x7f72c61a5080 1
:a7a8f4c5:::rbd_object_map.108874b0dc51:head rwstate(excl n=1 w=381) oi: 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head(29'16267 client.4235.0:27936 dirty|omap_digest s 65630 uv 16267 od ffffffff alloc_hint [0 0 0]) exists: 1 ssc: 0x7f72c64a6f40 snapset: 0=[]:{}
 -5948> 2017-08-04 15:15:50.995408 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:38.228462 osd.4 osd.4 172.21.6.134:6800/16814 1972 : cluster [WRN] slow request 43.353842 seconds old, received at 2017-08-04 15:14:54.874484: osd_op(client.4235.0:29290 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5947> 2017-08-04 15:15:50.995422 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:38.228465 osd.4 osd.4 172.21.6.134:6800/16814 1973 : cluster [WRN] slow request 42.725092 seconds old, received at 2017-08-04 15:14:55.503234: osd_op(client.4235.0:29365 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5946> 2017-08-04 15:15:50.995435 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:38.228468 osd.4 osd.4 172.21.6.134:6800/16814 1974 : cluster [WRN] slow request 41.463179 seconds old, received at 2017-08-04 15:14:56.765148: osd_op(client.4235.0:29448 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5945> 2017-08-04 15:15:50.995420 7f72a16e8700 10 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] find_object_context 1:a7a8f4c5:::rbd
_object_map.108874b0dc51:head @head oi=1:a7a8f4c5:::rbd_object_map.108874b0dc51:head(29'16267 client.4235.0:27936 dirty|omap_digest s 65630 uv 16267 od ffffffff alloc_hint [0 0 0])
 -5944> 2017-08-04 15:15:50.995448 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:38.228470 osd.4 osd.4 172.21.6.134:6800/16814 1975 : cluster [WRN] slow request 39.412618 seconds old, received at 2017-08-04 15:14:58.815708: osd_op(client.4235.0:29642 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5943> 2017-08-04 15:15:50.995462 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:39.228768 osd.4 osd.4 172.21.6.134:6800/16814 1976 : cluster [WRN] 695 slow requests, 5 included below; oldest blocked for > 73.151033 secs
 -5942> 2017-08-04 15:15:50.995454 7f72a16e8700 25 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] do_op oi 1:a7a8f4c5:::rbd_object_map
.108874b0dc51:head(29'16267 client.4235.0:27936 dirty|omap_digest s 65630 uv 16267 od ffffffff alloc_hint [0 0 0])
 -5941> 2017-08-04 15:15:50.995475 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:39.228773 osd.4 osd.4 172.21.6.134:6800/16814 1977 : cluster [WRN] slow request 60.377710 seconds old, received at 2017-08-04 15:14:38.850943: osd_op(client.4235.0:27987 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5940> 2017-08-04 15:15:50.995489 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:39.228775 osd.4 osd.4 172.21.6.134:6800/16814 1978 : cluster [WRN] slow request 60.164719 seconds old, received at 2017-08-04 15:14:39.063934: osd_op(client.4235.0:28004 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5939> 2017-08-04 15:15:50.995487 7f72a16e8700 20 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] do_op waiting for rw locks 
 -5938> 2017-08-04 15:15:50.995506 7f72a16e8700 10 osd.4 29 dequeue_op 0x7f72d207af00 finish
 -5937> 2017-08-04 15:15:50.995502 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:39.228778 osd.4 osd.4 172.21.6.134:6800/16814 1979 : cluster [WRN] slow request 60.371675 seconds old, received at 2017-08-04 15:14:38.856978: osd_op(client.4235.0:27988 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5936> 2017-08-04 15:15:50.995513 7f72a16e8700 20 osd.4 op_wq(0) _process 1.5 item PGQueueable(0x7f72d2d06100 prio 63 cost 82 e29) queued
 -5935> 2017-08-04 15:15:50.995518 7f72a16e8700 20 osd.4 op_wq(0) _process 1.5 item PGQueueable(0x7f72d2d06100 prio 63 cost 82 e29) pg 0x7f72c871c000
 -5934> 2017-08-04 15:15:50.995518 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:39.228780 osd.4 osd.4 172.21.6.134:6800/16814 1980 : cluster [WRN] slow request 60.367481 seconds old, received at 2017-08-04 15:14:38.861172: osd_op(client.4235.0:27989 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5933> 2017-08-04 15:15:50.995531 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:39.228782 osd.4 osd.4 172.21.6.134:6800/16814 1981 : cluster [WRN] slow request 60.159762 seconds old, received at 2017-08-04 15:14:39.068891: osd_op(client.4235.0:28005 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5932> 2017-08-04 15:15:50.995523 7f72a16e8700 10 osd.4 29 dequeue_op 0x7f72d2d06100 prio 63 cost 82 latency 60.879570 osd_op(client.4235.0:28912 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call lock.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) v8 pg pg[1.5( v 2
9'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean]
 -5931> 2017-08-04 15:15:50.995545 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:40.229088 osd.4 osd.4 172.21.6.134:6800/16814 1982 : cluster [WRN] 707 slow requests, 5 included below; oldest blocked for > 74.044934 secs
 -5930> 2017-08-04 15:15:50.995558 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:40.229095 osd.4 osd.4 172.21.6.134:6800/16814 1983 : cluster [WRN] slow request 60.511365 seconds old, received at 2017-08-04 15:14:39.717605: osd_op(client.4235.0:28020 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5929> 2017-08-04 15:15:50.995572 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:40.229098 osd.4 osd.4 172.21.6.134:6800/16814 1984 : cluster [WRN] slow request 60.348211 seconds old, received at 2017-08-04 15:14:39.880759: osd_op(client.4235.0:28036 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5928> 2017-08-04 15:15:50.995557 7f72a16e8700 10 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] _handle_message: 0x7f72d2d06100
 -5927> 2017-08-04 15:15:50.995584 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:40.229101 osd.4 osd.4 172.21.6.134:6800/16814 1985 : cluster [WRN] slow request 30.622963 seconds old, received at 2017-08-04 15:15:09.606006: osd_op(client.4235.0:30782 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5926> 2017-08-04 15:15:50.995597 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:40.229103 osd.4 osd.4 172.21.6.134:6800/16814 1986 : cluster [WRN] slow request 30.622166 seconds old, received at 2017-08-04 15:15:09.606804: osd_op(client.4235.0:30783 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5925> 2017-08-04 15:15:50.995611 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:40.229106 osd.4 osd.4 172.21.6.134:6800/16814 1987 : cluster [WRN] slow request 60.501052 seconds old, received at 2017-08-04 15:14:39.727917: osd_op(client.4235.0:28021 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5924> 2017-08-04 15:15:50.995590 7f72a16e8700 20 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] do_op: op osd_op(client.4235.0:28912
 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call lock.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) v8
 -5923> 2017-08-04 15:15:50.995625 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:41.229365 osd.4 osd.4 172.21.6.134:6800/16814 1988 : cluster [WRN] 704 slow requests, 5 included below; oldest blocked for > 74.970701 secs
 -5922> 2017-08-04 15:15:50.995637 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:41.229371 osd.4 osd.4 172.21.6.134:6800/16814 1989 : cluster [WRN] slow request 61.327709 seconds old, received at 2017-08-04 15:14:39.901555: osd_op(client.4235.0:28037 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5921> 2017-08-04 15:15:50.995650 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:41.229374 osd.4 osd.4 172.21.6.134:6800/16814 1990 : cluster [WRN] slow request 60.009527 seconds old, received at 2017-08-04 15:14:41.219737: osd_op(client.4235.0:28173 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5920> 2017-08-04 15:15:50.995631 7f72a16e8700 20 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] op_has_sufficient_caps session=0x7f7
2c89ab000 pool=1 (rbd ) owner=0 need_read_cap=0 need_write_cap=0 classes=[class lock rd 1 wr 0 wl 1,class rbd rd 1 wr 1 wl 1] -> yes
 -5919> 2017-08-04 15:15:50.995664 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:41.229376 osd.4 osd.4 172.21.6.134:6800/16814 1991 : cluster [WRN] slow request 74.880554 seconds old, received at 2017-08-04 15:14:26.348710: osd_op(client.4235.0:27806 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5918> 2017-08-04 15:15:50.995678 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:41.229379 osd.4 osd.4 172.21.6.134:6800/16814 1992 : cluster [WRN] slow request 74.638068 seconds old, received at 2017-08-04 15:14:26.591197: osd_op(client.4235.0:27822 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5917> 2017-08-04 15:15:50.995690 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:41.229381 osd.4 osd.4 172.21.6.134:6800/16814 1993 : cluster [WRN] slow request 74.408056 seconds old, received at 2017-08-04 15:14:26.821208: osd_op(client.4235.0:27839 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5916> 2017-08-04 15:15:50.995703 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:42.229800 osd.4 osd.4 172.21.6.134:6800/16814 1994 : cluster [WRN] 698 slow requests, 5 included below; oldest blocked for > 75.934519 secs
 -5915> 2017-08-04 15:15:50.995685 7f72a16e8700 10 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] do_op osd_op(client.4235.0:28912 1.5
 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call lock.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) v8 may_write may_read -> write-ordered flags ondisk+write+known_if_redirected
 -5914> 2017-08-04 15:15:50.995716 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:42.229805 osd.4 osd.4 172.21.6.134:6800/16814 1995 : cluster [WRN] slow request 60.763828 seconds old, received at 2017-08-04 15:14:41.465860: osd_op(client.4235.0:28202 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5913> 2017-08-04 15:15:50.995724 7f72a16e8700 10 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] get_object_context: found obc in cac
he: 0x7f72c61a5080
 -5912> 2017-08-04 15:15:50.995729 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:42.229807 osd.4 osd.4 172.21.6.134:6800/16814 1996 : cluster [WRN] slow request 60.423547 seconds old, received at 2017-08-04 15:14:41.806142: osd_op(client.4235.0:28224 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5911> 2017-08-04 15:15:50.995742 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:42.229809 osd.4 osd.4 172.21.6.134:6800/16814 1997 : cluster [WRN] slow request 60.738942 seconds old, received at 2017-08-04 15:14:41.490746: osd_op(client.4235.0:28204 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5910> 2017-08-04 15:15:50.995755 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:42.229811 osd.4 osd.4 172.21.6.134:6800/16814 1998 : cluster [WRN] slow request 60.419657 seconds old, received at 2017-08-04 15:14:41.810032: osd_op(client.4235.0:28225 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5909> 2017-08-04 15:15:50.995768 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:42.229814 osd.4 osd.4 172.21.6.134:6800/16814 1999 : cluster [WRN] slow request 75.272191 seconds old, received at 2017-08-04 15:14:26.957498: osd_op(client.4235.0:27855 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5908> 2017-08-04 15:15:50.995740 7f72a16e8700 10 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] get_object_context: 0x7f72c61a5080 1
:a7a8f4c5:::rbd_object_map.108874b0dc51:head rwstate(excl n=1 w=382) oi: 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head(29'16267 client.4235.0:27936 dirty|omap_digest s 65630 uv 16267 od ffffffff alloc_hint [0 0 0]) exists: 1 ssc: 0x7f72c64a6f40 snapset: 0=[]:{}
 -5907> 2017-08-04 15:15:50.995807 7f72a16e8700 10 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] find_object_context 1:a7a8f4c5:::rbd
_object_map.108874b0dc51:head @head oi=1:a7a8f4c5:::rbd_object_map.108874b0dc51:head(29'16267 client.4235.0:27936 dirty|omap_digest s 65630 uv 16267 od ffffffff alloc_hint [0 0 0])
 -5906> 2017-08-04 15:15:50.995804 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:43.230125 osd.4 osd.4 172.21.6.134:6800/16814 2000 : cluster [WRN] 675 slow requests, 5 included below; oldest blocked for > 76.633067 secs
 -5905> 2017-08-04 15:15:50.995833 7f72a16e8700 25 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] do_op oi 1:a7a8f4c5:::rbd_object_map
.108874b0dc51:head(29'16267 client.4235.0:27936 dirty|omap_digest s 65630 uv 16267 od ffffffff alloc_hint [0 0 0])
 -5904> 2017-08-04 15:15:50.995852 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:43.230131 osd.4 osd.4 172.21.6.134:6800/16814 2001 : cluster [WRN] slow request 60.951127 seconds old, received at 2017-08-04 15:14:42.278882: osd_op(client.4235.0:28240 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5903> 2017-08-04 15:15:50.995855 7f72a16e8700 20 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] do_op waiting for rw locks 
 -5902> 2017-08-04 15:15:50.995874 7f72a16e8700 10 osd.4 29 dequeue_op 0x7f72d2d06100 finish
 -5901> 2017-08-04 15:15:50.995868 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:43.230136 osd.4 osd.4 172.21.6.134:6800/16814 2002 : cluster [WRN] slow request 60.547902 seconds old, received at 2017-08-04 15:14:42.682107: osd_op(client.4235.0:28256 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5900> 2017-08-04 15:15:50.995880 7f72a16e8700 20 osd.4 op_wq(0) _process 1.5 item PGQueueable(0x7f72c92c9180 prio 63 cost 82 e29) queued
 -5899> 2017-08-04 15:15:50.995886 7f72a16e8700 20 osd.4 op_wq(0) _process 1.5 item PGQueueable(0x7f72c92c9180 prio 63 cost 82 e29) pg 0x7f72c871c000
 -5898> 2017-08-04 15:15:50.995881 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:43.230140 osd.4 osd.4 172.21.6.134:6800/16814 2003 : cluster [WRN] slow request 60.168964 seconds old, received at 2017-08-04 15:14:43.061045: osd_op(client.4235.0:28274 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5897> 2017-08-04 15:15:50.995893 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:43.230145 osd.4 osd.4 172.21.6.134:6800/16814 2004 : cluster [WRN] slow request 60.939156 seconds old, received at 2017-08-04 15:14:42.290853: osd_op(client.4235.0:28241 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5896> 2017-08-04 15:15:50.995890 7f72a16e8700 10 osd.4 29 dequeue_op 0x7f72c92c9180 prio 63 cost 82 latency 60.869120 osd_op(client.4235.0:28913 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call lock.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) v8 pg pg[1.5( v 2
9'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean]
 -5895> 2017-08-04 15:15:50.995905 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:43.230150 osd.4 osd.4 172.21.6.134:6800/16814 2005 : cluster [WRN] slow request 60.541462 seconds old, received at 2017-08-04 15:14:42.688548: osd_op(client.4235.0:28257 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5894> 2017-08-04 15:15:50.995920 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:44.230414 osd.4 osd.4 172.21.6.134:6800/16814 2006 : cluster [WRN] 628 slow requests, 5 included below; oldest blocked for > 76.752217 secs
 -5893> 2017-08-04 15:15:50.995920 7f72a16e8700 10 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] _handle_message: 0x7f72c92c9180
 -5892> 2017-08-04 15:15:50.995926 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:44.230422 osd.4 osd.4 172.21.6.134:6800/16814 2007 : cluster [WRN] slow request 60.750195 seconds old, received at 2017-08-04 15:14:43.480132: osd_op(client.4235.0:28307 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5891> 2017-08-04 15:15:50.995928 7f72a16e8700 20 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] do_op: op osd_op(client.4235.0:28913
 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call lock.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) v8
 -5890> 2017-08-04 15:15:50.995933 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:44.230427 osd.4 osd.4 172.21.6.134:6800/16814 2008 : cluster [WRN] slow request 60.618640 seconds old, received at 2017-08-04 15:14:43.611686: osd_op(client.4235.0:28328 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5889> 2017-08-04 15:15:50.995940 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:44.230431 osd.4 osd.4 172.21.6.134:6800/16814 2009 : cluster [WRN] slow request 60.857472 seconds old, received at 2017-08-04 15:14:43.372855: osd_op(client.4235.0:28290 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5888> 2017-08-04 15:15:50.995940 7f72a16e8700 20 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] op_has_sufficient_caps session=0x7f7
2c89ab000 pool=1 (rbd ) owner=0 need_read_cap=0 need_write_cap=0 classes=[class lock rd 1 wr 0 wl 1,class rbd rd 1 wr 1 wl 1] -> yes
 -5887> 2017-08-04 15:15:50.995953 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:44.230434 osd.4 osd.4 172.21.6.134:6800/16814 2010 : cluster [WRN] slow request 60.393879 seconds old, received at 2017-08-04 15:14:43.836448: osd_op(client.4235.0:28349 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5886> 2017-08-04 15:15:50.995960 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:44.230439 osd.4 osd.4 172.21.6.134:6800/16814 2011 : cluster [WRN] slow request 60.743889 seconds old, received at 2017-08-04 15:14:43.486437: osd_op(client.4235.0:28308 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5885> 2017-08-04 15:15:50.995953 7f72a16e8700 10 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] do_op osd_op(client.4235.0:28913 1.5
 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call lock.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) v8 may_write may_read -> write-ordered flags ondisk+write+known_if_redirected
 -5884> 2017-08-04 15:15:50.995963 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:45.230708 osd.4 osd.4 172.21.6.134:6800/16814 2012 : cluster [WRN] 587 slow requests, 5 included below; oldest blocked for > 77.040910 secs
 -5883> 2017-08-04 15:15:50.995973 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:45.230717 osd.4 osd.4 172.21.6.134:6800/16814 2013 : cluster [WRN] slow request 61.618104 seconds old, received at 2017-08-04 15:14:43.612526: osd_op(client.4235.0:28329 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5882> 2017-08-04 15:15:50.995982 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:45.230721 osd.4 osd.4 172.21.6.134:6800/16814 2014 : cluster [WRN] slow request 62.150209 seconds old, received at 2017-08-04 15:14:43.080421: osd_op(client.4235.0:28275 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5881> 2017-08-04 15:15:50.995975 7f72a16e8700 10 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] get_object_context: found obc in cac
he: 0x7f72c61a5080
 -5880> 2017-08-04 15:15:50.995989 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:45.230726 osd.4 osd.4 172.21.6.134:6800/16814 2015 : cluster [WRN] slow request 61.851485 seconds old, received at 2017-08-04 15:14:43.379144: osd_op(client.4235.0:28291 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5879> 2017-08-04 15:15:50.995996 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:45.230746 osd.4 osd.4 172.21.6.134:6800/16814 2016 : cluster [WRN] slow request 61.389125 seconds old, received at 2017-08-04 15:14:43.841505: osd_op(client.4235.0:28350 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5878> 2017-08-04 15:15:50.996002 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:45.230751 osd.4 osd.4 172.21.6.134:6800/16814 2017 : cluster [WRN] slow request 61.556101 seconds old, received at 2017-08-04 15:14:43.674529: osd_op(client.4235.0:28330 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5877> 2017-08-04 15:15:50.996009 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:46.231101 osd.4 osd.4 172.21.6.134:6800/16814 2018 : cluster [WRN] 568 slow requests, 5 included below; oldest blocked for > 77.931409 secs
 -5876> 2017-08-04 15:15:50.996015 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:46.231108 osd.4 osd.4 172.21.6.134:6800/16814 2019 : cluster [WRN] slow request 62.388363 seconds old, received at 2017-08-04 15:14:43.842589: osd_op(client.4235.0:28351 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5875> 2017-08-04 15:15:50.995991 7f72a16e8700 10 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] get_object_context: 0x7f72c61a5080 1
:a7a8f4c5:::rbd_object_map.108874b0dc51:head rwstate(excl n=1 w=383) oi: 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head(29'16267 client.4235.0:27936 dirty|omap_digest s 65630 uv 16267 od ffffffff alloc_hint [0 0 0]) exists: 1 ssc: 0x7f72c64a6f40 snapset: 0=[]:{}
 -5874> 2017-08-04 15:15:50.996021 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:46.231111 osd.4 osd.4 172.21.6.134:6800/16814 2020 : cluster [WRN] slow request 77.897424 seconds old, received at 2017-08-04 15:14:28.333528: osd_op(client.4235.0:27936 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5873> 2017-08-04 15:15:50.996028 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:46.231113 osd.4 osd.4 172.21.6.134:6800/16814 2021 : cluster [WRN] slow request 77.771194 seconds old, received at 2017-08-04 15:14:28.459758: osd_op(client.4235.0:27954 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5872> 2017-08-04 15:15:50.996035 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:46.231116 osd.4 osd.4 172.21.6.134:6800/16814 2022 : cluster [WRN] slow request 75.844501 seconds old, received at 2017-08-04 15:14:30.386451: osd_op(client.4235.0:27972 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5871> 2017-08-04 15:15:50.996024 7f72a16e8700 10 osd.4 pg_epoch: 29 pg[1.5( v 29'16267 (29'14700,29'16267] local-lis/les=18/19 n=1 ec=18/18 lis/c 18/18 les/c/f 19/19/0 18/18/18) [4,8] r=0 lpr=18 luod=29'16266 lua=29'16266 crt=29'16267 lcod 29'16265 mlcod 29'16265 active+clean] find_object_context 1:a7a8f4c5:::rbd
_object_map.108874b0dc51:head @head oi=1:a7a8f4c5:::rbd_object_map.108874b0dc51:head(29'16267 client.4235.0:27936 dirty|omap_digest s 65630 uv 16267 od ffffffff alloc_hint [0 0 0])
 -5870> 2017-08-04 15:15:50.996042 7f72acfbc700 10 log_client  logged 2017-08-04 15:15:46.231118 osd.4 osd.4 172.21.6.134:6800/16814 2023 : cluster [WRN] slow request 67.367446 seconds old, received at 2017-08-04 15:14:38.863506: osd_op(client.4235.0:27990 1.5 1:a7a8f4c5:::rbd_object_map.108874b0dc51:head [call loc
k.assert_locked,call rbd.object_map_update] snapc 0=[] ondisk+write+known_if_redirected e29) currently waiting for rw locks
 -5869> 2017-08-04 15:15:50.996055 7f72acfbc700 10 log_client  log_queue is 12 last_log 2036 sent 2023 num 12 unsent 13 sending 13

...
  -715> 2017-08-04 15:15:52.312376 7f72acfbc700 -1 /build/ceph-12.1.2-152-g704c9a7/src/common/LogClient.cc: In function 'Message* LogClient::_get_mon_log_message()' thread 7f72acfbc700 time 2017-08-04 15:15:50.996060
/build/ceph-12.1.2-152-g704c9a7/src/common/LogClient.cc: 293: FAILED assert(num_unsent <= log_queue.size())

 ceph version 12.1.2-152-g704c9a7 (704c9a79d0f493e55c4958072b857f5e9474c24b) luminous (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x10e) [0x7f72bb6a742e]
 2: (LogClient::_get_mon_log_message()+0xad1) [0x7f72bb67a361]
 3: (LogClient::get_mon_log_message(bool)+0x44) [0x7f72bb67a3c4]
 4: (MonClient::send_log(bool)+0x19) [0x7f72bb6b9ba9]
 5: (MonClient::ms_dispatch(Message*)+0x3b0) [0x7f72bb6c3e00]
 6: (DispatchQueue::entry()+0x78b) [0x7f72bb906f2b]
 7: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f72bb72ca7d]
 8: (()+0x8184) [0x7f72b916b184]
 9: (clone()+0x6d) [0x7f72b825b37d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

/a/sage-2017-08-04_13:49:55-rbd:singleton-bluestore-wip-sage-testing2-20170803b-distro-basic-mira/1482623

#4 Updated by Sage Weil over 1 year ago

  • Status changed from Need More Info to Need Review

#5 Updated by Sage Weil over 1 year ago

  • Project changed from fs to RADOS

#6 Updated by Sage Weil over 1 year ago

  • Duplicated by Bug #19427: common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size()) added

#7 Updated by Sage Weil over 1 year ago

  • Status changed from Need Review to Pending Backport

#8 Updated by Sage Weil over 1 year ago

  • Backport set to luminous

#9 Updated by Nathan Cutler over 1 year ago

  • Copied to Backport #20965: luminous: src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size()) added

#10 Updated by Nathan Cutler about 1 year ago

  • Status changed from Pending Backport to Resolved

#11 Updated by Josh Durgin 4 months ago

  • Status changed from Resolved to Verified
  • Target version deleted (v12.0.0)

Happened again in 12.2.4:

 1: (()+0xa3c011) [0x563d5d9e3011]
 2: (()+0xf680) [0x7fbe9d850680]
 3: (gsignal()+0x37) [0x7fbe9c871207]
 4: (abort()+0x148) [0x7fbe9c8728f8]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x563d5da21b14]
 6: (LogClient::_get_mon_log_message()+0xb04) [0x563d5d9f2264]
 7: (LogClient::get_mon_log_message(bool)+0x44) [0x563d5d9f22e4]
 8: (MonClient::send_log(bool)+0x19) [0x563d5da34c19]
 9: (MonClient::tick()+0x402) [0x563d5da3f8c2]
 10: (Context::complete(int)+0x9) [0x563d5d4caef9]
 11: (SafeTimer::timer_thread()+0x104) [0x563d5da1cc24]
 12: (SafeTimerThread::entry()+0xd) [0x563d5da1e64d]
 13: (()+0x7dd5) [0x7fbe9d848dd5]
 14: (clone()+0x6d) [0x7fbe9c939b3d]

#12 Updated by Josh Durgin 4 months ago

    -3> 2018-07-10 10:17:35.009392 7fbe82f76700 10 monclient: tick
    -2> 2018-07-10 10:17:35.009422 7fbe82f76700 10 monclient: _check_auth_rotating have uptodate secrets (they expire 
after 2018-07-10 10:17:05.009418)
    -1> 2018-07-10 10:17:35.009449 7fbe82f76700 10 log_client  log_queue is 1 last_log 2 sent 0 num 1 unsent 2 sending
 2
     0> 2018-07-10 10:17:35.013665 7fbe82f76700 -1 /builddir/build/BUILD/ceph-12.2.4/src/common/LogClient.cc: In funct
ion 'Message* LogClient::_get_mon_log_message()' thread 7fbe82f76700 time 2018-07-10 10:17:35.009470
/builddir/build/BUILD/ceph-12.2.4/src/common/LogClient.cc: 294: FAILED assert(num_unsent <= log_queue.size())

#13 Updated by David Young 4 months ago

Noting the same issue, per ceph-users list post:

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-July/028085.html

Also available in: Atom PDF