Project

General

Profile

Actions

Bug #19427

closed

common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())

Added by Sage Weil about 7 years ago. Updated over 6 years ago.

Status:
Duplicate
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
jewel, kraken
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

   -35> 2017-03-29 22:36:36.972567 7f2538e96700 20 osd.2 756 should_share_map osd.1 172.21.15.117:6814/13151 756
   -34> 2017-03-29 22:36:36.972564 7f2546eb2700  1 -- 172.21.15.117:6808/1991 <== mon.2 172.21.15.117:6790/0 3 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 580+0+0 (1929996202 0 0) 0x565217318800 con 0x56521713b3c0
   -33> 2017-03-29 22:36:36.972571 7f2538e96700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] handle_message: 0x565216b7db00
   -32> 2017-03-29 22:36:36.972575 7f2538e96700 15 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] do_sub_op osd_sub_op(unknown.0.0:0 0.28 MIN [scrub-map] v 0'0 snapset=0=[]:[]) v12
   -31> 2017-03-29 22:36:36.972579 7f2538e96700  7 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] sub_op_scrub_map
   -30> 2017-03-29 22:36:36.972582 7f2538e96700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing]  got 1 scrub map
   -29> 2017-03-29 22:36:36.972586 7f2538e96700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] map version is 0'0
   -28> 2017-03-29 22:36:36.972589 7f2538e96700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] requeue_scrub: queueing
   -27> 2017-03-29 22:36:36.972591 7f2538e96700 20 osd.2 op_wq(0) _enqueue 0.28 PGQueueable(PGScrub prio 5 cost 52428800 e756)
   -26> 2017-03-29 22:36:36.972596 7f2538e96700 10 osd.2 756 dequeue_op 0x565216b7db00 finish
   -25> 2017-03-29 22:36:36.972600 7f253b69b700 20 osd.2 op_wq(0) _process 0.28 item PGQueueable(PGScrub prio 5 cost 52428800 e756) queued
   -24> 2017-03-29 22:36:36.972602 7f2538e96700 20 osd.2 op_wq(0) _process empty q, waiting
   -23> 2017-03-29 22:36:36.972608 7f253b69b700 20 osd.2 op_wq(0) _process 0.28 item PGQueueable(PGScrub prio 5 cost 52428800 e756) pg 0x565215be2000
   -22> 2017-03-29 22:36:36.972608 7f2546eb2700  1 -- 172.21.15.117:6808/1991 mark_down 0x565217138000 -- 0x56521562b400
   -21> 2017-03-29 22:36:36.972610 7f253b69b700 20 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub state WAIT_REPLICAS
   -20> 2017-03-29 22:36:36.972614 7f253b69b700 20 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub state COMPARE_MAPS
   -19> 2017-03-29 22:36:36.972617 7f253b69b700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_compare_maps has maps, analyzing
   -18> 2017-03-29 22:36:36.972619 7f253b69b700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_compare_maps  comparing replica scrub maps
   -17> 2017-03-29 22:36:36.972622 7f253b69b700  2 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_compare_maps   osd.2 has 0 items
   -16> 2017-03-29 22:36:36.972625 7f253b69b700  2 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_compare_maps replica 1 has 0 items
   -15> 2017-03-29 22:36:36.972629 7f253b69b700  2 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] 
   -14> 2017-03-29 22:36:36.972630 7f2546eb2700  1 monclient: found mon.c
   -13> 2017-03-29 22:36:36.972632 7f2546eb2700 10 monclient: _send_mon_message to mon.c at 172.21.15.117:6790/0
   -12> 2017-03-29 22:36:36.972632 7f253b69b700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_snapshot_metadata
   -11> 2017-03-29 22:36:36.972633 7f2546eb2700  1 -- 172.21.15.117:6808/1991 --> 172.21.15.117:6790/0 -- mon_subscribe({mgrmap=0+,monmap=3+,osd_pg_creates=0+,osdmap=757}) v2 -- ?+0 0x565216867400 con 0x56521713b3c0
   -10> 2017-03-29 22:36:36.972636 7f253b69b700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_snapshot_metadata (scrub) finish
    -9> 2017-03-29 22:36:36.972639 7f2546eb2700 10 monclient: _send_mon_message to mon.c at 172.21.15.117:6790/0
    -8> 2017-03-29 22:36:36.972642 7f2546eb2700  1 -- 172.21.15.117:6808/1991 --> 172.21.15.117:6790/0 -- pg_stats(39 pgs tid 177 v 0) v1 -- ?+0 0x565215f23e40 con 0x56521713b3c0
    -7> 2017-03-29 22:36:36.972640 7f253b69b700 20 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub state FINISH
    -6> 2017-03-29 22:36:36.972645 7f253b69b700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub got 0/0 objects, 0/0 clones, 0/0 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 0/0 bytes, 0/0 hit_set_archive bytes.
    -5> 2017-03-29 22:36:36.972654 7f2546eb2700 10 monclient: _send_mon_message to mon.c at 172.21.15.117:6790/0
    -4> 2017-03-29 22:36:36.972652 7f253b69b700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_process_inconsistent: checking authoritative
    -3> 2017-03-29 22:36:36.972656 7f2546eb2700  1 -- 172.21.15.117:6808/1991 --> 172.21.15.117:6790/0 -- pg_stats(39 pgs tid 178 v 0) v1 -- ?+0 0x5652197f19c0 con 0x56521713b3c0
    -2> 2017-03-29 22:36:36.972659 7f253b69b700  0 log_channel(cluster) log [INF] : 0.28 scrub ok
    -1> 2017-03-29 22:36:36.972660 7f2546eb2700 10 log_client  log_queue is 3 last_log 600 sent 596 num 3 unsent 4 sending 4
     0> 2017-03-29 22:36:36.976280 7f2546eb2700 -1 /build/ceph-12.0.0-2001-gcc51f56/src/common/LogClient.cc: In function 'Message* LogClient::_get_mon_log_message()' thread 7f2546eb2700 time 2017-03-29 22:36:36.972662
/build/ceph-12.0.0-2001-gcc51f56/src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())

 ceph version 12.0.0-2001-gcc51f56 (cc51f56bb2c153bf35078eb95311e1d35c118c9f)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x56520b4d28b2]
 2: (LogClient::_get_mon_log_message()+0x13e5) [0x56520b482d25]
 3: (LogClient::get_mon_log_message()+0x1f) [0x56520b482ecf]
 4: (MonClient::send_log()+0x15) [0x56520b4e64a5]
 5: (MonClient::handle_auth(MAuthReply*)+0x348) [0x56520b4ef488]
 6: (MonClient::ms_dispatch(Message*)+0x39b) [0x56520b4efa9b]
 7: (DispatchQueue::entry()+0xf4a) [0x56520b69e94a]
 8: (DispatchQueue::DispatchThread::entry()+0xd) [0x56520b55a33d]
 9: (()+0x770a) [0x7f25590bd70a]
 10: (clone()+0x6d) [0x7f255813482d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

this was on wip-mgr-commands, but no changes were made the osd or logclient, so I'm guessing this is related to the monclient multiconnection refactor?

/a/sage-2017-03-29_21:10:58-rados-wip-mgr-commands---basic-smithi/962237


Related issues 4 (0 open4 closed)

Has duplicate Ceph - Bug #20680: osd: FAILED assert(num_unsent <= log_queue.size())Closed07/18/2017

Actions
Is duplicate of RADOS - Bug #18209: src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())Duplicate12/09/2016

Actions
Copied to Ceph - Backport #19617: jewel: common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())ResolvedAlexey SheplyakovActions
Copied to Ceph - Backport #19618: kraken: common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())ResolvedAlexey SheplyakovActions
Actions #1

Updated by Sage Weil about 7 years ago

  • Description updated (diff)
Actions #2

Updated by Kefu Chai about 7 years ago

  • Assignee set to Kefu Chai
Actions #3

Updated by Sage Weil about 7 years ago

/a/sage-2017-04-07_04:46:46-rados:thrash-wip-bluestore-osr-drain-hang---basic-smithi/995356

Actions #4

Updated by Kefu Chai about 7 years ago

it's a race of LogChannel::do_log() and the "log_client->reset_session(); send_log()" MonClient::handle_auth(). this is not introduced by the MonClient multi-con change. and it is also in jewel.

    -2> 2017-03-29 22:36:36.972659 7f253b69b700  0 log_channel(cluster) log [INF] : 0.28 scrub ok
    -1> 2017-03-29 22:36:36.972660 7f2546eb2700 10 log_client  log_queue is 3 last_log 600 sent 596 num 3 unsent 4 sending 4
     0> 2017-03-29 22:36:36.976280 7f2546eb2700 -1 /build/ceph-12.0.0-2001-gcc51f56/src/common/LogClient.cc: In function 'Message* LogClient::_get_mon_log_message()' thread 7f2546eb2700 time 2017-03-29 22:36:36.972662
/build/ceph-12.0.0-2001-gcc51f56/src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())
  1. reset_session() updates the "last_log_sent"
  2. osd->clog->info() calls do_log, which calls
    1. LogClient::get_next_seq() // last_log is atomic, and without log_lock
    2. LogClient::queue() // with log_lock
  3. MonClient::send_log() calls LogClient::get_mon_log_message()
    1. LogClient::get_mon_log_message() calls _get_mon_log_message() with log_lock

but osd->clog->info() is not atomic in the sense that get_mon_log_message could be called after LogClient::last_log is updated by LogClient::get_next_seq() and before LogClient::queue(), that's why num_unsent was 4 and log_queue.size() is 3, which should have been 4 right after the LogClient::queue() call.

Actions #5

Updated by Kefu Chai about 7 years ago

  • Backport set to jewel, kraken
Actions #6

Updated by Kefu Chai about 7 years ago

  • Status changed from New to Fix Under Review
Actions #7

Updated by Sage Weil about 7 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #8

Updated by Nathan Cutler about 7 years ago

  • Copied to Backport #19617: jewel: common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size()) added
Actions #9

Updated by Nathan Cutler about 7 years ago

  • Copied to Backport #19618: kraken: common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size()) added
Actions #10

Updated by Nathan Cutler almost 7 years ago

  • Status changed from Pending Backport to Resolved
Actions #11

Updated by Sage Weil almost 7 years ago

  • Status changed from Resolved to In Progress
  • Priority changed from Urgent to Immediate

this popped up again!

2017-04-28T19:11:13.357 INFO:tasks.ceph.osd.4.smithi117.stderr:/build/ceph-12.0.1-1689-gcd86908/src/common/LogClient.cc: In function 'Message* LogClient::_get_mon_log_message()' thread 7fab5b2dd700 time 2017-04-28 19:11:13.360971
2017-04-28T19:11:13.357 INFO:tasks.ceph.osd.4.smithi117.stderr:/build/ceph-12.0.1-1689-gcd86908/src/common/LogClient.cc: 308: FAILED assert(num_unsent <= log_queue.size())
2017-04-28T19:11:13.365 INFO:tasks.ceph.osd.4.smithi117.stderr: ceph version 12.0.1-1689-gcd86908 (cd86908220578fbe091355ce3d8d18166e4cb352)
2017-04-28T19:11:13.365 INFO:tasks.ceph.osd.4.smithi117.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55b2cda00552]
2017-04-28T19:11:13.365 INFO:tasks.ceph.osd.4.smithi117.stderr: 2: (LogClient::_get_mon_log_message()+0x13e5) [0x55b2cd9d1745]
2017-04-28T19:11:13.365 INFO:tasks.ceph.osd.4.smithi117.stderr: 3: (LogClient::get_mon_log_message(bool)+0x25) [0x55b2cd9d18f5]
2017-04-28T19:11:13.365 INFO:tasks.ceph.osd.4.smithi117.stderr: 4: (MonClient::send_log(bool)+0x19) [0x55b2cda14149]
2017-04-28T19:11:13.365 INFO:tasks.ceph.osd.4.smithi117.stderr: 5: (MonClient::tick()+0x3c1) [0x55b2cda1db01]
2017-04-28T19:11:13.365 INFO:tasks.ceph.osd.4.smithi117.stderr: 6: (Context::complete(int)+0x9) [0x55b2cd491429]
2017-04-28T19:11:13.365 INFO:tasks.ceph.osd.4.smithi117.stderr: 7: (SafeTimer::timer_thread()+0x452) [0x55b2cd9fc7d2]
2017-04-28T19:11:13.366 INFO:tasks.ceph.osd.4.smithi117.stderr: 8: (SafeTimerThread::entry()+0xd) [0x55b2cd9fdbdd]
2017-04-28T19:11:13.366 INFO:tasks.ceph.osd.4.smithi117.stderr: 9: (()+0x770a) [0x7fab70e0a70a]
2017-04-28T19:11:13.366 INFO:tasks.ceph.osd.4.smithi117.stderr: 10: (clone()+0x6d) [0x7fab6fe8182d]
2017-04-28T19:11:13.366 INFO:tasks.ceph.osd.4.smithi117.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

/a/sage-2017-04-28_16:57:50-rados-wip-sage-testing2---basic-smithi/1077110

Actions #12

Updated by Kefu Chai almost 7 years ago

 -4704> 2017-04-28 19:11:13.360957 7fab532cd700  0 log_channel(cluster) log [INF] : osd_enable_op_tracker = 'true'
 -4703> 2017-04-28 19:11:13.360962 7fab5b2dd700 10 log_client  log_queue is 7 last_log 94 sent 86 num 7 unsent 8 sending 8

racing again: LogChannel::do_log() and LogClient::get_mon_log_message(false)

Actions #13

Updated by Kefu Chai almost 7 years ago

  • Status changed from In Progress to Fix Under Review
Actions #14

Updated by Kefu Chai almost 7 years ago

  • Status changed from Fix Under Review to Resolved
Actions #15

Updated by Kefu Chai almost 7 years ago

  • Status changed from Resolved to Pending Backport
Actions #16

Updated by Sage Weil almost 7 years ago

  • Status changed from Pending Backport to 12

This is still happening:

2017-05-16T23:55:22.468 INFO:tasks.ceph.osd.5.smithi198.stderr:     0> 2017-05-16 23:55:22.412915 7f5ea2d8e700 -1 /build/ceph-12.0.1-2423-ge3b049c/src/common/LogClient.cc: In function 'Message* LogClient::_get_mon_log_message()' thread 7f5ea2d8e700 time 2017-05-16 23:55:22.410335
2017-05-16T23:55:22.468 INFO:tasks.ceph.osd.5.smithi198.stderr:/build/ceph-12.0.1-2423-ge3b049c/src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())
2017-05-16T23:55:22.468 INFO:tasks.ceph.osd.5.smithi198.stderr:
2017-05-16T23:55:22.468 INFO:tasks.ceph.osd.5.smithi198.stderr: ceph version 12.0.1-2423-ge3b049c (e3b049cc24496285b07728c70033b1bd7a1e592a)
2017-05-16T23:55:22.468 INFO:tasks.ceph.osd.5.smithi198.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x5653aafdac72]
2017-05-16T23:55:22.468 INFO:tasks.ceph.osd.5.smithi198.stderr: 2: (LogClient::_get_mon_log_message()+0x13e5) [0x5653aafabb45]
2017-05-16T23:55:22.468 INFO:tasks.ceph.osd.5.smithi198.stderr: 3: (LogClient::get_mon_log_message(bool)+0x40) [0x5653aafabd10]
2017-05-16T23:55:22.468 INFO:tasks.ceph.osd.5.smithi198.stderr: 4: (MonClient::send_log(bool)+0x19) [0x5653aafee869]
2017-05-16T23:55:22.468 INFO:tasks.ceph.osd.5.smithi198.stderr: 5: (MonClient::tick()+0x3c1) [0x5653aaff81a1]
2017-05-16T23:55:22.469 INFO:tasks.ceph.osd.5.smithi198.stderr: 6: (Context::complete(int)+0x9) [0x5653aaa5e669]
2017-05-16T23:55:22.469 INFO:tasks.ceph.osd.5.smithi198.stderr: 7: (SafeTimer::timer_thread()+0x452) [0x5653aafd6ea2]
2017-05-16T23:55:22.469 INFO:tasks.ceph.osd.5.smithi198.stderr: 8: (SafeTimerThread::entry()+0xd) [0x5653aafd82ad]
2017-05-16T23:55:22.469 INFO:tasks.ceph.osd.5.smithi198.stderr: 9: (()+0x76ba) [0x7f5eb80bd6ba]
2017-05-16T23:55:22.469 INFO:tasks.ceph.osd.5.smithi198.stderr: 10: (clone()+0x6d) [0x7f5eb713482d]

/a/sage-2017-05-16_22:22:40-rados-wip-sage-testing2---basic-smithi/1186056

verified the commit contains the previous fix

gnit:build (wip-sage-testing2) 10:28 AM $ git branch --contains 4da64aa
g  master
  wip-bluestore-aio-warning
  wip-gui
  wip-list-objects-more
  wip-luminous-legacy-check
  wip-mempool-perf
  wip-mon-kvtype
  wip-mon-summary
  wip-osdmap-require
  wip-sage-testing
* wip-sage-testing2
gnit:build (wip-sage-testing2) 10:28 AM $ git rev-parse wip-sage-testing2
e3b049cc24496285b07728c70033b1bd7a1e592a

Actions #17

Updated by Sage Weil almost 7 years ago

    -1> 2017-06-05 17:19:59.922592 7fa504dfd700 10 log_client  log_queue is 4 last_log 4215 sent 4210 num 4 unsent 5 sending 5
     0> 2017-06-05 17:19:59.928726 7fa504dfd700 -1 /build/ceph-12.0.2-2248-g8c17cc5/src/common/LogClient.cc: In function 'Message* LogClient::_get_mon_log_message()' thread 7fa504dfd700 time 2017-06-05 17:19:59.922597
/build/ceph-12.0.2-2248-g8c17cc5/src/common/LogClient.cc: 293: FAILED assert(num_unsent <= log_queue.size())

 ceph version  12.0.2-2248-g8c17cc5 (8c17cc5e289cdd593b537d3b627451e9483d742b) luminous (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x563a70c78f72]
 2: (LogClient::_get_mon_log_message()+0x1502) [0x563a70c48e72]
 3: (LogClient::get_mon_log_message(bool)+0x43) [0x563a70c49023]
 4: (MonClient::send_log(bool)+0x19) [0x563a70c8c649]
 5: (MonClient::tick()+0x3c1) [0x563a70c96a11]
 6: (Context::complete(int)+0x9) [0x563a706f12a9]
 7: (SafeTimer::timer_thread()+0x452) [0x563a70c75092]
 8: (SafeTimerThread::entry()+0xd) [0x563a70c7649d]
 9: (()+0x76ba) [0x7fa51d8d96ba]
 10: (clone()+0x6d) [0x7fa51c95082d]

osd.5

/a/sage-2017-06-05_14:47:27-rados-wip-sage-testing-distro-basic-smithi/1260414

Actions #18

Updated by Kefu Chai almost 7 years ago

  • Status changed from 12 to Need More Info

the osd.5 log is huge, and i cannot use zless to find the log lines i am interested. would be great if we can have another smaller log with this failure.

Actions #19

Updated by Sage Weil almost 7 years ago

  • Priority changed from Immediate to Urgent
Actions #20

Updated by Sage Weil almost 7 years ago

  • Priority changed from Urgent to High
Actions #21

Updated by Patrick Donnelly almost 7 years ago

  • Has duplicate Bug #20680: osd: FAILED assert(num_unsent <= log_queue.size()) added
Actions #23

Updated by Kefu Chai almost 7 years ago

  • Status changed from Need More Info to 12

will take a look again tmr.

Actions #24

Updated by Sage Weil over 6 years ago

  • Status changed from 12 to Duplicate
Actions #25

Updated by Sage Weil over 6 years ago

  • Is duplicate of Bug #18209: src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size()) added
Actions

Also available in: Atom PDF