Bug #1415

closed

cosd assertion: existing->state == STATE_CONNECTING || existing->state == STATE_OPEN

Added by Sam Lang over 12 years ago. Updated over 12 years ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%


Description

end of the log:

2011-08-18 19:45:52.794270 7f90a6161700 -- 192.168.101.11:6803/12333 >> 192.168.101.112:6819/10241 pipe(0xf9a4500 sd=66 pgs=1665 cs=2 l=0).connect got RESETSESSION
2011-08-18 19:45:53.665509 7f90a1b1b700 -- 192.168.101.11:6803/12333 >> 192.168.101.14:6814/11168 pipe(0x25cafc80 sd=105 pgs=2690 cs=1 l=0).fault with nothing to send, going to standby
2011-08-18 19:46:01.808856 7f909a4a5700 -- 192.168.101.11:6803/12333 >> 192.168.101.14:6815/11168 pipe(0x26b7f000 sd=99 pgs=2724 cs=1 l=0).fault with nothing to send, going to standby
2011-08-18 19:46:02.713175 7f909d4d5700 -- 192.168.101.11:6801/12332 >> 192.168.101.14:6819/11168 pipe(0x21f2c500 sd=79 pgs=0 cs=0 l=0).accept connect_seq 0 vs existing 0 state 3
../../src/msg/SimpleMessenger.cc: In function 'int SimpleMessenger::Pipe::accept()', in thread '0x7f909d4d5700'
../../src/msg/SimpleMessenger.cc: 841: FAILED assert(existing->state == STATE_CONNECTING || existing->state == STATE_OPEN)
ceph version (commit:)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x89) [0x8f42ed]
2: (SimpleMessenger::Pipe::accept()+0x25bb) [0x8ac537]
3: (SimpleMessenger::Pipe::reader()+0x48) [0x8b3ca0]
4: (SimpleMessenger::Pipe::Reader::entry()+0x1c) [0x76f404]
5: (Thread::_entry_func(void*)+0x23) [0x917095]
6: (()+0x6d8c) [0x7f90b7741d8c]
7: (clone()+0x6d) [0x7f90b5f8304d]
ceph version (commit:)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x89) [0x8f42ed]
2: (SimpleMessenger::Pipe::accept()+0x25bb) [0x8ac537]
3: (SimpleMessenger::Pipe::reader()+0x48) [0x8b3ca0]
4: (SimpleMessenger::Pipe::Reader::entry()+0x1c) [0x76f404]
5: (Thread::_entry_func(void*)+0x23) [0x917095]
6: (()+0x6d8c) [0x7f90b7741d8c]
7: (clone()+0x6d) [0x7f90b5f8304d]
*** Caught signal (Aborted) **
in thread 0x7f909d4d5700
ceph version (commit:)
1: (ceph::BackTrace::BackTrace(int)+0x2d) [0x8f4669]
2: /usr/ceph/bin/cosd() [0x9033db]
3: (()+0xfc60) [0x7f90b774ac60]
4: (gsignal()+0x35) [0x7f90b5ed0d05]
5: (abort()+0x186) [0x7f90b5ed4ab6]
6: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f90b67876dd]
7: (()+0xb9926) [0x7f90b6785926]
8: (()+0xb9953) [0x7f90b6785953]
9: (()+0xb9a5e) [0x7f90b6785a5e]
10: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1f3) [0x8f4457]
11: (SimpleMessenger::Pipe::accept()+0x25bb) [0x8ac537]
12: (SimpleMessenger::Pipe::reader()+0x48) [0x8b3ca0]
13: (SimpleMessenger::Pipe::Reader::entry()+0x1c) [0x76f404]
14: (Thread::_entry_func(void*)+0x23) [0x917095]
15: (()+0x6d8c) [0x7f90b7741d8c]
16: (clone()+0x6d) [0x7f90b5f8304d]

backtrace:

(gdb) bt
#0 0x00007f90b774ab3b in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x000000000090322e in reraise_fatal (signum=6) at ../../src/global/signal_handler.cc:59
#2 0x000000000090344c in handle_fatal_signal (signum=6) at ../../src/global/signal_handler.cc:106
#3 <signal handler called>
#4 0x00007f90b5ed0d05 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#5 0x00007f90b5ed4ab6 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x00007f90b67876dd in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00007f90b6785926 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x00007f90b6785953 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00007f90b6785a5e in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00000000008f4457 in ceph::__ceph_assert_fail (assertion=0xa33ef0 "existing->state == STATE_CONNECTING || existing->state == STATE_OPEN", file=0xa33968 "../../src/msg/SimpleMessenger.cc", line=841,
func=0xa35e80 "int SimpleMessenger::Pipe::accept()") at ../../src/common/assert.cc:70
#11 0x00000000008ac537 in SimpleMessenger::Pipe::accept (this=0x21f2c500) at ../../src/msg/SimpleMessenger.cc:840
#12 0x00000000008b3ca0 in SimpleMessenger::Pipe::reader (this=0x21f2c500) at ../../src/msg/SimpleMessenger.cc:1546
#13 0x000000000076f404 in SimpleMessenger::Pipe::Reader::entry (this=0x21f2c730) at ../../src/msg/SimpleMessenger.h:205
#14 0x0000000000917095 in Thread::_entry_func (arg=0x21f2c730) at ../../src/common/Thread.cc:45
#15 0x00007f90b7741d8c in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#16 0x00007f90b5f8304d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#17 0x0000000000000000 in ?? ()


Related issues 2 (0 open, 2 closed)

Is duplicate of Messengers - Bug #1378: connection race - existing connection not open or connecting (Resolved, 08/08/2011)

Has duplicate Ceph - Bug #1602: mon crash during startup (Duplicate, 10/06/2011)

Actions #1

Updated by Sage Weil over 12 years ago

  • Target version set to v0.35
Actions #2

Updated by Greg Farnum over 12 years ago

Do we know what version this was run on? This looks to me like the assert we saw when OSDs were connecting to themselves, although it was fixed in 8bcc639ab2171827286dafb42ef4635477dee8f1 prior to v0.31.

Or maybe you've found another way of hitting the same basic issue that we didn't guard against.

Actions #3

Updated by Sage Weil over 12 years ago

From the log it doesn't look like it's a connect to self. The interesting thing is that existing->state is STANDBY. Have to think a bit more about how we would get there.

Actions #4

Updated by Sage Weil over 12 years ago

  • Target version changed from v0.35 to v0.36
Actions #5

Updated by Sage Weil over 12 years ago

  • Target version deleted (v0.36)
Actions #6

Updated by Sage Weil over 12 years ago

  • Status changed from New to Duplicate