Bug #3675
Status: Closed
osd: hang during initial peering
% Done:
0%
Source:
Q/A
Backport:
bobtail
Description
the initial wait for healthy blocked on 2 pgs. ms inject socket failures = 500. everything was up.
no logs, so it's unclear why.
there were lots of messages like
2012-12-21 14:15:45.285022 7fc27eca0700 0 -- 10.214.132.10:6807/5960 >> 10.214.132.10:6802/5955 pipe(0x302cb40 sd=33 :6807 pgs=8 cs=3 l=0).reader got old message 1 <= 3 0x2fb4c40 pg_query(2.13 epoch 4) v2, discarding
2012-12-21 14:15:45.285233 7fc27eca0700 0 -- 10.214.132.10:6807/5960 >> 10.214.132.10:6802/5955 pipe(0x302cb40 sd=33 :6807 pgs=8 cs=3 l=0).reader got old message 2 <= 3 0x2fb4c40 pg_notify(2.14 epoch 4) v4, discarding
2012-12-21 14:15:45.285342 7fc27eca0700 0 -- 10.214.132.10:6807/5960 >> 10.214.132.10:6802/5955 pipe(0x302cb40 sd=33 :6807 pgs=8 cs=3 l=0).reader got old message 3 <= 3 0x2fb4c40 pg_notify(0.16 epoch 4) v4, discarding
but no messages related to the hung pgs.
job was
ubuntu@teuthology:/a/sage-2012-12-21_13:39:03-regression-next-testing-basic/19821$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: ec18aeecd4de479601363849d489668d8f12410c
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 500
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: accce830514c6b099eb0e00a8ae34396d14565a3
  s3tests:
    branch: next
  workunit:
    sha1: accce830514c6b099eb0e00a8ae34396d14565a3
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
tasks:
- chef: null
- clock: null
- ceph: null
- ceph-fuse: null
- workunit:
    clients:
      client.0:
      - rados/test_python.sh
but this was before the workunit ran, so just doing the cluster setup/teardown should reproduce eventually.
Updated by Sage Weil over 11 years ago
this is a messenger bug. if there is a socket error at the end of accept(), after the register_pipe(), we then fail and discard the queue. but it's possible that during that duration a racing thread queued an outgoing message on the connection... and we don't want that to get thrown out!
Updated by Sage Weil over 11 years ago
it took about 1500 iterations of this job to reproduce the hang:
ubuntu@teuthology:/a/sage-peer1/21827$ cat orig.config.yaml
nuke-on-error: true
overrides:
  ceph:
    branch: next
    conf:
      global:
        ms inject socket failures: 500
      osd:
        debug ms: 20
        debug osd: 20
    fs: btrfs
    log-whitelist:
    - slow request
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
tasks:
- clock: null
- ceph: null
Updated by Sage Weil over 11 years ago
2012-12-22 19:37:06.020016 7f7705d02700 10 -- 10.214.132.25:6804/15154 >> 10.214.132.24:6801/13245 pipe(0x3ac6900 sd=32 :6804 pgs=0 cs=0 l=0).accept: setting up session_security.
2012-12-22 19:37:06.020021 7f770de1a700 10 osd.1 pg_epoch: 3 pg[2.5( empty local-les=0 n=0 ec=1 les/c 0/1 3/3/3) [1,4] r=0 lpr=3 pi=1-2/2 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: querying info from osd.4
2012-12-22 19:37:06.020034 7f7705d02700 10 -- 10.214.132.25:6804/15154 >> 10.214.132.24:6801/13245 pipe(0x3ac6900 sd=32 :6804 pgs=0 cs=0 l=0).accept new session
2012-12-22 19:37:06.020047 7f7705d02700 10 -- 10.214.132.25:6804/15154 >> 10.214.132.24:6801/13245 pipe(0x3ac6900 sd=32 :6804 pgs=5 cs=1 l=0).accept success, connect_seq = 1, sending READY
2012-12-22 19:37:06.020063 7f7705d02700 10 -- 10.214.132.25:6804/15154 >> 10.214.132.24:6801/13245 pipe(0x3ac6900 sd=32 :6804 pgs=5 cs=1 l=0).accept features 67108863
2012-12-22 19:37:06.020082 7f7705d02700 10 -- 10.214.132.25:6804/15154 >> 10.214.132.24:6801/13245 pipe(0x3ac6900 sd=32 :6804 pgs=5 cs=1 l=0).register_pipe
2012-12-22 19:37:06.020104 7f7705d02700 0 -- 10.214.132.25:6804/15154 >> 10.214.132.24:6801/13245 pipe(0x3ac6900 sd=32 :6804 pgs=5 cs=1 l=0).injecting socket failure
2012-12-22 19:37:06.020109 7f770de1a700 10 -- 10.214.132.25:6804/15154 get_connection osd.4 10.214.132.24:6801/13245 existing 0x3ac6900
2012-12-22 19:37:06.020128 7f770de1a700 20 osd.1 3 _share_map_outgoing 0x3aaa6e0 don't know epoch, doing nothing
2012-12-22 19:37:06.020132 7f770de1a700 7 osd.1 3 do_queries querying osd.4 on 1 PGs
2012-12-22 19:37:06.020143 7f770de1a700 1 -- 10.214.132.25:6804/15154 --> 10.214.132.24:6801/13245 -- pg_query(2.5 epoch 3) v2 -- ?+0 0x3b448c0 con 0x3aaa6e0
2012-12-22 19:37:06.020137 7f7705d02700 2 -- 10.214.132.25:6804/15154 >> 10.214.132.24:6801/13245 pipe(0x3ac6900 sd=32 :6804 pgs=5 cs=1 l=0).fault 32: Broken pipe
2012-12-22 19:37:06.020155 7f770de1a700 20 -- 10.214.132.25:6804/15154 submit_message pg_query(2.5 epoch 3) v2 remote, 10.214.132.24:6801/13245, have pipe.
2012-12-22 19:37:06.020156 7f7705d02700 10 -- 10.214.132.25:6804/15154 >> 10.214.132.24:6801/13245 pipe(0x3ac6900 sd=32 :6804 pgs=5 cs=1 l=0).fault already closed|closing
2012-12-22 19:37:06.020191 7f7705d02700 10 -- 10.214.132.25:6804/15154 queue_reap 0x3ac6900
2012-12-22 19:37:06.020210 7f771662b700 10 -- 10.214.132.25:6804/15154 reaper
2012-12-22 19:37:06.020224 7f771662b700 10 -- 10.214.132.25:6804/15154 reaper reaping pipe 0x3ac6900 10.214.132.24:6801/13245
2012-12-22 19:37:06.020235 7f771662b700 10 -- 10.214.132.25:6804/15154 >> 10.214.132.24:6801/13245 pipe(0x3ac6900 sd=32 :6804 pgs=5 cs=1 l=0).discard_queue
2012-12-22 19:37:06.020232 7f770de1a700 10 osd.1 3 pg[2.5( empty local-les=0 n=0 ec=1 les/c 0/1 3/3/3) [1,4] r=0 lpr=3 pi=1-2/2 mlcod 0'0 peering] is new
2012-12-22 19:37:06.020248 7f771662b700 20 -- 10.214.132.25:6804/15154 >> 10.214.132.24:6801/13245 pipe(0x3ac6900 sd=32 :6804 pgs=5 cs=1 l=0). discard 0x3b448c0
2012-12-22 19:37:06.020266 7f771662b700 10 -- 10.214.132.25:6804/15154 >> 10.214.132.24:6801/13245 pipe(0x3ac6900 sd=32 :6804 pgs=5 cs=1 l=0).unregister_pipe
2012-12-22 19:37:06.020258 7f770de1a700 10 osd.1 pg_epoch: 3 pg[2.5( empty local-les=0 n=0 ec=1 les/c 0/1 3/3/3) [1,4] r=0 lpr=3 pi=1-2/2 mlcod 0'0 peering] notify (query_epoch:3, epoch_sent:3, info:2.5( empty local-les=0 n=0 ec=1 les/c 0/1 3/3/3)) from osd.5
2012-12-22 19:37:06.020205 7f7705d02700 10 -- 10.214.132.25:6804/15154 >> 10.214.132.24:6801/13245 pipe(0x3ac6900 sd=32 :6804 pgs=5 cs=1 l=0).reader done
the message is queued after fault... we don't check the pipe state or anything else after we take the pipe_lock in Pipe::send()
Updated by Sage Weil over 11 years ago
ok, this is actually also a race that can cause the register_pipe assert. the locking needs to be reworked here. putting a sleep(2) in fault() after pipe_lock unlock and msgr->lock lock got me
2012-12-22 21:37:30.360401 7fe85bfe7700 1 -- 10.3.64.22:0/21242 --> 10.3.64.22:6802/20741 -- osd_ping(ping e131 stamp 2012-12-22 21:37:30.360322) v2 -- ?+0 0x29dc510 con 0x7fe85403b900
2012-12-22 21:37:30.360553 7fe85bfe7700 20 -- 10.3.64.22:0/21242 submit_message osd_ping(ping e131 stamp 2012-12-22 21:37:30.360322) v2 remote, 10.3.64.22:6802/20741, had pipe, but it closed.
2012-12-22 21:37:30.360619 7fe85bfe7700 20 -- 10.3.64.22:0/21242 submit_message osd_ping(ping e131 stamp 2012-12-22 21:37:30.360322) v2 remote, 10.3.64.22:6802/20741, new pipe.
2012-12-22 21:37:30.360638 7fe85bfe7700 10 -- 10.3.64.22:0/21242 connect_rank to 10.3.64.22:6802/20741, creating pipe and registering
2012-12-22 21:37:30.361036 7fe85bfe7700 10 -- 10.3.64.22:0/21242 >> :/0 pipe(0x29dcc10 sd=-1 :0 s=1 pgs=0 cs=0 l=0).randomize_out_seq 732632941
2012-12-22 21:37:30.361324 7fe85bfe7700 10 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=-1 :0 s=1 pgs=0 cs=0 l=1).register_pipe
2012-12-22 21:37:30.361350 7fe86c6f7700 10 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=-1 :0 s=1 pgs=0 cs=0 l=1).writer: state = connecting policy.server=0
2012-12-22 21:37:30.361451 7fe86c6f7700 10 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=-1 :0 s=1 pgs=0 cs=0 l=1).connect 0
2012-12-22 21:37:30.361547 7fe86c6f7700 10 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=34 :0 s=1 pgs=0 cs=0 l=1).connecting to 10.3.64.22:6802/20741
2012-12-22 21:37:30.361773 7fe86c6f7700 2 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=34 :0 s=1 pgs=0 cs=0 l=1).connect error 10.3.64.22:6802/20741, 111: Connection refused
2012-12-22 21:37:30.361805 7fe86c6f7700 2 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=34 :0 s=1 pgs=0 cs=0 l=1).fault 111: Connection refused
2012-12-22 21:37:30.361840 7fe86c6f7700 0 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=34 :0 s=1 pgs=0 cs=0 l=1).fault
2012-12-22 21:37:30.361854 7fe86c6f7700 10 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=34 :0 s=1 pgs=0 cs=0 l=1).writer: state = connecting policy.server=0
2012-12-22 21:37:30.361865 7fe86c6f7700 10 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=34 :0 s=1 pgs=0 cs=0 l=1).connect 0
2012-12-22 21:37:30.361936 7fe86c6f7700 10 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=34 :0 s=1 pgs=0 cs=0 l=1).connecting to 10.3.64.22:6802/20741
2012-12-22 21:37:30.362004 7fe86c6f7700 2 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=34 :0 s=1 pgs=0 cs=0 l=1).connect error 10.3.64.22:6802/20741, 111: Connection refused
2012-12-22 21:37:30.362031 7fe86c6f7700 2 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=34 :0 s=1 pgs=0 cs=0 l=1).fault 111: Connection refused
2012-12-22 21:37:30.362052 7fe86c6f7700 10 -- 10.3.64.22:0/21242 >> 10.3.64.22:6802/20741 pipe(0x29dcc10 sd=34 :0 s=1 pgs=0 cs=0 l=1).fault waiting 0.200000
2012-12-22 21:37:30.380333 7fe85bfe7700 -1 msg/Pipe.cc: In function 'void Pipe::register_pipe()' thread 7fe85bfe7700 time 2012-12-22 21:37:30.361441
msg/Pipe.cc: 1039: FAILED assert(msgr->rank_pipe.count(peer_addr) == 0)
 ceph version 0.55.1-352-gbf98c06 (bf98c062aa3a688f7caa7dea25b15ef1f9c14ad3)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x80) [0x11a29ec]
 2: (Pipe::register_pipe()+0x194) [0x127fb42]
 3: (SimpleMessenger::connect_rank(entity_addr_t const&, int, Connection*, Message*)+0x2de) [0x11844c2]
 4: (SimpleMessenger::submit_message(Message*, Connection*, entity_addr_t const&, int, bool)+0xae7) [0x1185533]
 5: (SimpleMessenger::_send_message(Message*, Connection*, bool)+0x2e1) [0x1182a87]
 6: (SimpleMessenger::send_message(Message*, Connection*)+0x2d) [0x1189041]
 7: (OSD::heartbeat()+0x7f5) [0xea16c5]
 8: (OSD::heartbeat_entry()+0x41) [0xea03c1]
 9: (OSD::T_Heartbeat::entry()+0x1c) [0xed282c]
 10: (Thread::_entry_func(void*)+0x23) [0x119113d]
 11: (()+0x6b50) [0x7fe8772cdb50]
 12: (clone()+0x6d) [0x7fe8759506dd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Sage Weil over 11 years ago
- Status changed from Fix Under Review to Resolved
- Backport set to bobtail