Bug #15368
closed
"api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure"
Description
Run: http://pulpito.ceph.com/teuthology-2016-04-03_22:00:01-rados-jewel-distro-basic-smithi/
Job: 106350
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2016-04-03_22:00:01-rados-jewel-distro-basic-smithi/106350/teuthology.log
2016-04-03T23:58:37.937 INFO:tasks.workunit.client.0.smithi030.stdout: api_misc: test/librados/misc.cc:71: Failure
2016-04-03T23:58:37.937 INFO:tasks.workunit.client.0.smithi030.stdout: api_misc: Expected: (0) != (rados_connect(cluster)), actual: 0 vs 0
2016-04-03T23:58:37.938 INFO:tasks.workunit.client.0.smithi030.stdout: api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure (16 ms)
2016-04-03T23:58:37.938 INFO:tasks.workunit.client.0.smithi030.stdout: api_misc: [----------] 1 test from LibRadosMiscConnectFailure (16 ms total)
Updated by Samuel Just about 8 years ago
370e4f773a5347a2d0c0493ccf3dc55909b75bce
Updated by Samuel Just about 8 years ago
- Assignee set to Samuel Just
The bug seems to be that in MonClient::authenticate, it's entirely possible that the connection is established between the timeout and when the thread is rescheduled. Two options in that case:
1) disconnect and return the error
2) return success
Option 2 seems cleaner, but we'll have to modify the test a bit to use ms_inject_delay* to ensure that the authentication happens more slowly than the timeout. Annoyingly, the async messenger does not yet support that. I think I'll implement option 2 and force that test to use the simple messenger.
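To make the two options concrete, here is a minimal sketch of the decision the waiting thread faces once its timed wait has expired and it is running again. It is a toy model, not Ceph code: `Session`, `TimeoutPolicy` and `resolve_timeout` are hypothetical names invented for illustration, and the real MonClient state is considerably more involved.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical model of a client session. These names are illustrative
// only and are not part of Ceph's MonClient API.
struct Session {
  bool authenticated = false;  // set by another thread when the handshake completes
  bool connected = true;       // the underlying connection is still open
};

enum class TimeoutPolicy { DisconnectAndFail, ReturnSuccess };

// Called by the waiting thread after its timed wait expired and it has been
// rescheduled. By this point the handshake may already have completed.
int resolve_timeout(Session& s, TimeoutPolicy policy) {
  // Invariant of this toy model: an authenticated session is connected.
  assert(!s.authenticated || s.connected);

  if (!s.authenticated) {
    return -ETIMEDOUT;         // genuine timeout, nothing raced with us
  }
  if (policy == TimeoutPolicy::DisconnectAndFail) {
    s.connected = false;       // option 1: tear down so the error is truthful
    s.authenticated = false;
    return -ETIMEDOUT;
  }
  return 0;                    // option 2: the session is usable, report success
}
```

With option 2 a caller that asked for a tiny timeout can still get 0 back, which is why the test would need some way to guarantee that authentication is slower than the timeout.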
Updated by Samuel Just about 8 years ago
Hmm, that doesn't really help: we still need a way to cancel the in-progress authentication...
Updated by Samuel Just about 8 years ago
Neither init() nor shutdown() seems to clear the authentication state.
Updated by David Zafman about 8 years ago
- Related to Bug #15477: "failed (workunit test osdc/stress_objectcacher.sh)" in rados-hammer-distro-basic-vps/ added
Updated by Samuel Just almost 8 years ago
- Related to Feature #16091: Monclient: hunt for mons in parallel added
Updated by Yuri Weinstein almost 8 years ago
- Release set to master
2016-06-26T10:26:55.932 INFO:tasks.workunit.client.0.smithi002.stdout: api_misc: Expected: (0) != (rados_connect(cluster)), actual: 0 vs 0
2016-06-26T10:26:55.932 INFO:tasks.workunit.client.0.smithi002.stdout: api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure (14 ms)
2016-06-26T10:26:55.932 INFO:tasks.workunit.client.0.smithi002.stdout: api_misc: 2016-06-26 17:26:54.089527 7f3edeffd700 1 -- 172.21.15.2:0/1573833271 >> 172.21.15.2:6790/0 conn(0x7f3f02ac3a00 sd=17 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=22 cs=1 l=1). == rx == mon.2 seq 6 0x7f3ed8000b90 osd_map(15..15 src has 1..15) v3
Updated by Yuri Weinstein over 7 years ago
2016-09-01T22:33:33.448 INFO:tasks.workunit.client.0.smithi026.stdout: api_misc: test/librados/misc.cc:70: Failure
2016-09-01T22:33:33.448 INFO:tasks.workunit.client.0.smithi026.stdout: api_misc: Expected: (0) != (rados_connect(cluster)), actual: 0 vs 0
2016-09-01T22:33:33.448 INFO:tasks.workunit.client.0.smithi026.stdout: api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure (28 ms)
Updated by Yuri Weinstein over 7 years ago
on master branch.
Run: http://pulpito.ceph.com/teuthology-2016-10-30_04:20:07-upgrade:jewel-x-master-distro-basic-vps/
Job: 502767
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2016-10-30_04:20:07-upgrade:jewel-x-master-distro-basic-vps/502767/teuthology.log
2016-10-30T05:26:17.244 INFO:tasks.workunit.client.0.vpm073.stdout: api_misc: 2016-10-30 05:26:17.085673 7fac68ba12c0 10 monclient: _send_mon_message to mon.c at 172.21.2.73:6790/0
2016-10-30T05:26:17.244 INFO:tasks.workunit.client.0.vpm073.stdout: api_misc: 2016-10-30 05:26:17.085677 7fac68ba12c0 1 -- 172.21.2.73:0/2458744259 --> 172.21.2.73:6790/0 -- mon_subscribe({mgrmap=0+}) v2 -- 0x7fac72034c00 con 0
2016-10-30T05:26:17.245 INFO:tasks.workunit.client.0.vpm073.stdout: api_misc: /srv/autobuild-ceph/gitbuilder.git/build/out~/ceph-11.0.2-791-g5354e7c/src/test/librados/misc.cc:70: Failure
2016-10-30T05:26:17.245 INFO:tasks.workunit.client.0.vpm073.stdout: api_misc: Expected: (0) != (rados_connect(cluster)), actual: 0 vs 0
2016-10-30T05:26:17.245 INFO:tasks.workunit.client.0.vpm073.stdout: api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure (236 ms)
Updated by Sage Weil over 7 years ago
- Subject changed from "api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure" in rados-jewel-distro-basic-smithi to "api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure"
- Priority changed from Normal to Urgent
/a/sage-2016-11-29_20:05:25-rados:thrash-master---basic-smithi/586352
Updated by Sage Weil over 7 years ago
https://github.com/ceph/ceph/pull/11128 should help
Updated by Nathan Cutler over 7 years ago
Sage, are you saying that http://tracker.ceph.com/issues/16091 should be backported to jewel? (The description of this bug indicates that the failure also happens in jewel rados suites)
Updated by Yuri Weinstein over 7 years ago
on master in smoke suite:
2016-12-13T06:21:28.184 INFO:tasks.workunit.client.0.vpm139.stdout: api_misc: 2016-12-13 06:21:23.174238 7f169effd700 1 -- 172.21.2.139:0/3078865336 <== mon.1 172.21.2.171:6789/0 4 ==== osd_map(38..38 src has 1..38) v3 ==== 2532+0+0 (2506938689 0 0) 0x7f168c0027b0 con 0x55d0509451f0
2016-12-13T06:21:28.185 INFO:tasks.workunit.client.0.vpm139.stdout: api_misc: /build/ceph-11.0.2-2422-ga3bf341/src/test/librados/misc.cc:70: Failure
2016-12-13T06:21:28.185 INFO:tasks.workunit.client.0.vpm139.stdout: api_misc: Expected: (0) != (rados_connect(cluster)), actual: 0 vs 0
2016-12-13T06:21:28.185 INFO:tasks.workunit.client.0.vpm139.stdout: api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure
Updated by Sage Weil about 7 years ago
- Status changed from New to Fix Under Review
Updated by Sage Weil about 7 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport set to kraken,jewel
Updated by Nathan Cutler about 7 years ago
- Copied to Backport #19561: kraken: "api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure" added
Updated by Nathan Cutler about 7 years ago
- Copied to Backport #19562: jewel: "api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure" added
Updated by Nathan Cutler almost 7 years ago
- Status changed from Pending Backport to Resolved