Bug #51307
closed
LibRadosWatchNotify.Watch2Delete fails
Added by Sage Weil almost 3 years ago.
Updated about 2 years ago.
Description
2021-06-20T20:11:53.203 INFO:tasks.workunit.client.0.smithi079.stdout: api_watch_notify: [ RUN ] LibRadosWatchNotify.Watch2Delete
2021-06-20T20:11:53.204 INFO:tasks.workunit.client.0.smithi079.stdout: api_watch_notify: waiting up to 300 for disconnect notification ...
2021-06-20T20:11:53.204 INFO:tasks.workunit.client.0.smithi079.stdout: api_watch_notify: watch_notify2_test_errcb cookie 94538226300544 err -107
2021-06-20T20:11:53.204 INFO:tasks.workunit.client.0.smithi079.stdout: api_watch_notify: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-5376-ge1b58156/rpm/el8/BUILD/ceph-17.0.0-5376-ge1b58156/src/test/librados/watch_notify.cc:165: Failure
2021-06-20T20:11:53.204 INFO:tasks.workunit.client.0.smithi079.stdout: api_watch_notify: Expected equality of these values:
2021-06-20T20:11:53.204 INFO:tasks.workunit.client.0.smithi079.stdout: api_watch_notify: -107
2021-06-20T20:11:53.205 INFO:tasks.workunit.client.0.smithi079.stdout: api_watch_notify: rados_watch_check(ioctx, handle)
2021-06-20T20:11:53.205 INFO:tasks.workunit.client.0.smithi079.stdout: api_watch_notify: Which is: -2
2021-06-20T20:11:53.205 INFO:tasks.workunit.client.0.smithi079.stdout: api_watch_notify: [ FAILED ] LibRadosWatchNotify.Watch2Delete (69145 ms)
/a/sage-2021-06-20_15:30:30-rados-wip-sage-testing-2021-06-20-1007-distro-basic-smithi/6181813
- Related to Bug #50042: rados/test.sh: api_watch_notify failures added
/a/yuriw-2021-07-27_17:19:39-rados-wip-yuri-testing-2021-07-27-0830-pacific-distro-basic-smithi/6297201
/a/yuriw-2021-12-03_15:27:18-rados-wip-yuri11-testing-2021-12-02-1451-distro-default-smithi/6542889
2021-12-03T19:18:27.907 INFO:tasks.workunit.client.0.smithi058.stdout: api_watch_notify: [ RUN ] LibRadosWatchNotify.Watch2Delete
2021-12-03T19:18:27.907 INFO:tasks.workunit.client.0.smithi058.stdout: api_watch_notify: waiting up to 300 for disconnect notification ...
2021-12-03T19:18:27.907 INFO:tasks.workunit.client.0.smithi058.stdout: api_watch_notify: watch_notify2_test_errcb cookie 94744431035984 err -107
2021-12-03T19:18:27.907 INFO:tasks.workunit.client.0.smithi058.stdout: api_watch_notify: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-9426-ge65eeb2b/rpm/el8/BUILD/ceph-17.0.0-9426-ge65eeb2b/src/test/librados/watch_notify.cc:165: Failure
2021-12-03T19:18:27.908 INFO:tasks.workunit.client.0.smithi058.stdout: api_watch_notify: Expected equality of these values:
2021-12-03T19:18:27.908 INFO:tasks.workunit.client.0.smithi058.stdout: api_watch_notify: -107
2021-12-03T19:18:27.908 INFO:tasks.workunit.client.0.smithi058.stdout: api_watch_notify: rados_watch_check(ioctx, handle)
2021-12-03T19:18:27.908 INFO:tasks.workunit.client.0.smithi058.stdout: api_watch_notify: Which is: -2
2021-12-03T19:18:27.908 INFO:tasks.workunit.client.0.smithi058.stdout: api_watch_notify: [ FAILED ] LibRadosWatchNotify.Watch2Delete (143755 ms)
/a/yuriw-2022-02-16_00:25:26-rados-wip-yuri-testing-2022-02-15-1431-distro-default-smithi/6687338
Laura Flores wrote:
/a/yuriw-2022-02-16_00:25:26-rados-wip-yuri-testing-2022-02-15-1431-distro-default-smithi/6687338
I think we need the following change in watch_notify.cc:
int rados_watch_check_err = rados_watch_check(ioctx, handle);
// We may hit ENOENT due to socket failure injection and a forced reconnect
EXPECT_TRUE(rados_watch_check_err == -ENOTCONN || rados_watch_check_err == -ENOENT);
I couldn't find any socket failure injection, but a reconnect is present around the time of the error.
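For illustration, the relaxed check proposed above can be sketched as a small standalone predicate. This is only a sketch under assumptions: `is_tolerated_watch_check_error` is a hypothetical name that does not exist in librados or in watch_notify.cc, and the snippet deliberately avoids librados and gtest so that it compiles on its own.

```cpp
#include <cerrno>

// Hypothetical helper, not part of librados or the Ceph test suite.
// Returns true when a rados_watch_check() result is one of the two error
// codes discussed in this ticket:
//   -ENOTCONN : the value the test currently expects
//   -ENOENT   : the value actually observed in the failing runs
constexpr bool is_tolerated_watch_check_error(int rc) {
  return rc == -ENOTCONN || rc == -ENOENT;
}

// Compile-time sanity checks of the predicate itself.
static_assert(is_tolerated_watch_check_error(-ENOTCONN),
              "-ENOTCONN should be tolerated");
static_assert(is_tolerated_watch_check_error(-ENOENT),
              "-ENOENT should be tolerated");
static_assert(!is_tolerated_watch_check_error(0),
              "success is not an expected result here");
```

In the test, the result of rados_watch_check(ioctx, handle) would be passed to such a predicate inside EXPECT_TRUE. Whether the merged fix takes exactly this form is not shown in this ticket.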
- Status changed from New to In Progress
- Assignee set to Nitzan Mordechai
In that case it was not socket failure injection; it was:
2022-02-16T09:56:22.598+0000 15af4700 1 -- [v2:172.21.15.124:3300/0,v1:172.21.15.124:6789/0] >> conn(0x1aa6a930 msgr2=0x18b77000 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0)._try_send send error: (32) Broken pipe
That causes the reconnect and the return of -102.
I'll add my suggestion to the code.
- Status changed from In Progress to Fix Under Review
- Status changed from Fix Under Review to Pending Backport
- Backport set to quincy
- Copied to Backport #55021: quincy: LibRadosWatchNotify.Watch2Delete fails added
- Pull request ID set to 45366
- Status changed from Pending Backport to Resolved