Bug #49592 (closed): "test_notify.py" is timing out in upgrade-clients:client-upgrade-nautilus-pacific-pacific

Added by Yuri Weinstein about 3 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Normal
Target version: -
% Done: 0%
Source: Q/A
Tags: -
Backport: nautilus, octopus
Regression: No
Severity: 3 - minor
Reviewed: -
Affected Versions: -
ceph-qa-suite: -
Pull request ID: 41514
Crash signature (v1): -
Crash signature (v2): -

Description

Run: https://pulpito.ceph.com/teuthology-2021-03-03_17:23:34-upgrade-clients:client-upgrade-nautilus-pacific-pacific-distro-basic-gibba/
Jobs: all
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2021-03-03_17:23:34-upgrade-clients:client-upgrade-nautilus-pacific-pacific-distro-basic-gibba/5930144/teuthology.log

```
2021-03-03T17:41:59.203 DEBUG:teuthology.orchestra.remote:gibba044:/home/ubuntu/cephtest/workunits.list.client.1 is 4KB
2021-03-03T17:41:59.257 INFO:tasks.workunit:Running workunits matching rbd/notify_slave.sh on client.1...
2021-03-03T17:41:59.257 INFO:tasks.workunit:Running workunit rbd/notify_slave.sh...
2021-03-03T17:41:59.258 DEBUG:teuthology.orchestra.run.gibba044:workunit test rbd/notify_slave.sh> mkdir -p -- /home/ubuntu/cephtest/mnt.1/client.1/tmp && cd -- /home/ubuntu/cephtest/mnt.1/client.1/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=octopus TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="1" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.1 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.1 RBD_FEATURES=61 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.1/qa/workunits/rbd/notify_slave.sh
2021-03-03T17:41:59.275 INFO:tasks.workunit.client.1.gibba044.stderr:+ dirname /home/ubuntu/cephtest/clone.client.1/qa/workunits/rbd/notify_slave.sh
2021-03-03T17:41:59.276 INFO:tasks.workunit.client.1.gibba044.stderr:+ relpath=/home/ubuntu/cephtest/clone.client.1/qa/workunits/rbd/../../../src/test/librbd
2021-03-03T17:41:59.276 INFO:tasks.workunit.client.1.gibba044.stderr:+ python3 /home/ubuntu/cephtest/clone.client.1/qa/workunits/rbd/../../../src/test/librbd/test_notify.py slave
2021-03-03T20:41:57.460 DEBUG:teuthology.orchestra.run:got remote process result: 124
2021-03-03T20:41:57.498 INFO:tasks.workunit:Stopping ['rbd/notify_master.sh'] on client.0...
2021-03-03T20:41:57.499 DEBUG:teuthology.orchestra.run.gibba041:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2021-03-03T20:41:57.628 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/run_tasks.py", line 91, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/run_tasks.py", line 70, in run_one_task
    return task(**kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/task/parallel.py", line 56, in task
    p.spawn(_run_spawned, ctx, confg, taskname)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/parallel.py", line 84, in __exit__
    for result in self:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/parallel.py", line 98, in __next__
    resurrect_traceback(result)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/parallel.py", line 30, in resurrect_traceback
    raise exc.exc_info[1]
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/parallel.py", line 23, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/task/parallel.py", line 64, in _run_spawned
    mgr = run_tasks.run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/run_tasks.py", line 70, in run_one_task
    return task(**kwargs)
  File "/home/teuthworker/src/git.ceph.com_ceph_e713a2fcf55395640035401674cfa3db7100dbbe/qa/tasks/workunit.py", line 125, in task
    timeout=timeout,cleanup=cleanup)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/parallel.py", line 84, in __exit__
    for result in self:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/parallel.py", line 98, in __next__
    resurrect_traceback(result)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/parallel.py", line 30, in resurrect_traceback
    raise exc.exc_info[1]
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/parallel.py", line 23, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_ceph_e713a2fcf55395640035401674cfa3db7100dbbe/qa/tasks/workunit.py", line 411, in _run_tests
    label="workunit test {workunit}".format(workunit=workunit)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/orchestra/remote.py", line 215, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/orchestra/run.py", line 455, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_3082387bba74fcd24c9700593d10418152d53c97/teuthology/orchestra/run.py", line 183, in _raise_for_status
    node=self.hostname, label=self.label
teuthology.exceptions.CommandFailedError: Command failed (workunit test rbd/notify_master.sh) on gibba041 with status 124: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=nautilus TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 RBD_FEATURES=61 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/rbd/notify_master.sh'
```
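
The status 124 comes from the coreutils `timeout 3h` wrapper in the workunit command: `timeout` exits with 124 when it kills the wrapped command for exceeding its limit, so both notify scripts were still running after three hours. A minimal sketch of that classification (illustrative only, not teuthology code):

```
import subprocess

# Minimal sketch (not teuthology code): run a workunit under coreutils
# `timeout` and classify the result. `timeout` exits with status 124
# when it kills the wrapped command for exceeding the time limit.
def run_workunit(script, limit="3h"):
    proc = subprocess.run(["timeout", limit, "bash", script])
    if proc.returncode == 124:
        raise RuntimeError(f"workunit {script} timed out after {limit}")
    if proc.returncode != 0:
        raise RuntimeError(f"workunit {script} failed with status {proc.returncode}")
```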

Related issues 2 (0 open, 2 closed)

Copied to rbd - Backport #50995: nautilus: "test_notify.py" is timing out in upgrade-clients:client-upgrade-nautilus-pacific-pacific (Resolved, Deepika Upadhyay)
Copied to rbd - Backport #50996: octopus: "test_notify.py" is timing out in upgrade-clients:client-upgrade-nautilus-pacific-pacific (Resolved, Deepika Upadhyay)
#3

Updated by Deepika Upadhyay almost 3 years ago

```
2021-04-30T07:27:58.635+0000 7f564caa0700  5 librbd::Watcher: 0x7f5628013830 notifications_blocked: blocked=0
2021-04-30T07:27:58.635+0000 7f564caa0700 10 librbd::Watcher::C_NotifyAck 0x7f562805ac90 C_NotifyAck: id=77309411337, handle=140008015411360
2021-04-30T07:27:58.635+0000 7f564caa0700 20 librbd::ImageWatcher: 0x7f5628013830 request progress: [4267,140008015411360,2] @ 2/16
2021-04-30T07:27:58.635+0000 7f564caa0700 20 librbd::ImageWatcher: scheduling async request time out: [4267,140008015411360,2]
2021-04-30T07:27:58.635+0000 7f564caa0700 10 librbd::Watcher::C_NotifyAck 0x7f562805ac90 finish: r=0
2021-04-30T07:27:58.635+0000 7f564caa0700 10 librbd::Watcher: 0x7f5628013830 acknowledge_notify: acknowledge_notify id=77309411337, handle=140008015411360
2021-04-30T07:27:58.635+0000 7f564caa0700  1 -- 172.21.15.73:0/1676283437 --> [v2:172.21.15.75:6800/14774,v1:172.21.15.75:6801/14774] -- osd_op(unknown.0.0:1056 2.7 2:ff185ea2:::rbd_header.10e6dc46fc19:head [notify-ack in=20b] snapc 0=[] ondisk+read+known_if_redirected e18) v8 -- 0x7f5628065680 con 0x7f5628004490
2021-04-30T07:27:58.635+0000 7f564eaa4700  1 -- 172.21.15.73:0/1676283437 <== osd.1 v2:172.21.15.75:6800/14774 602 ==== osd_op_reply(1055 rbd_header.10e6dc46fc19 [notify-ack] v0'0 uv10 ondisk = 0) v8 ==== 167+0+0 (crc 0 0 0) 0x7f563808f040 con 0x7f5628004490
2021-04-30T07:27:58.635+0000 7f564eaa4700  1 -- 172.21.15.73:0/1676283437 <== osd.1 v2:172.21.15.75:6800/14774 603 ==== osd_op_reply(1056 rbd_header.10e6dc46fc19 [notify-ack] v0'0 uv44 ondisk = 0) v8 ==== 167+0+0 (crc 0 0 0) 0x7f563808f040 con 0x7f5628004490
2021-04-30T07:27:58.635+0000 7f564eaa4700  1 -- 172.21.15.73:0/1676283437 <== osd.1 v2:172.21.15.75:6800/14774 604 ==== watch-notify(notify (1) cookie 140008015411360 notify 77309411338 ret 0) v3 ==== 52+0+0 (crc 0 0 0) 0x7f5638050a30 con 0x7f5628004490
2021-04-30T07:27:58.635+0000 7f564d2a1700  5 librbd::Watcher: 0x7f5628013830 notifications_blocked: blocked=0
2021-04-30T07:27:58.635+0000 7f564d2a1700 10 librbd::Watcher::C_NotifyAck 0x15426d0 C_NotifyAck: id=77309411338, handle=140008015411360
2021-04-30T07:27:58.635+0000 7f564d2a1700 10 librbd::ImageWatcher: 0x7f5628013830 image header updated
2021-04-30T07:27:58.635+0000 7f564d2a1700 20 librbd::ImageState: 0x7f5628021480 handle_update_notification: refresh_seq = 1, last_refresh = 0
2021-04-30T07:27:58.635+0000 7f564d2a1700 20 librbd::ImageState: 0x1530540 ImageUpdateWatchers::notify
2021-04-30T07:27:58.635+0000 7f564d2a1700 20 librbd::ImageState: 0x7f5628021480 flush_update_watchers
2021-04-30T07:27:58.635+0000 7f564d2a1700 20 librbd::ImageState: 0x1530540 ImageUpdateWatchers::flush
2021-04-30T07:27:58.635+0000 7f564d2a1700 20 librbd::ImageState: 0x1530540 ImageUpdateWatchers::flush: completing flush
2021-04-30T07:27:58.635+0000 7f564d2a1700 10 librbd::ImageWatcher: 0x7f5638007d30 C_ResponseMessage: r=0
**2021-04-30T07:27:58.635+0000 7f564d2a1700 10 librbd::Watcher::C_NotifyAck 0x15426d0 finish: r=0**
**2021-04-30T07:27:58.635+0000 7f564d2a1700 10 librbd::Watcher: 0x7f5628013830 acknowledge_notify: acknowledge_notify id=77309411338, handle=140008015411360**
2021-04-30T07:27:58.635+0000 7f564d2a1700  1 -- 172.21.15.73:0/1676283437 --> [v2:172.21.15.75:6800/14774,v1:172.21.15.75:6801/14774] -- osd_op(unknown.0.0:1057 2.7 2:ff185ea2:::rbd_header.10e6dc46fc19:head [notify-ack in=30b] snapc 0=[] ondisk+read+known_if_redirected e18) v8 -- 0x7f5634029ad0 con 0x7f5628004490
2021-04-30T07:27:58.635+0000 7f564eaa4700  1 -- 172.21.15.73:0/1676283437 <== osd.1 v2:172.21.15.75:6800/14774 605 ==== osd_op_reply(1057 rbd_header.10e6dc46fc19 [notify-ack] v0'0 uv44 ondisk = 0) v8 ==== 167+0+0 (crc 0 0 0) 0x7f563808f040 con 0x7f5628004490
2021-04-30T07:27:58.635+0000 7f564eaa4700  1 -- 172.21.15.73:0/1676283437 <== osd.1 v2:172.21.15.75:6800/14774 606 ==== watch-notify(notify (1) cookie 140008015411360 notify 77309411339 ret 0) v3 ==== 80+0+0 (crc 0 0 0) 0x7f5638050bc0 con 0x7f5628004490
2021-04-30T07:27:58.639+0000 7f564caa0700  5 librbd::Watcher: 0x7f5628013830 notifications_blocked: blocked=0
2021-04-30T07:27:58.639+0000 7f564caa0700 10 librbd::Watcher::C_NotifyAck 0x7f5628063350 C_NotifyAck: id=77309411339, handle=140008015411360
2021-04-30T07:27:58.639+0000 7f564caa0700 20 librbd::ImageWatcher: remove_async_request[4267,140008015411360,2]
2021-04-30T07:27:58.639+0000 7f564caa0700 10 librbd::ImageWatcher: 0x7f5628013830 request finished: [4267,140008015411360,2]=0
2021-04-30T07:27:58.639+0000 7f564caa0700 10 librbd::ImageWatcher: 0x7f5628013830 async request: [4267,140008015411360,2] completed
2021-04-30T07:27:58.639+0000 7f564caa0700 10 librbd::Watcher::C_NotifyAck 0x7f5628063350 finish: r=0
2021-04-30T07:27:58.639+0000 7f564caa0700 10 librbd::Watcher: 0x7f5628013830 acknowledge_notify: acknowledge_notify id=77309411339, handle=140008015411360
2021-04-30T07:27:58.639+0000 7f564d2a1700 20 librbd::Operations: handle_remote_request: r=0
2021-04-30T07:27:58.639+0000 7f564d2a1700 20 librbd::ImageState: 0x7f5628021480 handle_update_notification: refresh_seq = 2, last_refresh = 0
2021-04-30T07:27:58.639+0000 7f564d2a1700 20 librbd::ImageState: 0x1530540 ImageUpdateWatchers::notify
2021-04-30T07:27:58.639+0000 7f5657e91740 20 librbd::Operations: flatten finished
```

```
2021-04-30T07:28:18.816+0000 7febe2ffd700 10 librbd::ImageWatcher: 0x7febcc01b980 C_ResponseMessage: r=0
**2021-04-30T07:28:18.816+0000 7febe2ffd700 10 librbd::Watcher::C_NotifyAck 0x7febc40045b0 finish: r=0**
2021-04-30T07:28:18.816+0000 7febe2ffd700  1 -- 172.21.15.75:0/1095735243 --> [v2:172.21.15.75:6800/14774,v1:172.21.15.75:6801/14774] -- osd_op(unknown.0.0:309 2.7 2:ff185ea2:::rbd_header.10e6dc46fc19:head [notify-ack in=30b] snapc 0=[] ondisk+read+known_if_redirected e21) v8 -- 0x7febd40165a0 con 0x12c9540
2021-04-30T07:28:18.816+0000 7febfd7ec700  1 -- 172.21.15.75:0/1095735243 <== osd.1 v2:172.21.15.75:6800/14774 318 ==== osd_op_reply(309 rbd_header.10e6dc46fc19 [notify-ack] v0'0 uv69 ondisk = 0) v8 ==== 167+0+0 (crc 0 0 0) 0x7febf4045040 con 0x12c9540
2021-04-30T07:28:21.096+0000 7febfafe7700  1 -- 172.21.15.75:0/1095735243 --> [v2:172.21.15.75:6800/14774,v1:172.21.15.75:6801/14774] -- osd_op(unknown.0.0:310 2.7 2:ff185ea2:::rbd_header.10e6dc46fc19:head [watch ping cookie 140650850861152] snapc 0=[] ondisk+write+known_if_redirected e21) v8 -- 0x7febbc001040 con 0x12c9540
2021-04-30T07:28:21.096+0000 7febfafe7700  1 -- 172.21.15.75:0/1095735243 --> [v2:172.21.15.75:3300/0,v1:172.21.15.75:6789/0] -- mon_subscribe({osdmap=22}) v3 -- 0x7febbc0041b0 con 0x11d3870
2021-04-30T07:28:21.096+0000 7febfafe7700  1 -- 172.21.15.75:0/1095735243 --> [v2:172.21.15.75:6800/14774,v1:172.21.15.75:6801/14774] -- ping magic: 0 v1 -- 0x7febbc003a70 con 0x12c9540
2021-04-30T07:28:21.096+0000 7febfd7ec700  1 -- 172.21.15.75:0/1095735243 <== osd.1 v2:172.21.15.75:6800/14774 319 ==== osd_op_reply(310 rbd_header.10e6dc46fc19 [watch ping cookie 140650850861152] v0'0 uv69 ondisk = 0) v8 ==== 167+0+0 (crc 0 0 0) 0x7febf4045040 con 0x12c9540
2021-04-30T07:27:58.679+0000 7f564caa0700 10 librbd::Watcher::C_NotifyAck 0x7f56340324f0 finish: r=0
2021-04-30T07:27:58.679+0000 7f564caa0700 10 librbd::Watcher: 0x7f5628013830 acknowledge_notify: acknowledge_notify id=77309411346, handle=140008015411360
2021-04-30T07:27:58.679+0000 7f564caa0700  1 -- 172.21.15.73:0/1676283437 --> [v2:172.21.15.75:6800/14774,v1:172.21.15.75:6801/14774] -- osd_op(unknown.0.0:1075 2.7 2:ff185ea2:::rbd_header.10e6dc46fc19:head [notify-ack in=30b] snapc 0=[] ondisk+read+known_if_redirected e18) v8 -- 0x7f562802b990 con 0x7f5628004490
2021-04-30T07:27:58.679+0000 7f564eaa4700  1 -- 172.21.15.73:0/1676283437 <== osd.1 v2:172.21.15.75:6800/14774 631 ==== osd_op_reply(1075 rbd_header.10e6dc46fc19 [notify-ack] v0'0 uv48 ondisk = 0) v8 ==== 167+0+0 (crc 0 0 0) 0x7f563808f040 con 0x7f5628004490
2021-04-30T07:27:58.679+0000 7f564eaa4700  1 -- 172.21.15.73:0/1676283437 <== osd.1 v2:172.21.15.75:6800/14774 632 ==== watch-notify(notify (1) cookie 140008015411360 notify 77309411347 ret 0) v3 ==== 80+0+0 (crc 0 0 0) 0x7f563803f9b0 con 0x7f5628004490
**2021-04-30T07:27:58.679+0000 7f564d2a1700  5 librbd::Watcher: 0x7f5628013830 notifications_blocked: blocked=0
2021-04-30T07:27:58.679+0000 7f564d2a1700 10 librbd::Watcher::C_NotifyAck 0x1530690 C_NotifyAck: id=77309411347, handle=140008015411360**
2021-04-30T07:27:58.679+0000 7f564d2a1700 20 librbd::ImageWatcher: remove_async_request[4267,140008015411360,3]
2021-04-30T07:27:58.679+0000 7f564d2a1700 10 librbd::ImageWatcher: 0x7f5628013830 request finished: [4267,140008015411360,3]=0
2021-04-30T07:27:58.679+0000 7f564d2a1700 10 librbd::ImageWatcher: 0x7f5628013830 async request: [4267,140008015411360,3] completed
2021-04-30T07:27:58.679+0000 7f564d2a1700 10 librbd::Watcher::C_NotifyAck 0x1530690 finish: r=0
2021-04-30T07:27:58.679+0000 7f564d2a1700 10 librbd::Watcher: 0x7f5628013830 acknowledge_notify: acknowledge_notify id=77309411347, handle=140008015411360
2021-04-30T07:27:58.679+0000 7f564caa0700 20 librbd::Operations: handle_remote_request: r=0
2021-04-30T07:27:58.679+0000 7f564caa0700 20 librbd::ImageState: 0x7f5628021480 handle_update_notification: refresh_seq = 4, last_refresh = 2
2021-04-30T07:27:58.679+0000 7f564caa0700 20 librbd::ImageState: 0x1530540 ImageUpdateWatchers::notify
2021-04-30T07:27:58.679+0000 7f564d2a1700  1 -- 172.21.15.73:0/1676283437 --> [v2:172.21.15.75:6800/14774,v1:172.21.15.75:6801/14774] -- osd_op(unknown.0.0:1076 2.7 2:ff185ea2:::rbd_header.10e6dc46fc19:head [notify-ack in=20b] snapc 0=[] ondisk+read+known_if_redirected e18) v8 -- 0x7f5634029ad0 con 0x7f5628004490
2021-04-30T07:27:58.679+0000 7f5657e91740  2 librbd::Operations: resize finished
```

Calling function:

```
void NotifyLockOwner::finish(int r) {
  CephContext *cct = m_image_ctx.cct;
  ldout(cct, 10) << " r=" << r << dendl;
  m_on_finish->complete(r);  // hand the result to the chained completion
  delete this;               // context deletes itself after completing
}
```
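
The log excerpts show librbd's async request bookkeeping around these notifications: the requester schedules a timeout when it sends the request ("scheduling async request time out: [4267,...]") and cancels it when the completion notification is acknowledged ("request finished ... completed"). If an expected notification is never delivered or acknowledged, the script blocks until the outer 3h `timeout` kills it. A minimal illustration of that pattern, with hypothetical names rather than the actual librbd API:

```
import threading

# Illustrative stand-in for librbd's async request bookkeeping; the
# class and method names are hypothetical, not the real librbd API.
class AsyncRequest:
    def __init__(self, request_id, timeout_sec=30.0):
        self.request_id = request_id
        self.completed = threading.Event()
        # cf. "scheduling async request time out: [4267,...]" above
        self.timer = threading.Timer(timeout_sec, self._on_timeout)

    def send(self, notify_fn):
        self.timer.start()
        notify_fn(self.request_id)  # notify the remote lock owner

    def handle_finished(self, r):
        # cf. "request finished: [4267,...]=0 ... completed" above
        self.timer.cancel()
        self.completed.set()

    def _on_timeout(self):
        # the request went unanswered within the limit; the real code
        # can retry or fail the operation at this point
        print(f"async request {self.request_id} timed out")
        self.completed.set()
```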

#4

Updated by Greg Farnum almost 3 years ago

  • Project changed from Ceph to rbd
#5

Updated by Ilya Dryomov almost 3 years ago

  • Assignee set to Deepika Upadhyay
#6

Updated by Deepika Upadhyay almost 3 years ago

  • Pull request ID set to 41514
#7

Updated by Deepika Upadhyay almost 3 years ago

  • Status changed from New to Pending Backport
#8

Updated by Deepika Upadhyay almost 3 years ago

  • Backport set to nautilus, octopus
#9

Updated by Deepika Upadhyay almost 3 years ago

  • Copied to Backport #50995: nautilus: "test_notify.py" is timing out in upgrade-clients:client-upgrade-nautilus-pacific-pacific added
#10

Updated by Deepika Upadhyay almost 3 years ago

  • Copied to Backport #50996: octopus: "test_notify.py" is timing out in upgrade-clients:client-upgrade-nautilus-pacific-pacific added
#11

Updated by Ilya Dryomov about 2 years ago

pacific (no backport ticket): https://github.com/ceph/ceph/pull/41629

#12

Updated by Ilya Dryomov about 2 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
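
The --resolve-parent rule amounts to: once every backport tracker copied from an issue is itself in a closed status, the parent issue is marked Resolved. A minimal sketch of that check (illustrative only, not the actual backport-create-issue code):

```
# Illustrative sketch of the --resolve-parent rule, not the actual
# backport-create-issue code: resolve the parent once every backport
# copied from it is closed.
CLOSED_STATUSES = {"Resolved", "Rejected"}

def should_resolve_parent(backport_statuses):
    return bool(backport_statuses) and all(
        status in CLOSED_STATUSES for status in backport_statuses
    )

# e.g. the two backports of this issue, #50995 and #50996:
assert should_resolve_parent(["Resolved", "Resolved"])
```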
