Project

General

Profile

Actions

Bug #45721

closed

CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_rados.TestWatchNotify.test

Added by Brad Hubbard almost 4 years ago. Updated 5 months ago.

Status:
Resolved
Priority:
Normal
Category:
-
Target version:
-
% Done:

100%

Source:
Q/A
Tags:
backport_processed
Backport:
pacific,quincy
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/yuriw-2020-05-24_19:30:40-rados-wip-yuri-master_5.24.20-distro-basic-smithi/5088170

2020-05-24T22:42:37.494 INFO:tasks.workunit.client.0.smithi082.stderr:test_rados.TestWatchNotify.test ... FAIL
2020-05-24T22:42:37.495 INFO:tasks.workunit.client.0.smithi082.stderr:Traceback (most recent call last):
2020-05-24T22:42:37.495 INFO:tasks.workunit.client.0.smithi082.stderr:  File "rados.pyx", line 800, in rados.Rados.require_state
2020-05-24T22:42:37.495 INFO:tasks.workunit.client.0.smithi082.stderr:rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state shutdown.)
2020-05-24T22:42:37.495 INFO:tasks.workunit.client.0.smithi082.stderr:Exception ignored in: 'rados.Watch.__dealloc__'
2020-05-24T22:42:37.496 INFO:tasks.workunit.client.0.smithi082.stderr:Traceback (most recent call last):
2020-05-24T22:42:37.496 INFO:tasks.workunit.client.0.smithi082.stderr:  File "rados.pyx", line 800, in rados.Rados.require_state
2020-05-24T22:42:37.496 INFO:tasks.workunit.client.0.smithi082.stderr:rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state shutdown.)
2020-05-24T22:42:37.496 INFO:tasks.workunit.client.0.smithi082.stderr:Traceback (most recent call last):
2020-05-24T22:42:37.496 INFO:tasks.workunit.client.0.smithi082.stderr:  File "rados.pyx", line 800, in rados.Rados.require_state
2020-05-24T22:42:37.497 INFO:tasks.workunit.client.0.smithi082.stderr:rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state shutdown.)
2020-05-24T22:42:37.497 INFO:tasks.workunit.client.0.smithi082.stderr:Exception ignored in: 'rados.Watch.__dealloc__'
2020-05-24T22:42:37.497 INFO:tasks.workunit.client.0.smithi082.stderr:Traceback (most recent call last):
2020-05-24T22:42:37.497 INFO:tasks.workunit.client.0.smithi082.stderr:  File "rados.pyx", line 800, in rados.Rados.require_state
2020-05-24T22:42:37.497 INFO:tasks.workunit.client.0.smithi082.stderr:rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state shutdown.)
2020-05-24T22:42:37.504 INFO:tasks.workunit.client.0.smithi082.stderr:test_rados.test_rados_init_error ... ok
2020-05-24T22:42:37.583 INFO:tasks.workunit.client.0.smithi082.stderr:test_rados.test_rados_init ... ok
2020-05-24T22:42:37.607 INFO:tasks.workunit.client.0.smithi082.stderr:test_rados.test_ioctx_context_manager ... ok
2020-05-24T22:42:37.618 INFO:tasks.workunit.client.0.smithi082.stderr:test_rados.test_parse_argv ... ok
2020-05-24T22:42:37.623 INFO:tasks.workunit.client.0.smithi082.stderr:test_rados.test_parse_argv_empty_str ... ok
2020-05-24T22:42:37.624 INFO:tasks.workunit.client.0.smithi082.stderr:
2020-05-24T22:42:37.624 INFO:tasks.workunit.client.0.smithi082.stderr:======================================================================
2020-05-24T22:42:37.624 INFO:tasks.workunit.client.0.smithi082.stderr:FAIL: test_rados.TestWatchNotify.test
2020-05-24T22:42:37.624 INFO:tasks.workunit.client.0.smithi082.stderr:----------------------------------------------------------------------
2020-05-24T22:42:37.625 INFO:tasks.workunit.client.0.smithi082.stderr:Traceback (most recent call last):
2020-05-24T22:42:37.625 INFO:tasks.workunit.client.0.smithi082.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2020-05-24T22:42:37.625 INFO:tasks.workunit.client.0.smithi082.stderr:    self.test(*self.arg)
2020-05-24T22:42:37.625 INFO:tasks.workunit.client.0.smithi082.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1290, in test
2020-05-24T22:42:37.625 INFO:tasks.workunit.client.0.smithi082.stderr:    eq(self.notify_cnt[watch_id1], 3)
2020-05-24T22:42:37.626 INFO:tasks.workunit.client.0.smithi082.stderr:AssertionError: 4 != 3
2020-05-24T22:42:37.626 INFO:tasks.workunit.client.0.smithi082.stderr:
2020-05-24T22:42:37.626 INFO:tasks.workunit.client.0.smithi082.stderr:----------------------------------------------------------------------
2020-05-24T22:42:37.626 INFO:tasks.workunit.client.0.smithi082.stderr:Ran 81 tests in 293.733s
2020-05-24T22:42:37.626 INFO:tasks.workunit.client.0.smithi082.stderr:
2020-05-24T22:42:37.626 INFO:tasks.workunit.client.0.smithi082.stderr:FAILED (failures=1)
2020-05-24T22:42:37.645 DEBUG:teuthology.orchestra.run:got remote process result: 1
2020-05-24T22:42:37.646 INFO:tasks.workunit:Stopping ['rados/test_python.sh'] on client.0...
2020-05-24T22:42:37.646 INFO:teuthology.orchestra.run.smithi082:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2020-05-24T22:42:37.893 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/run_tasks.py", line 87, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/run_tasks.py", line 66, in run_one_task
    return task(**kwargs)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_wip-yuri-master_5.24.20/qa/tasks/workunit.py", line 127, in task
    coverage_and_limits=not config.get('no_coverage_and_limits', None))
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/parallel.py", line 87, in __exit__
    for result in self:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/parallel.py", line 101, in __next__
    resurrect_traceback(result)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/parallel.py", line 37, in resurrect_traceback
    reraise(*exc_info)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/parallel.py", line 24, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_wip-yuri-master_5.24.20/qa/tasks/workunit.py", line 415, in _run_tests
    label="workunit test {workunit}".format(workunit=workunit)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/orchestra/remote.py", line 206, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/orchestra/run.py", line 473, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/orchestra/run.py", line 162, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/orchestra/run.py", line 184, in _raise_for_status
    node=self.hostname, label=self.label
CommandFailedError: Command failed (workunit test rados/test_python.sh) on smithi082 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CE
PH_REF=9dce54cc5473ddca3d64786e557f4b0c097deed7 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 adj
ust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/rados/test_python.sh'

Related issues 4 (1 open3 closed)

Related to RADOS - Bug #52553: pybind: rados.RadosStateError raised when closed watch object goes out of scope after cluster shutdownResolvedTim Serong

Actions
Related to RADOS - Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados.TestWatchNotify.test_aio_notify to failPending BackportRadoslaw Zarzynski

Actions
Copied to RADOS - Backport #57544: pacific: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_rados.TestWatchNotify.testResolvedNitzan MordechaiActions
Copied to RADOS - Backport #57545: quincy: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_rados.TestWatchNotify.testResolvedNitzan MordechaiActions
Actions #1

Updated by Neha Ojha almost 4 years ago

/a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5117777

Actions #2

Updated by Kefu Chai almost 4 years ago

2020-06-23T23:35:22.984 INFO:tasks.workunit.client.0.smithi107.stderr:======================================================================
2020-06-23T23:35:22.984 INFO:tasks.workunit.client.0.smithi107.stderr:FAIL: test_rados.TestWatchNotify.test
2020-06-23T23:35:22.984 INFO:tasks.workunit.client.0.smithi107.stderr:----------------------------------------------------------------------
2020-06-23T23:35:22.985 INFO:tasks.workunit.client.0.smithi107.stderr:Traceback (most recent call last):
2020-06-23T23:35:22.985 INFO:tasks.workunit.client.0.smithi107.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2020-06-23T23:35:22.985 INFO:tasks.workunit.client.0.smithi107.stderr:    self.test(*self.arg)
2020-06-23T23:35:22.985 INFO:tasks.workunit.client.0.smithi107.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1281, in test
2020-06-23T23:35:22.985 INFO:tasks.workunit.client.0.smithi107.stderr:    eq(self.notify_cnt[watch_id1], 2)
2020-06-23T23:35:22.986 INFO:tasks.workunit.client.0.smithi107.stderr:AssertionError: 3 != 2

/a/kchai-2020-06-23_15:15:59-rados-wip-kefu-testing-2020-06-23-1353-distro-basic-smithi/5173648/

Actions #3

Updated by Kefu Chai almost 4 years ago

2020-06-27T08:17:29.361 INFO:tasks.workunit.client.0.smithi203.stderr:FAIL: test_rados.TestWatchNotify.test
2020-06-27T08:17:29.361 INFO:tasks.workunit.client.0.smithi203.stderr:----------------------------------------------------------------------
2020-06-27T08:17:29.362 INFO:tasks.workunit.client.0.smithi203.stderr:Traceback (most recent call last):
2020-06-27T08:17:29.362 INFO:tasks.workunit.client.0.smithi203.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2020-06-27T08:17:29.362 INFO:tasks.workunit.client.0.smithi203.stderr:    self.test(*self.arg)
2020-06-27T08:17:29.363 INFO:tasks.workunit.client.0.smithi203.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1281, in test
2020-06-27T08:17:29.363 INFO:tasks.workunit.client.0.smithi203.stderr:    eq(self.notify_cnt[watch_id1], 2)
2020-06-27T08:17:29.363 INFO:tasks.workunit.client.0.smithi203.stderr:AssertionError: 3 != 2

/a//kchai-2020-06-27_07:37:00-rados-wip-kefu-testing-2020-06-27-1407-distro-basic-smithi/5183650

Actions #4

Updated by Neha Ojha over 3 years ago

rados/basic/{ceph clusters/{fixed-2 openstack} msgr-failures/many msgr/async-v2only objectstore/bluestore-comp-zstd rados supported-random-distro$/{centos_8} tasks/rados_python}

/a/nojha-2020-07-31_22:50:07-rados-wip-35628-2020-07-28-distro-basic-smithi/5273950

Actions #5

Updated by Kefu Chai over 3 years ago

2020-12-16T21:11:22.044 INFO:tasks.workunit.client.0.smithi167.stderr:FAIL: test_rados.TestWatchNotify.test
2020-12-16T21:11:22.045 INFO:tasks.workunit.client.0.smithi167.stderr:----------------------------------------------------------------------
2020-12-16T21:11:22.045 INFO:tasks.workunit.client.0.smithi167.stderr:Traceback (most recent call last):
2020-12-16T21:11:22.045 INFO:tasks.workunit.client.0.smithi167.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2020-12-16T21:11:22.045 INFO:tasks.workunit.client.0.smithi167.stderr:    self.test(*self.arg)
2020-12-16T21:11:22.045 INFO:tasks.workunit.client.0.smithi167.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1348, in test
2020-12-16T21:11:22.046 INFO:tasks.workunit.client.0.smithi167.stderr:    eq(self.notify_cnt[watch_id1], 1)
2020-12-16T21:11:22.046 INFO:tasks.workunit.client.0.smithi167.stderr:AssertionError: 2 != 1

/a/kchai-2020-12-16_15:58:13-rados-wip-kefu-testing-2020-12-16-1740-distro-basic-smithi/5713555

Actions #6

Updated by Neha Ojha about 3 years ago

2021-02-02T02:15:24.737 INFO:tasks.workunit.client.0.smithi036.stderr:======================================================================
2021-02-02T02:15:24.737 INFO:tasks.workunit.client.0.smithi036.stderr:FAIL: test_rados.TestWatchNotify.test
2021-02-02T02:15:24.737 INFO:tasks.workunit.client.0.smithi036.stderr:----------------------------------------------------------------------
2021-02-02T02:15:24.738 INFO:tasks.workunit.client.0.smithi036.stderr:Traceback (most recent call last):
2021-02-02T02:15:24.738 INFO:tasks.workunit.client.0.smithi036.stderr:  File "/usr/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
2021-02-02T02:15:24.738 INFO:tasks.workunit.client.0.smithi036.stderr:    self.test(*self.arg)
2021-02-02T02:15:24.738 INFO:tasks.workunit.client.0.smithi036.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1406, in test
2021-02-02T02:15:24.738 INFO:tasks.workunit.client.0.smithi036.stderr:    eq(self.notify_cnt[watch_id1], 3)
2021-02-02T02:15:24.739 INFO:tasks.workunit.client.0.smithi036.stderr:AssertionError: 5 != 3
2021-02-02T02:15:24.739 INFO:tasks.workunit.client.0.smithi036.stderr:
2021-02-02T02:15:24.739 INFO:tasks.workunit.client.0.smithi036.stderr:----------------------------------------------------------------------
2021-02-02T02:15:24.739 INFO:tasks.workunit.client.0.smithi036.stderr:Ran 88 tests in 313.846s
2021-02-02T02:15:24.739 INFO:tasks.workunit.client.0.smithi036.stderr:
2021-02-02T02:15:24.740 INFO:tasks.workunit.client.0.smithi036.stderr:FAILED (failures=1)

/a/nojha-2021-02-01_21:31:14-rados-wip-39145-distro-basic-smithi/5846891

Actions #7

Updated by Deepika Upadhyay about 3 years ago

2021-03-11T17:24:39.745 INFO:tasks.workunit.client.0.smithi123.stderr:======================================================================
2021-03-11T17:24:39.746 INFO:tasks.workunit.client.0.smithi123.stderr:FAIL: test_rados.TestWatchNotify.test
2021-03-11T17:24:39.746 INFO:tasks.workunit.client.0.smithi123.stderr:----------------------------------------------------------------------
2021-03-11T17:24:39.746 INFO:tasks.workunit.client.0.smithi123.stderr:Traceback (most recent call last):
2021-03-11T17:24:39.746 INFO:tasks.workunit.client.0.smithi123.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2021-03-11T17:24:39.747 INFO:tasks.workunit.client.0.smithi123.stderr:    self.test(*self.arg)
2021-03-11T17:24:39.747 INFO:tasks.workunit.client.0.smithi123.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1276, in test
2021-03-11T17:24:39.747 INFO:tasks.workunit.client.0.smithi123.stderr:    eq(self.notify_cnt[watch_id1], 2)
2021-03-11T17:24:39.747 INFO:tasks.workunit.client.0.smithi123.stderr:AssertionError: 4 != 2
2021-03-11T17:24:39.747 INFO:tasks.workunit.client.0.smithi123.stderr:
2021-03-11T17:24:39.748 INFO:tasks.workunit.client.0.smithi123.stderr:----------------------------------------------------------------------
2021-03-11T17:24:39.748 INFO:tasks.workunit.client.0.smithi123.stderr:Ran 80 tests in 289.328s

/ceph/teuthology-archive/yuriw-2021-03-09_20:27:38-rados-wip-yuri2-testing-2021-03-09-1006-octopus-distro-basic-smithi/5950464/teuthology.log
/ceph/teuthology-archive/yuriw-2021-03-09_20:28:34-rados-wip-yuri4-testing-2021-03-09-1006-octopus-distro-basic-smithi/5951087/teuthology.log

Actions #8

Updated by Sage Weil about 3 years ago

/a/sage-2021-04-02_14:05:32-rados-wip-sage2-testing-2021-04-01-1417-pacific-distro-basic-smithi/6017345

Actions #9

Updated by Sage Weil about 3 years ago

  • Priority changed from Normal to High
Actions #10

Updated by Kefu Chai about 3 years ago

2021-04-05T12:33:34.093 INFO:tasks.workunit.client.0.gibba010.stderr:======================================================================
2021-04-05T12:33:34.094 INFO:tasks.workunit.client.0.gibba010.stderr:FAIL: test_rados.TestWatchNotify.test
2021-04-05T12:33:34.094 INFO:tasks.workunit.client.0.gibba010.stderr:----------------------------------------------------------------------
2021-04-05T12:33:34.094 INFO:tasks.workunit.client.0.gibba010.stderr:Traceback (most recent call last):
2021-04-05T12:33:34.094 INFO:tasks.workunit.client.0.gibba010.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2021-04-05T12:33:34.094 INFO:tasks.workunit.client.0.gibba010.stderr:    self.test(*self.arg)
2021-04-05T12:33:34.095 INFO:tasks.workunit.client.0.gibba010.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1397, in test
2021-04-05T12:33:34.095 INFO:tasks.workunit.client.0.gibba010.stderr:    eq(self.notify_cnt[watch_id1], 2)
2021-04-05T12:33:34.095 INFO:tasks.workunit.client.0.gibba010.stderr:AssertionError: 3 != 2
2021-04-05T12:33:34.095 INFO:tasks.workunit.client.0.gibba010.stderr:
2021-04-05T12:33:34.096 INFO:tasks.workunit.client.0.gibba010.stderr:----------------------------------------------------------------------

/a/kchai-2021-04-05_10:46:36-rados-wip-kefu-testing-2021-04-05-1650-distro-basic-gibba/6021723/

Actions #11

Updated by Greg Farnum almost 3 years ago

https://pulpito.ceph.com/gregf-2021-04-13_09:22:15-rados-wip-stretch-mon-state-412-distro-basic-smithi/6043691

======================================================================
FAIL: test_rados.TestWatchNotify.test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1406, in test
    eq(self.notify_cnt[watch_id1], 3)
AssertionError: 4 != 3
Actions #12

Updated by Neha Ojha almost 3 years ago

  • Backport set to pacific

/a/yuriw-2021-04-30_12:58:14-rados-wip-yuri2-testing-2021-04-29-1501-pacific-distro-basic-smithi/6086154

Actions #13

Updated by Deepika Upadhyay almost 3 years ago

  • Backport changed from pacific to octopus, pacific
2021-05-14T00:02:12.634 INFO:tasks.workunit.client.0.smithi028.stderr:Exception ignored in: 'rados.Watch.__dealloc__'
2021-05-14T00:02:12.634 INFO:tasks.workunit.client.0.smithi028.stderr:Traceback (most recent call last):
2021-05-14T00:02:12.635 INFO:tasks.workunit.client.0.smithi028.stderr:  File "rados.pyx", line 799, in rados.Rados.require_state
2021-05-14T00:02:12.635 INFO:tasks.workunit.client.0.smithi028.stderr:rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state s

2021-05-14T00:02:12.729 INFO:tasks.workunit.client.0.smithi028.stderr:======================================================================
2021-05-14T00:02:12.729 INFO:tasks.workunit.client.0.smithi028.stderr:FAIL: test_rados.TestWatchNotify.test
2021-05-14T00:02:12.729 INFO:tasks.workunit.client.0.smithi028.stderr:----------------------------------------------------------------------
2021-05-14T00:02:12.729 INFO:tasks.workunit.client.0.smithi028.stderr:Traceback (most recent call last):
2021-05-14T00:02:12.730 INFO:tasks.workunit.client.0.smithi028.stderr:  File "/usr/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
2021-05-14T00:02:12.730 INFO:tasks.workunit.client.0.smithi028.stderr:    self.test(*self.arg)
2021-05-14T00:02:12.730 INFO:tasks.workunit.client.0.smithi028.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1276, in test
2021-05-14T00:02:12.730 INFO:tasks.workunit.client.0.smithi028.stderr:    eq(self.notify_cnt[watch_id1], 2)
2021-05-14T00:02:12.731 INFO:tasks.workunit.client.0.smithi028.stderr:AssertionError: 3 != 2

/ceph/teuthology-archive/yuriw-2021-05-13_23:05:56-rados-wip-yuri4-testing-2021-05-13-1301-octopus-distro-basic-smithi/6114277/teuthology.log

Actions #14

Updated by Neha Ojha almost 3 years ago

/a/yuriw-2021-05-25_19:21:19-rados-wip-yuri2-testing-2021-05-25-0940-pacific-distro-basic-smithi/6134490

Actions #15

Updated by Neha Ojha almost 3 years ago

/a/yuriw-2021-06-28_17:32:48-rados-wip-yuri2-testing-2021-06-28-0858-pacific-distro-basic-smithi/6239575

Actions #17

Updated by Neha Ojha almost 3 years ago

/a/yuriw-2021-07-02_17:35:44-rados-pacific-distro-basic-smithi/6249971

Actions #19

Updated by Brad Hubbard over 2 years ago

I've modified this test to only run the TestWatchNotify subtests (2) and to generate debug logging. I'll report back what, if anything, I find out.

Actions #20

Updated by Neha Ojha over 2 years ago

/a/yuriw-2021-08-06_16:31:19-rados-wip-yuri-master-8.6.21-distro-basic-smithi/6324539

Actions #21

Updated by Neha Ojha over 2 years ago

/a/yuriw-2021-08-23_19:24:05-rados-wip-yuri4-testing-2021-08-23-0812-pacific-distro-basic-smithi/6353883

Actions #22

Updated by Deepika Upadhyay over 2 years ago

2021-09-17T07:03:13.050 INFO:tasks.workunit.client.0.smithi026.stderr:FAIL: test_rados.TestWatchNotify.test
2021-09-17T07:03:13.051 INFO:tasks.workunit.client.0.smithi026.stderr:----------------------------------------------------------------------
2021-09-17T07:03:13.051 INFO:tasks.workunit.client.0.smithi026.stderr:Traceback (most recent call last):
2021-09-17T07:03:13.051 INFO:tasks.workunit.client.0.smithi026.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2021-09-17T07:03:13.051 INFO:tasks.workunit.client.0.smithi026.stderr:    self.test(*self.arg)
2021-09-17T07:03:13.052 INFO:tasks.workunit.client.0.smithi026.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1285, in test
2021-09-17T07:03:13.052 INFO:tasks.workunit.client.0.smithi026.stderr:    eq(self.notify_cnt[watch_id1], 3)
2021-09-17T07:03:13.052 INFO:tasks.workunit.client.0.smithi026.stderr:AssertionError: 4 != 3

/ceph/teuthology-archive/yuriw-2021-09-16_21:30:19-rados-wip-yuri5-testing-2021-09-16-1056-octopus-distro-basic-smithi/6394002/teuthology.log

Actions #23

Updated by Deepika Upadhyay over 2 years ago

  • Related to Bug #52553: pybind: rados.RadosStateError raised when closed watch object goes out of scope after cluster shutdown added
Actions #24

Updated by Laura Flores over 2 years ago

  • Related to Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados.TestWatchNotify.test_aio_notify to fail added
Actions #25

Updated by Sridhar Seshasayee over 2 years ago

/a/yuriw-2022-01-06_15:57:04-rados-wip-yuri6-testing-2022-01-05-1255-distro-default-smithi/6599449

2022-01-07T02:11:42.882 INFO:tasks.workunit.client.0.smithi067.stderr:======================================================================
2022-01-07T02:11:42.883 INFO:tasks.workunit.client.0.smithi067.stderr:FAIL: test_rados.TestWatchNotify.test
2022-01-07T02:11:42.883 INFO:tasks.workunit.client.0.smithi067.stderr:----------------------------------------------------------------------
2022-01-07T02:11:42.883 INFO:tasks.workunit.client.0.smithi067.stderr:Traceback (most recent call last):
2022-01-07T02:11:42.883 INFO:tasks.workunit.client.0.smithi067.stderr:  File "/usr/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
2022-01-07T02:11:42.884 INFO:tasks.workunit.client.0.smithi067.stderr:    self.test(*self.arg)
2022-01-07T02:11:42.884 INFO:tasks.workunit.client.0.smithi067.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1451, in test
2022-01-07T02:11:42.884 INFO:tasks.workunit.client.0.smithi067.stderr:    eq(self.notify_cnt[watch_id1], 3)
2022-01-07T02:11:42.884 INFO:tasks.workunit.client.0.smithi067.stderr:AssertionError: 5 != 3
Actions #26

Updated by Laura Flores about 2 years ago

/a/yuriw-2022-01-14_23:22:09-rados-wip-yuri6-testing-2022-01-14-1207-distro-default-smithi/6617813

Actions #27

Updated by Neha Ojha about 2 years ago

  • Assignee set to Laura Flores
Actions #28

Updated by Laura Flores about 2 years ago

/a/yuriw-2022-02-21_15:40:41-rados-wip-yuri4-testing-2022-02-18-0800-distro-default-smithi/6698327

Actions #29

Updated by Laura Flores about 2 years ago

/a/yuriw-2022-03-01_17:45:51-rados-wip-yuri3-testing-2022-02-28-0757-pacific-distro-default-smithi/6714724

Actions #30

Updated by Radoslaw Zarzynski about 2 years ago

  • Priority changed from High to Normal
Actions #31

Updated by Kamoltat (Junior) Sirivadhna about 2 years ago

/a/yuriw-2022-03-04_21:56:41-rados-wip-yuri4-testing-2022-03-03-1448-distro-default-smithi/6721329

Actions #32

Updated by Laura Flores almost 2 years ago

/a/yuriw-2022-04-27_02:52:22-rados-pacific-distro-default-smithi/6807766

2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:======================================================================
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:FAIL: test_rados.TestWatchNotify.test
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:----------------------------------------------------------------------
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:Traceback (most recent call last):
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:    self.test(*self.arg)
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1397, in test
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:    eq(self.notify_cnt[watch_id1], 2)
2022-04-27T03:49:25.748 INFO:tasks.workunit.client.0.smithi045.stderr:AssertionError: 3 != 2
2022-04-27T03:49:25.748 INFO:tasks.workunit.client.0.smithi045.stderr:
2022-04-27T03:49:25.748 INFO:tasks.workunit.client.0.smithi045.stderr:----------------------------------------------------------------------
Actions #33

Updated by Nitzan Mordechai almost 2 years ago

/home/teuthworker/archive/yuriw-2022-04-29_15:44:49-rados-wip-yuri5-testing-2022-04-28-1007-distro-default-smithi/6813940

Actions #35

Updated by Laura Flores almost 2 years ago

I reproduced the symptoms of this bug locally by incrementing the notify count before an eq check. The extra increment causes the "shutdown" rados state and the Assertion failure. This indicates that the notify_cnt is getting incremented at some point when it shouldn't.

diff in the code (for testing)

 assert(self.ioctx.notify(self.OID, 'best'))
 with self.lock:
+    self.notify_cnt[watch_id1] += 1 # for debugging
     eq(self.notify_cnt[watch_id1], 2)
     eq(self.notify_cnt[watch_id2], 2)
     eq(self.notify_data[watch_id1], b'best')
     eq(self.notify_data[watch_id2], b'best')

../qa/workunits/rados/test_python.sh

test_rados.TestWatchNotify.test ... FAIL
Traceback (most recent call last):
  File "rados.pyx", line 482, in rados.Rados.require_state
rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state shutdown.)
Exception ignored in: 'rados.Watch.__dealloc__'
Traceback (most recent call last):
  File "rados.pyx", line 482, in rados.Rados.require_state
rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state shutdown.)
test_rados.TestWatchNotify.test_aio_notify ... ok
test_rados.test_rados_init_error ... ok
test_rados.test_rados_init ... ok
test_rados.test_ioctx_context_manager ... ok
test_rados.test_parse_argv ... ok
test_rados.test_parse_argv_empty_str ... ok

======================================================================
FAIL: test_rados.TestWatchNotify.test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/lflores/ceph/src/test/pybind/test_rados.py", line 1443, in test
    eq(self.notify_cnt[watch_id1], 2)
AssertionError: 3 != 2

A previous patch (tracked by #52553) tried fixing this symptom by modifying dealloc, but the problem persists. So, a fix might involve decrementing notify_cnt whenever a false "notify" is counted.

Actions #37

Updated by Radoslaw Zarzynski almost 2 years ago

Let me paste a Laura's comment from https://github.com/ceph/ceph/pull/45825:

@NitzanMordhai perhaps similar logic can be applied for this Tracker: https://tracker.ceph.com/issues/45721. Let me know if you have any thoughts!

Actions #38

Updated by Laura Flores almost 2 years ago

/a/yuriw-2022-05-26_23:23:48-rados-wip-yuri2-testing-2022-05-26-1430-quincy-distro-default-smithi/6849426$

2022-05-26T23:55:09.585 INFO:tasks.workunit.client.0.smithi105.stderr:======================================================================
2022-05-26T23:55:09.585 INFO:tasks.workunit.client.0.smithi105.stderr:FAIL: test_rados.TestWatchNotify.test
2022-05-26T23:55:09.585 INFO:tasks.workunit.client.0.smithi105.stderr:----------------------------------------------------------------------
2022-05-26T23:55:09.585 INFO:tasks.workunit.client.0.smithi105.stderr:Traceback (most recent call last):
2022-05-26T23:55:09.585 INFO:tasks.workunit.client.0.smithi105.stderr:  File "/usr/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
2022-05-26T23:55:09.586 INFO:tasks.workunit.client.0.smithi105.stderr:    self.test(*self.arg)
2022-05-26T23:55:09.586 INFO:tasks.workunit.client.0.smithi105.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1451, in test
2022-05-26T23:55:09.586 INFO:tasks.workunit.client.0.smithi105.stderr:    eq(self.notify_cnt[watch_id1], 3)
2022-05-26T23:55:09.586 INFO:tasks.workunit.client.0.smithi105.stderr:AssertionError: 4 != 3
2022-05-26T23:55:09.586 INFO:tasks.workunit.client.0.smithi105.stderr:
2022-05-26T23:55:09.586 INFO:tasks.workunit.client.0.smithi105.stderr:----------------------------------------------------------------------
Actions #39

Updated by Nitzan Mordechai almost 2 years ago

That could work, but when we have socket failure injection, the error callback will not be calling in the python API (unlike librados)
I think we need to check the API calls itself (and your suggestion for the test fix)

Actions #40

Updated by Laura Flores almost 2 years ago

Nitzan Mordechai wrote:

That could work, but when we have socket failure injection, the error callback will not be calling in the python API (unlike librados)
I think we need to check the API calls itself (and your suggestion for the test fix)

Thanks for the insight. Still happening occasionally in teuthology runs, so I will try out some potential fixes.

/a/yuriw-2022-06-10_03:10:47-rados-wip-yuri4-testing-2022-06-09-1510-quincy-distro-default-smithi/6871849

Actions #41

Updated by Sridhar Seshasayee over 1 year ago

Seeing this in a Quincy run:
/a/yuriw-2022-08-08_22:19:32-rados-wip-yuri4-testing-2022-08-08-1009-quincy-distro-default-smithi/6961889

Actions #42

Updated by Radoslaw Zarzynski over 1 year ago

  • Assignee changed from Laura Flores to Nitzan Mordechai
Actions #43

Updated by Nitzan Mordechai over 1 year ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 47668
Actions #44

Updated by Radoslaw Zarzynski over 1 year ago

  • Backport changed from octopus, pacific to pacific,quincy
Actions #45

Updated by Laura Flores over 1 year ago

  • Translation missing: en.field_tag_list set to test-failure
Actions #46

Updated by Neha Ojha over 1 year ago

  • Status changed from Fix Under Review to Pending Backport
Actions #47

Updated by Backport Bot over 1 year ago

  • Copied to Backport #57544: pacific: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_rados.TestWatchNotify.test added
Actions #48

Updated by Backport Bot over 1 year ago

  • Copied to Backport #57545: quincy: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_rados.TestWatchNotify.test added
Actions #49

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed
Actions #50

Updated by Konstantin Shalygin 5 months ago

  • Status changed from Pending Backport to Resolved
  • % Done changed from 0 to 100
Actions

Also available in: Atom PDF