Project

General

Profile

Bug #45721

CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_rados.TestWatchNotify.test

Added by Brad Hubbard over 2 years ago. Updated 3 months ago.

Status:
Pending Backport
Priority:
Normal
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
backport_processed
Backport:
pacific,quincy
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/yuriw-2020-05-24_19:30:40-rados-wip-yuri-master_5.24.20-distro-basic-smithi/5088170

2020-05-24T22:42:37.494 INFO:tasks.workunit.client.0.smithi082.stderr:test_rados.TestWatchNotify.test ... FAIL
2020-05-24T22:42:37.495 INFO:tasks.workunit.client.0.smithi082.stderr:Traceback (most recent call last):
2020-05-24T22:42:37.495 INFO:tasks.workunit.client.0.smithi082.stderr:  File "rados.pyx", line 800, in rados.Rados.require_state
2020-05-24T22:42:37.495 INFO:tasks.workunit.client.0.smithi082.stderr:rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state shutdown.)
2020-05-24T22:42:37.495 INFO:tasks.workunit.client.0.smithi082.stderr:Exception ignored in: 'rados.Watch.__dealloc__'
2020-05-24T22:42:37.496 INFO:tasks.workunit.client.0.smithi082.stderr:Traceback (most recent call last):
2020-05-24T22:42:37.496 INFO:tasks.workunit.client.0.smithi082.stderr:  File "rados.pyx", line 800, in rados.Rados.require_state
2020-05-24T22:42:37.496 INFO:tasks.workunit.client.0.smithi082.stderr:rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state shutdown.)
2020-05-24T22:42:37.496 INFO:tasks.workunit.client.0.smithi082.stderr:Traceback (most recent call last):
2020-05-24T22:42:37.496 INFO:tasks.workunit.client.0.smithi082.stderr:  File "rados.pyx", line 800, in rados.Rados.require_state
2020-05-24T22:42:37.497 INFO:tasks.workunit.client.0.smithi082.stderr:rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state shutdown.)
2020-05-24T22:42:37.497 INFO:tasks.workunit.client.0.smithi082.stderr:Exception ignored in: 'rados.Watch.__dealloc__'
2020-05-24T22:42:37.497 INFO:tasks.workunit.client.0.smithi082.stderr:Traceback (most recent call last):
2020-05-24T22:42:37.497 INFO:tasks.workunit.client.0.smithi082.stderr:  File "rados.pyx", line 800, in rados.Rados.require_state
2020-05-24T22:42:37.497 INFO:tasks.workunit.client.0.smithi082.stderr:rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state shutdown.)
2020-05-24T22:42:37.504 INFO:tasks.workunit.client.0.smithi082.stderr:test_rados.test_rados_init_error ... ok
2020-05-24T22:42:37.583 INFO:tasks.workunit.client.0.smithi082.stderr:test_rados.test_rados_init ... ok
2020-05-24T22:42:37.607 INFO:tasks.workunit.client.0.smithi082.stderr:test_rados.test_ioctx_context_manager ... ok
2020-05-24T22:42:37.618 INFO:tasks.workunit.client.0.smithi082.stderr:test_rados.test_parse_argv ... ok
2020-05-24T22:42:37.623 INFO:tasks.workunit.client.0.smithi082.stderr:test_rados.test_parse_argv_empty_str ... ok
2020-05-24T22:42:37.624 INFO:tasks.workunit.client.0.smithi082.stderr:
2020-05-24T22:42:37.624 INFO:tasks.workunit.client.0.smithi082.stderr:======================================================================
2020-05-24T22:42:37.624 INFO:tasks.workunit.client.0.smithi082.stderr:FAIL: test_rados.TestWatchNotify.test
2020-05-24T22:42:37.624 INFO:tasks.workunit.client.0.smithi082.stderr:----------------------------------------------------------------------
2020-05-24T22:42:37.625 INFO:tasks.workunit.client.0.smithi082.stderr:Traceback (most recent call last):
2020-05-24T22:42:37.625 INFO:tasks.workunit.client.0.smithi082.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2020-05-24T22:42:37.625 INFO:tasks.workunit.client.0.smithi082.stderr:    self.test(*self.arg)
2020-05-24T22:42:37.625 INFO:tasks.workunit.client.0.smithi082.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1290, in test
2020-05-24T22:42:37.625 INFO:tasks.workunit.client.0.smithi082.stderr:    eq(self.notify_cnt[watch_id1], 3)
2020-05-24T22:42:37.626 INFO:tasks.workunit.client.0.smithi082.stderr:AssertionError: 4 != 3
2020-05-24T22:42:37.626 INFO:tasks.workunit.client.0.smithi082.stderr:
2020-05-24T22:42:37.626 INFO:tasks.workunit.client.0.smithi082.stderr:----------------------------------------------------------------------
2020-05-24T22:42:37.626 INFO:tasks.workunit.client.0.smithi082.stderr:Ran 81 tests in 293.733s
2020-05-24T22:42:37.626 INFO:tasks.workunit.client.0.smithi082.stderr:
2020-05-24T22:42:37.626 INFO:tasks.workunit.client.0.smithi082.stderr:FAILED (failures=1)
2020-05-24T22:42:37.645 DEBUG:teuthology.orchestra.run:got remote process result: 1
2020-05-24T22:42:37.646 INFO:tasks.workunit:Stopping ['rados/test_python.sh'] on client.0...
2020-05-24T22:42:37.646 INFO:teuthology.orchestra.run.smithi082:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2020-05-24T22:42:37.893 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/run_tasks.py", line 87, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/run_tasks.py", line 66, in run_one_task
    return task(**kwargs)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_wip-yuri-master_5.24.20/qa/tasks/workunit.py", line 127, in task
    coverage_and_limits=not config.get('no_coverage_and_limits', None))
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/parallel.py", line 87, in __exit__
    for result in self:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/parallel.py", line 101, in __next__
    resurrect_traceback(result)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/parallel.py", line 37, in resurrect_traceback
    reraise(*exc_info)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/parallel.py", line 24, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_wip-yuri-master_5.24.20/qa/tasks/workunit.py", line 415, in _run_tests
    label="workunit test {workunit}".format(workunit=workunit)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/orchestra/remote.py", line 206, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/orchestra/run.py", line 473, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/orchestra/run.py", line 162, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/orchestra/run.py", line 184, in _raise_for_status
    node=self.hostname, label=self.label
CommandFailedError: Command failed (workunit test rados/test_python.sh) on smithi082 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CE
PH_REF=9dce54cc5473ddca3d64786e557f4b0c097deed7 TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 adj
ust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/rados/test_python.sh'

Related issues

Related to RADOS - Bug #52553: pybind: rados.RadosStateError raised when closed watch object goes out of scope after cluster shutdown Pending Backport
Related to RADOS - Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados.TestWatchNotify.test_aio_notify to fail In Progress
Copied to RADOS - Backport #57544: pacific: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_rados.TestWatchNotify.test In Progress
Copied to RADOS - Backport #57545: quincy: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_rados.TestWatchNotify.test In Progress

History

#1 Updated by Neha Ojha over 2 years ago

/a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5117777

#2 Updated by Kefu Chai over 2 years ago

2020-06-23T23:35:22.984 INFO:tasks.workunit.client.0.smithi107.stderr:======================================================================
2020-06-23T23:35:22.984 INFO:tasks.workunit.client.0.smithi107.stderr:FAIL: test_rados.TestWatchNotify.test
2020-06-23T23:35:22.984 INFO:tasks.workunit.client.0.smithi107.stderr:----------------------------------------------------------------------
2020-06-23T23:35:22.985 INFO:tasks.workunit.client.0.smithi107.stderr:Traceback (most recent call last):
2020-06-23T23:35:22.985 INFO:tasks.workunit.client.0.smithi107.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2020-06-23T23:35:22.985 INFO:tasks.workunit.client.0.smithi107.stderr:    self.test(*self.arg)
2020-06-23T23:35:22.985 INFO:tasks.workunit.client.0.smithi107.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1281, in test
2020-06-23T23:35:22.985 INFO:tasks.workunit.client.0.smithi107.stderr:    eq(self.notify_cnt[watch_id1], 2)
2020-06-23T23:35:22.986 INFO:tasks.workunit.client.0.smithi107.stderr:AssertionError: 3 != 2

/a/kchai-2020-06-23_15:15:59-rados-wip-kefu-testing-2020-06-23-1353-distro-basic-smithi/5173648/

#3 Updated by Kefu Chai over 2 years ago

2020-06-27T08:17:29.361 INFO:tasks.workunit.client.0.smithi203.stderr:FAIL: test_rados.TestWatchNotify.test
2020-06-27T08:17:29.361 INFO:tasks.workunit.client.0.smithi203.stderr:----------------------------------------------------------------------
2020-06-27T08:17:29.362 INFO:tasks.workunit.client.0.smithi203.stderr:Traceback (most recent call last):
2020-06-27T08:17:29.362 INFO:tasks.workunit.client.0.smithi203.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2020-06-27T08:17:29.362 INFO:tasks.workunit.client.0.smithi203.stderr:    self.test(*self.arg)
2020-06-27T08:17:29.363 INFO:tasks.workunit.client.0.smithi203.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1281, in test
2020-06-27T08:17:29.363 INFO:tasks.workunit.client.0.smithi203.stderr:    eq(self.notify_cnt[watch_id1], 2)
2020-06-27T08:17:29.363 INFO:tasks.workunit.client.0.smithi203.stderr:AssertionError: 3 != 2

/a//kchai-2020-06-27_07:37:00-rados-wip-kefu-testing-2020-06-27-1407-distro-basic-smithi/5183650

#4 Updated by Neha Ojha over 2 years ago

rados/basic/{ceph clusters/{fixed-2 openstack} msgr-failures/many msgr/async-v2only objectstore/bluestore-comp-zstd rados supported-random-distro$/{centos_8} tasks/rados_python}

/a/nojha-2020-07-31_22:50:07-rados-wip-35628-2020-07-28-distro-basic-smithi/5273950

#5 Updated by Kefu Chai almost 2 years ago

2020-12-16T21:11:22.044 INFO:tasks.workunit.client.0.smithi167.stderr:FAIL: test_rados.TestWatchNotify.test
2020-12-16T21:11:22.045 INFO:tasks.workunit.client.0.smithi167.stderr:----------------------------------------------------------------------
2020-12-16T21:11:22.045 INFO:tasks.workunit.client.0.smithi167.stderr:Traceback (most recent call last):
2020-12-16T21:11:22.045 INFO:tasks.workunit.client.0.smithi167.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2020-12-16T21:11:22.045 INFO:tasks.workunit.client.0.smithi167.stderr:    self.test(*self.arg)
2020-12-16T21:11:22.045 INFO:tasks.workunit.client.0.smithi167.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1348, in test
2020-12-16T21:11:22.046 INFO:tasks.workunit.client.0.smithi167.stderr:    eq(self.notify_cnt[watch_id1], 1)
2020-12-16T21:11:22.046 INFO:tasks.workunit.client.0.smithi167.stderr:AssertionError: 2 != 1

/a/kchai-2020-12-16_15:58:13-rados-wip-kefu-testing-2020-12-16-1740-distro-basic-smithi/5713555

#6 Updated by Neha Ojha almost 2 years ago

2021-02-02T02:15:24.737 INFO:tasks.workunit.client.0.smithi036.stderr:======================================================================
2021-02-02T02:15:24.737 INFO:tasks.workunit.client.0.smithi036.stderr:FAIL: test_rados.TestWatchNotify.test
2021-02-02T02:15:24.737 INFO:tasks.workunit.client.0.smithi036.stderr:----------------------------------------------------------------------
2021-02-02T02:15:24.738 INFO:tasks.workunit.client.0.smithi036.stderr:Traceback (most recent call last):
2021-02-02T02:15:24.738 INFO:tasks.workunit.client.0.smithi036.stderr:  File "/usr/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
2021-02-02T02:15:24.738 INFO:tasks.workunit.client.0.smithi036.stderr:    self.test(*self.arg)
2021-02-02T02:15:24.738 INFO:tasks.workunit.client.0.smithi036.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1406, in test
2021-02-02T02:15:24.738 INFO:tasks.workunit.client.0.smithi036.stderr:    eq(self.notify_cnt[watch_id1], 3)
2021-02-02T02:15:24.739 INFO:tasks.workunit.client.0.smithi036.stderr:AssertionError: 5 != 3
2021-02-02T02:15:24.739 INFO:tasks.workunit.client.0.smithi036.stderr:
2021-02-02T02:15:24.739 INFO:tasks.workunit.client.0.smithi036.stderr:----------------------------------------------------------------------
2021-02-02T02:15:24.739 INFO:tasks.workunit.client.0.smithi036.stderr:Ran 88 tests in 313.846s
2021-02-02T02:15:24.739 INFO:tasks.workunit.client.0.smithi036.stderr:
2021-02-02T02:15:24.740 INFO:tasks.workunit.client.0.smithi036.stderr:FAILED (failures=1)

/a/nojha-2021-02-01_21:31:14-rados-wip-39145-distro-basic-smithi/5846891

#7 Updated by Deepika Upadhyay over 1 year ago

2021-03-11T17:24:39.745 INFO:tasks.workunit.client.0.smithi123.stderr:======================================================================
2021-03-11T17:24:39.746 INFO:tasks.workunit.client.0.smithi123.stderr:FAIL: test_rados.TestWatchNotify.test
2021-03-11T17:24:39.746 INFO:tasks.workunit.client.0.smithi123.stderr:----------------------------------------------------------------------
2021-03-11T17:24:39.746 INFO:tasks.workunit.client.0.smithi123.stderr:Traceback (most recent call last):
2021-03-11T17:24:39.746 INFO:tasks.workunit.client.0.smithi123.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2021-03-11T17:24:39.747 INFO:tasks.workunit.client.0.smithi123.stderr:    self.test(*self.arg)
2021-03-11T17:24:39.747 INFO:tasks.workunit.client.0.smithi123.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1276, in test
2021-03-11T17:24:39.747 INFO:tasks.workunit.client.0.smithi123.stderr:    eq(self.notify_cnt[watch_id1], 2)
2021-03-11T17:24:39.747 INFO:tasks.workunit.client.0.smithi123.stderr:AssertionError: 4 != 2
2021-03-11T17:24:39.747 INFO:tasks.workunit.client.0.smithi123.stderr:
2021-03-11T17:24:39.748 INFO:tasks.workunit.client.0.smithi123.stderr:----------------------------------------------------------------------
2021-03-11T17:24:39.748 INFO:tasks.workunit.client.0.smithi123.stderr:Ran 80 tests in 289.328s

/ceph/teuthology-archive/yuriw-2021-03-09_20:27:38-rados-wip-yuri2-testing-2021-03-09-1006-octopus-distro-basic-smithi/5950464/teuthology.log
/ceph/teuthology-archive/yuriw-2021-03-09_20:28:34-rados-wip-yuri4-testing-2021-03-09-1006-octopus-distro-basic-smithi/5951087/teuthology.log

#8 Updated by Sage Weil over 1 year ago

/a/sage-2021-04-02_14:05:32-rados-wip-sage2-testing-2021-04-01-1417-pacific-distro-basic-smithi/6017345

#9 Updated by Sage Weil over 1 year ago

  • Priority changed from Normal to High

#10 Updated by Kefu Chai over 1 year ago

2021-04-05T12:33:34.093 INFO:tasks.workunit.client.0.gibba010.stderr:======================================================================
2021-04-05T12:33:34.094 INFO:tasks.workunit.client.0.gibba010.stderr:FAIL: test_rados.TestWatchNotify.test
2021-04-05T12:33:34.094 INFO:tasks.workunit.client.0.gibba010.stderr:----------------------------------------------------------------------
2021-04-05T12:33:34.094 INFO:tasks.workunit.client.0.gibba010.stderr:Traceback (most recent call last):
2021-04-05T12:33:34.094 INFO:tasks.workunit.client.0.gibba010.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2021-04-05T12:33:34.094 INFO:tasks.workunit.client.0.gibba010.stderr:    self.test(*self.arg)
2021-04-05T12:33:34.095 INFO:tasks.workunit.client.0.gibba010.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1397, in test
2021-04-05T12:33:34.095 INFO:tasks.workunit.client.0.gibba010.stderr:    eq(self.notify_cnt[watch_id1], 2)
2021-04-05T12:33:34.095 INFO:tasks.workunit.client.0.gibba010.stderr:AssertionError: 3 != 2
2021-04-05T12:33:34.095 INFO:tasks.workunit.client.0.gibba010.stderr:
2021-04-05T12:33:34.096 INFO:tasks.workunit.client.0.gibba010.stderr:----------------------------------------------------------------------

/a/kchai-2021-04-05_10:46:36-rados-wip-kefu-testing-2021-04-05-1650-distro-basic-gibba/6021723/

#11 Updated by Greg Farnum over 1 year ago

https://pulpito.ceph.com/gregf-2021-04-13_09:22:15-rados-wip-stretch-mon-state-412-distro-basic-smithi/6043691

======================================================================
FAIL: test_rados.TestWatchNotify.test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1406, in test
    eq(self.notify_cnt[watch_id1], 3)
AssertionError: 4 != 3

#12 Updated by Neha Ojha over 1 year ago

  • Backport set to pacific

/a/yuriw-2021-04-30_12:58:14-rados-wip-yuri2-testing-2021-04-29-1501-pacific-distro-basic-smithi/6086154

#13 Updated by Deepika Upadhyay over 1 year ago

  • Backport changed from pacific to octopus, pacific
2021-05-14T00:02:12.634 INFO:tasks.workunit.client.0.smithi028.stderr:Exception ignored in: 'rados.Watch.__dealloc__'
2021-05-14T00:02:12.634 INFO:tasks.workunit.client.0.smithi028.stderr:Traceback (most recent call last):
2021-05-14T00:02:12.635 INFO:tasks.workunit.client.0.smithi028.stderr:  File "rados.pyx", line 799, in rados.Rados.require_state
2021-05-14T00:02:12.635 INFO:tasks.workunit.client.0.smithi028.stderr:rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state s

2021-05-14T00:02:12.729 INFO:tasks.workunit.client.0.smithi028.stderr:======================================================================
2021-05-14T00:02:12.729 INFO:tasks.workunit.client.0.smithi028.stderr:FAIL: test_rados.TestWatchNotify.test
2021-05-14T00:02:12.729 INFO:tasks.workunit.client.0.smithi028.stderr:----------------------------------------------------------------------
2021-05-14T00:02:12.729 INFO:tasks.workunit.client.0.smithi028.stderr:Traceback (most recent call last):
2021-05-14T00:02:12.730 INFO:tasks.workunit.client.0.smithi028.stderr:  File "/usr/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
2021-05-14T00:02:12.730 INFO:tasks.workunit.client.0.smithi028.stderr:    self.test(*self.arg)
2021-05-14T00:02:12.730 INFO:tasks.workunit.client.0.smithi028.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1276, in test
2021-05-14T00:02:12.730 INFO:tasks.workunit.client.0.smithi028.stderr:    eq(self.notify_cnt[watch_id1], 2)
2021-05-14T00:02:12.731 INFO:tasks.workunit.client.0.smithi028.stderr:AssertionError: 3 != 2

/ceph/teuthology-archive/yuriw-2021-05-13_23:05:56-rados-wip-yuri4-testing-2021-05-13-1301-octopus-distro-basic-smithi/6114277/teuthology.log

#14 Updated by Neha Ojha over 1 year ago

/a/yuriw-2021-05-25_19:21:19-rados-wip-yuri2-testing-2021-05-25-0940-pacific-distro-basic-smithi/6134490

#15 Updated by Neha Ojha over 1 year ago

/a/yuriw-2021-06-28_17:32:48-rados-wip-yuri2-testing-2021-06-28-0858-pacific-distro-basic-smithi/6239575

#17 Updated by Neha Ojha over 1 year ago

/a/yuriw-2021-07-02_17:35:44-rados-pacific-distro-basic-smithi/6249971

#19 Updated by Brad Hubbard over 1 year ago

I've modified this test to only run the TestWatchNotify subtests (2) and to generate debug logging. I'll report back what, if anything, I find out.

#20 Updated by Neha Ojha over 1 year ago

/a/yuriw-2021-08-06_16:31:19-rados-wip-yuri-master-8.6.21-distro-basic-smithi/6324539

#21 Updated by Neha Ojha over 1 year ago

/a/yuriw-2021-08-23_19:24:05-rados-wip-yuri4-testing-2021-08-23-0812-pacific-distro-basic-smithi/6353883

#22 Updated by Deepika Upadhyay about 1 year ago

2021-09-17T07:03:13.050 INFO:tasks.workunit.client.0.smithi026.stderr:FAIL: test_rados.TestWatchNotify.test
2021-09-17T07:03:13.051 INFO:tasks.workunit.client.0.smithi026.stderr:----------------------------------------------------------------------
2021-09-17T07:03:13.051 INFO:tasks.workunit.client.0.smithi026.stderr:Traceback (most recent call last):
2021-09-17T07:03:13.051 INFO:tasks.workunit.client.0.smithi026.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2021-09-17T07:03:13.051 INFO:tasks.workunit.client.0.smithi026.stderr:    self.test(*self.arg)
2021-09-17T07:03:13.052 INFO:tasks.workunit.client.0.smithi026.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1285, in test
2021-09-17T07:03:13.052 INFO:tasks.workunit.client.0.smithi026.stderr:    eq(self.notify_cnt[watch_id1], 3)
2021-09-17T07:03:13.052 INFO:tasks.workunit.client.0.smithi026.stderr:AssertionError: 4 != 3

/ceph/teuthology-archive/yuriw-2021-09-16_21:30:19-rados-wip-yuri5-testing-2021-09-16-1056-octopus-distro-basic-smithi/6394002/teuthology.log

#23 Updated by Deepika Upadhyay about 1 year ago

  • Related to Bug #52553: pybind: rados.RadosStateError raised when closed watch object goes out of scope after cluster shutdown added

#24 Updated by Laura Flores 11 months ago

  • Related to Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados.TestWatchNotify.test_aio_notify to fail added

#25 Updated by Sridhar Seshasayee 11 months ago

/a/yuriw-2022-01-06_15:57:04-rados-wip-yuri6-testing-2022-01-05-1255-distro-default-smithi/6599449

2022-01-07T02:11:42.882 INFO:tasks.workunit.client.0.smithi067.stderr:======================================================================
2022-01-07T02:11:42.883 INFO:tasks.workunit.client.0.smithi067.stderr:FAIL: test_rados.TestWatchNotify.test
2022-01-07T02:11:42.883 INFO:tasks.workunit.client.0.smithi067.stderr:----------------------------------------------------------------------
2022-01-07T02:11:42.883 INFO:tasks.workunit.client.0.smithi067.stderr:Traceback (most recent call last):
2022-01-07T02:11:42.883 INFO:tasks.workunit.client.0.smithi067.stderr:  File "/usr/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
2022-01-07T02:11:42.884 INFO:tasks.workunit.client.0.smithi067.stderr:    self.test(*self.arg)
2022-01-07T02:11:42.884 INFO:tasks.workunit.client.0.smithi067.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1451, in test
2022-01-07T02:11:42.884 INFO:tasks.workunit.client.0.smithi067.stderr:    eq(self.notify_cnt[watch_id1], 3)
2022-01-07T02:11:42.884 INFO:tasks.workunit.client.0.smithi067.stderr:AssertionError: 5 != 3

#26 Updated by Laura Flores 11 months ago

/a/yuriw-2022-01-14_23:22:09-rados-wip-yuri6-testing-2022-01-14-1207-distro-default-smithi/6617813

#27 Updated by Neha Ojha 10 months ago

  • Assignee set to Laura Flores

#28 Updated by Laura Flores 9 months ago

/a/yuriw-2022-02-21_15:40:41-rados-wip-yuri4-testing-2022-02-18-0800-distro-default-smithi/6698327

#29 Updated by Laura Flores 9 months ago

/a/yuriw-2022-03-01_17:45:51-rados-wip-yuri3-testing-2022-02-28-0757-pacific-distro-default-smithi/6714724

#30 Updated by Radoslaw Zarzynski 9 months ago

  • Priority changed from High to Normal

#31 Updated by Kamoltat (Junior) Sirivadhna 9 months ago

/a/yuriw-2022-03-04_21:56:41-rados-wip-yuri4-testing-2022-03-03-1448-distro-default-smithi/6721329

#32 Updated by Laura Flores 7 months ago

/a/yuriw-2022-04-27_02:52:22-rados-pacific-distro-default-smithi/6807766

2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:======================================================================
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:FAIL: test_rados.TestWatchNotify.test
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:----------------------------------------------------------------------
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:Traceback (most recent call last):
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:  File "/usr/lib/python3/dist-packages/nose/case.py", line 197, in runTest
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:    self.test(*self.arg)
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1397, in test
2022-04-27T03:49:25.747 INFO:tasks.workunit.client.0.smithi045.stderr:    eq(self.notify_cnt[watch_id1], 2)
2022-04-27T03:49:25.748 INFO:tasks.workunit.client.0.smithi045.stderr:AssertionError: 3 != 2
2022-04-27T03:49:25.748 INFO:tasks.workunit.client.0.smithi045.stderr:
2022-04-27T03:49:25.748 INFO:tasks.workunit.client.0.smithi045.stderr:----------------------------------------------------------------------

#33 Updated by Nitzan Mordechai 7 months ago

/home/teuthworker/archive/yuriw-2022-04-29_15:44:49-rados-wip-yuri5-testing-2022-04-28-1007-distro-default-smithi/6813940

#35 Updated by Laura Flores 7 months ago

I reproduced the symptoms of this bug locally by incrementing the notify count before an eq check. The extra increment causes the "shutdown" rados state and the Assertion failure. This indicates that the notify_cnt is getting incremented at some point when it shouldn't.

diff in the code (for testing)

 assert(self.ioctx.notify(self.OID, 'best'))
 with self.lock:
+    self.notify_cnt[watch_id1] += 1 # for debugging
     eq(self.notify_cnt[watch_id1], 2)
     eq(self.notify_cnt[watch_id2], 2)
     eq(self.notify_data[watch_id1], b'best')
     eq(self.notify_data[watch_id2], b'best')

../qa/workunits/rados/test_python.sh

test_rados.TestWatchNotify.test ... FAIL
Traceback (most recent call last):
  File "rados.pyx", line 482, in rados.Rados.require_state
rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state shutdown.)
Exception ignored in: 'rados.Watch.__dealloc__'
Traceback (most recent call last):
  File "rados.pyx", line 482, in rados.Rados.require_state
rados.RadosStateError: RADOS rados state (You cannot perform that operation on a Rados object in state shutdown.)
test_rados.TestWatchNotify.test_aio_notify ... ok
test_rados.test_rados_init_error ... ok
test_rados.test_rados_init ... ok
test_rados.test_ioctx_context_manager ... ok
test_rados.test_parse_argv ... ok
test_rados.test_parse_argv_empty_str ... ok

======================================================================
FAIL: test_rados.TestWatchNotify.test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/lflores/ceph/src/test/pybind/test_rados.py", line 1443, in test
    eq(self.notify_cnt[watch_id1], 2)
AssertionError: 3 != 2

A previous patch (tracked by #52553) tried fixing this symptom by modifying dealloc, but the problem persists. So, a fix might involve decrementing notify_cnt whenever a false "notify" is counted.

#37 Updated by Radoslaw Zarzynski 6 months ago

Let me paste a Laura's comment from https://github.com/ceph/ceph/pull/45825:

@NitzanMordhai perhaps similar logic can be applied for this Tracker: https://tracker.ceph.com/issues/45721. Let me know if you have any thoughts!

#38 Updated by Laura Flores 6 months ago

/a/yuriw-2022-05-26_23:23:48-rados-wip-yuri2-testing-2022-05-26-1430-quincy-distro-default-smithi/6849426$

2022-05-26T23:55:09.585 INFO:tasks.workunit.client.0.smithi105.stderr:======================================================================
2022-05-26T23:55:09.585 INFO:tasks.workunit.client.0.smithi105.stderr:FAIL: test_rados.TestWatchNotify.test
2022-05-26T23:55:09.585 INFO:tasks.workunit.client.0.smithi105.stderr:----------------------------------------------------------------------
2022-05-26T23:55:09.585 INFO:tasks.workunit.client.0.smithi105.stderr:Traceback (most recent call last):
2022-05-26T23:55:09.585 INFO:tasks.workunit.client.0.smithi105.stderr:  File "/usr/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
2022-05-26T23:55:09.586 INFO:tasks.workunit.client.0.smithi105.stderr:    self.test(*self.arg)
2022-05-26T23:55:09.586 INFO:tasks.workunit.client.0.smithi105.stderr:  File "/home/ubuntu/cephtest/clone.client.0/src/test/pybind/test_rados.py", line 1451, in test
2022-05-26T23:55:09.586 INFO:tasks.workunit.client.0.smithi105.stderr:    eq(self.notify_cnt[watch_id1], 3)
2022-05-26T23:55:09.586 INFO:tasks.workunit.client.0.smithi105.stderr:AssertionError: 4 != 3
2022-05-26T23:55:09.586 INFO:tasks.workunit.client.0.smithi105.stderr:
2022-05-26T23:55:09.586 INFO:tasks.workunit.client.0.smithi105.stderr:----------------------------------------------------------------------

#39 Updated by Nitzan Mordechai 6 months ago

That could work, but when we have socket failure injection, the error callback will not be calling in the python API (unlike librados)
I think we need to check the API calls itself (and your suggestion for the test fix)

#40 Updated by Laura Flores 6 months ago

Nitzan Mordechai wrote:

That could work, but when we have socket failure injection, the error callback will not be calling in the python API (unlike librados)
I think we need to check the API calls itself (and your suggestion for the test fix)

Thanks for the insight. Still happening occasionally in teuthology runs, so I will try out some potential fixes.

/a/yuriw-2022-06-10_03:10:47-rados-wip-yuri4-testing-2022-06-09-1510-quincy-distro-default-smithi/6871849

#41 Updated by Sridhar Seshasayee 4 months ago

Seeing this in a Quincy run:
/a/yuriw-2022-08-08_22:19:32-rados-wip-yuri4-testing-2022-08-08-1009-quincy-distro-default-smithi/6961889

#42 Updated by Radoslaw Zarzynski 4 months ago

  • Assignee changed from Laura Flores to Nitzan Mordechai

#43 Updated by Nitzan Mordechai 4 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 47668

#44 Updated by Radoslaw Zarzynski 3 months ago

  • Backport changed from octopus, pacific to pacific,quincy

#45 Updated by Laura Flores 3 months ago

  • Tags set to test-failure

#46 Updated by Neha Ojha 3 months ago

  • Status changed from Fix Under Review to Pending Backport

#47 Updated by Backport Bot 3 months ago

  • Copied to Backport #57544: pacific: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_rados.TestWatchNotify.test added

#48 Updated by Backport Bot 3 months ago

  • Copied to Backport #57545: quincy: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_rados.TestWatchNotify.test added

#49 Updated by Backport Bot 3 months ago

  • Tags set to backport_processed

Also available in: Atom PDF