Bug #53767 (closed)

qa/workunits/cls/test_cls_2pc_queue.sh: killing an osd during thrashing causes timeout

Added by Laura Flores over 2 years ago. Updated 2 days ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport: reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Description: rados/verify/{centos_latest ceph clusters/{fixed-2 openstack} d-thrash/default/{default thrashosds-health} mon_election/connectivity msgr-failures/few msgr/async objectstore/bluestore-comp-zstd rados tasks/rados_cls_all validater/valgrind}

Failure reason:

Command failed (workunit test cls/test_cls_2pc_queue.sh) on smithi023 with status 124: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=923a78b748f3bb78722c7300318f17cf5730a2ce TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 CEPH_MNT=/home/ubuntu/cephtest/mnt.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/cls/test_cls_2pc_queue.sh' 
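
Exit status 124 is what GNU coreutils' timeout returns when the wrapped command is still running at the deadline, so the workunit itself hit the 3h limit rather than failing on its own. A minimal illustration of that behavior (assuming GNU coreutils timeout is on PATH; not part of the test):

    import subprocess

    # GNU coreutils `timeout` kills the child at the deadline and exits 124,
    # the same status the 3h-wrapped workunit reports above.
    proc = subprocess.run(["timeout", "2", "sleep", "10"])
    print(proc.returncode)  # 124: the 10s sleep was cut off after 2 seconds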

/a/yuriw-2021-12-23_16:50:03-rados-wip-yuri6-testing-2021-12-22-1410-distro-default-smithi/6582533

2021-12-23T20:26:35.417 DEBUG:teuthology.orchestra.run.smithi023:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok dump_ops_in_flight
2021-12-23T20:26:35.418 INFO:teuthology.orchestra.run.smithi023.stderr:nodeep-scrub is unset
2021-12-23T20:26:35.419 INFO:tasks.ceph.osd.6.smithi125.stderr:2021-12-23T20:26:35.419+0000 ee6a700 -1 received  signal: Hangup from /usr/bin/python3 /bin/daemon-helper term env OPENSSL_ia32cap=~0x1000000000000000 valgrind --trace-children=no --child-silent-after-fork=yes --soname-synonyms=somalloc=*tcmalloc* --num-callers=50 --suppressions=/home/ubuntu/cephtest/valgrind.supp --xml=yes --xml-file=/var/log/ceph/valgrind/osd.6.log --time-stamp=yes --vgdb=yes --exit-on-first-error=yes --error-exitcode=42 --tool=memcheck ceph-osd -f --cluster ceph -i 6  (PID: 35347) UID: 0
2021-12-23T20:26:35.518 INFO:tasks.ceph.osd.6.smithi125.stderr:2021-12-23T20:26:35.520+0000 ee6a700 -1 received  signal: Hangup from /usr/bin/python3 /bin/daemon-helper term env OPENSSL_ia32cap=~0x1000000000000000 valgrind --trace-children=no --child-silent-after-fork=yes --soname-synonyms=somalloc=*tcmalloc* --num-callers=50 --suppressions=/home/ubuntu/cephtest/valgrind.supp --xml=yes --xml-file=/var/log/ceph/valgrind/osd.6.log --time-stamp=yes --vgdb=yes --exit-on-first-error=yes --error-exitcode=42 --tool=memcheck ceph-osd -f --cluster ceph -i 6  (PID: 35347) UID: 0
2021-12-23T20:26:35.590 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_ceph_ceph-c_923a78b748f3bb78722c7300318f17cf5730a2ce/qa/tasks/ceph_manager.py", line 189, in wrapper
    return func(self)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_923a78b748f3bb78722c7300318f17cf5730a2ce/qa/tasks/ceph_manager.py", line 1412, in _do_thrash
    self.choose_action()()
  File "/home/teuthworker/src/github.com_ceph_ceph-c_923a78b748f3bb78722c7300318f17cf5730a2ce/qa/tasks/ceph_manager.py", line 347, in kill_osd
    self.ceph_manager.kill_osd(osd)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_923a78b748f3bb78722c7300318f17cf5730a2ce/qa/tasks/ceph_manager.py", line 2977, in kill_osd
    self.ctx.daemons.get_daemon('osd', osd, self.cluster).stop()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_95a7d4799b562f3bbb5ec66107094963abd62fa1/teuthology/orchestra/daemon/state.py", line 139, in stop
    run.wait([self.proc], timeout=timeout)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_95a7d4799b562f3bbb5ec66107094963abd62fa1/teuthology/orchestra/run.py", line 473, in wait
    check_time()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_95a7d4799b562f3bbb5ec66107094963abd62fa1/teuthology/contextutil.py", line 133, in __call__
    raise MaxWhileTries(error_msg)
teuthology.exceptions.MaxWhileTries: reached maximum tries (50) after waiting for 300 seconds
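
The MaxWhileTries at the bottom is teuthology's bounded-wait loop giving up while daemon stop() was still waiting for the valgrind-wrapped OSD to exit; under valgrind an OSD can take much longer than usual to shut down, which is presumably why the thrasher's kill_osd() tripped the limit. A rough sketch of that retry pattern (hypothetical names; 50 tries of 6 s chosen only to match the "50 tries / 300 seconds" in the message, not taken from the teuthology source):

    import time

    class MaxWhileTries(Exception):
        """Raised when the bounded wait gives up (same name as teuthology's exception)."""

    def wait_until(condition, tries=50, sleep=6):
        # Poll `condition()` up to `tries` times, sleeping between attempts;
        # 50 * 6 s ~= 300 s, matching the error message in the traceback.
        for _ in range(tries):
            if condition():
                return
            time.sleep(sleep)
        raise MaxWhileTries(
            "reached maximum tries (%d) after waiting for %d seconds"
            % (tries, tries * sleep))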


Related issues: 1 (1 open, 0 closed)

Is duplicate of RADOS - Bug #55809: "Leak_IndirectlyLost" valgrind report on mon.c (New)
