Bug #64752

open

cephfs-mirror: valgrind reports leaks

Added by Venky Shankar about 2 months ago. Updated about 2 months ago.

Status: New
Priority: High
Assignee:
Category: Performance/Resource Usage
Target version:
% Done: 0%
Source:
Tags:
Backport: quincy,reef,squid
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): cephfs-mirror
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/yuriw-2024-03-01_20:51:20-fs-squid-distro-default-smithi/7578146

Description: fs/valgrind/{begin/{0-install 1-ceph 2-logrotate} centos_latest debug mirror/{cephfs-mirror/one-per-cluster clients/mirror cluster/1-node mount/fuse overrides/ignorelist_health tasks/mirror}}
Actions #1

Updated by Jos Collin about 2 months ago

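The test test_peer_commands_with_mirroring_disabled passes, but a CommandFailedError is raised afterwards while the cephfs-mirror task is unwinding. The valgrind-wrapped cephfs-mirror process exits with status 42, which matches the --error-exitcode=42 passed to valgrind: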

2024-03-02T01:36:00.159 INFO:teuthology.orchestra.run:Running command with timeout 300
2024-03-02T01:36:00.159 DEBUG:teuthology.orchestra.run.smithi104:> ip netns list
2024-03-02T01:36:00.214 INFO:teuthology.orchestra.run.smithi104.stdout:ceph-ns--home-ubuntu-cephtest-mnt.1 (id: 1)
2024-03-02T01:36:00.214 INFO:teuthology.orchestra.run.smithi104.stdout:ceph-ns--home-ubuntu-cephtest-mnt.0 (id: 0)
2024-03-02T01:36:00.214 INFO:teuthology.orchestra.run:Running command with timeout 300
2024-03-02T01:36:00.214 DEBUG:teuthology.orchestra.run.smithi104:> sudo ip netns delete ceph-ns--home-ubuntu-cephtest-mnt.1
2024-03-02T01:36:00.282 INFO:teuthology.orchestra.run:Running command with timeout 300
2024-03-02T01:36:00.282 DEBUG:teuthology.orchestra.run.smithi104:> sudo ip netns delete ceph-ns--home-ubuntu-cephtest-mnt.0
2024-03-02T01:36:00.350 INFO:teuthology.orchestra.run:Running command with timeout 300
2024-03-02T01:36:00.350 DEBUG:teuthology.orchestra.run.smithi104:> sudo ip link delete ceph-brx
2024-03-02T01:36:00.446 DEBUG:teuthology.run_tasks:Unwinding manager cephfs-mirror
2024-03-02T01:36:00.456 DEBUG:tasks.cephfs_mirror.client.mirror:waiting for process to exit
2024-03-02T01:36:00.456 INFO:teuthology.orchestra.run:waiting for 300
2024-03-02T01:36:00.557 DEBUG:teuthology.orchestra.run:got remote process result: 42
2024-03-02T01:36:01.531 DEBUG:teuthology.orchestra.run.smithi104:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2024-03-02T01:36:31.591 DEBUG:teuthology.orchestra.run.smithi104:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2024-03-02T01:37:00.558 DEBUG:teuthology.orchestra.run:timed out waiting; will kill: <Greenlet at 0x7fc4ed7f4f40: copy_file_to(<paramiko.ChannelFile from <paramiko.Channel 183 (, <Logger tasks.cephfs_mirror.client.mirror.smithi10, None, False)>
2024-03-02T01:37:01.727 DEBUG:teuthology.orchestra.run.smithi104:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2024-03-02T01:37:31.813 DEBUG:teuthology.orchestra.run.smithi104:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2024-03-02T01:38:00.559 DEBUG:teuthology.orchestra.run:timed out waiting; will kill: <Greenlet at 0x7fc4ed80e680: copy_file_to(<paramiko.ChannelFile from <paramiko.Channel 183 (, <Logger tasks.cephfs_mirror.client.mirror.smithi10, None, False)>
2024-03-02T01:38:00.560 ERROR:teuthology.orchestra.daemon.state:Error while waiting for process to exit
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_teuthology_b1dac5519c57c269ea98a22fb7729016a1d0e4d2/teuthology/orchestra/daemon/state.py", line 139, in stop
    run.wait([self.proc], timeout=timeout)
  File "/home/teuthworker/src/git.ceph.com_teuthology_b1dac5519c57c269ea98a22fb7729016a1d0e4d2/teuthology/orchestra/run.py", line 479, in wait
    proc.wait()
  File "/home/teuthworker/src/git.ceph.com_teuthology_b1dac5519c57c269ea98a22fb7729016a1d0e4d2/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_teuthology_b1dac5519c57c269ea98a22fb7729016a1d0e4d2/teuthology/orchestra/run.py", line 181, in _raise_for_status
    raise CommandFailedError(
teuthology.exceptions.CommandFailedError: Command failed on smithi104 with status 42: "cd /home/ubuntu/cephtest && adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper term env 'OPENSSL_ia32cap=~0x1000000000000000' valgrind --trace-children=no --child-silent-after-fork=yes '--soname-synonyms=somalloc=*tcmalloc*' --num-callers=50 --suppressions=/home/ubuntu/cephtest/valgrind.supp --xml=yes --xml-file=/var/log/ceph/valgrind/cephfs-mirror-client.mirror.log --time-stamp=yes --vgdb=yes --exit-on-first-error=yes --error-exitcode=42 --tool=memcheck --leak-check=full --show-reachable=yes cephfs-mirror --cluster ceph --id mirror"
2024-03-02T01:38:00.561 INFO:tasks.cephfs_mirror.client.mirror:Stopped
2024-03-02T01:38:00.561 DEBUG:teuthology.run_tasks:Unwinding manager ceph
2024-03-02T01:38:00.573 INFO:tasks.ceph.ceph_manager.ceph:waiting for clean
2024-03-02T01:38:00.574 DEBUG:teuthology.orchestra.run.smithi104:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph pg dump --format=json
2024-03-02T01:38:00.870 INFO:teuthology.orchestra.run.smithi104.stdout:
2024-03-02T01:38:00.870 INFO:teuthology.orchestra.run.smithi104.stderr:dumped all

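About 26 seconds later in the same log, while the ceph task is cleaning up, the valgrind error itself is reported: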

2024-03-02T01:38:27.163 DEBUG:teuthology.orchestra.run.smithi104:> sudo chmod 0666 /tmp/tmp.BDoV69tNRK
2024-03-02T01:38:27.261 DEBUG:teuthology.orchestra.remote:smithi104:/tmp/tmp.BDoV69tNRK is 4MB
2024-03-02T01:38:27.395 DEBUG:teuthology.orchestra.run.smithi104:> rm -fr /tmp/tmp.BDoV69tNRK
2024-03-02T01:38:27.421 INFO:tasks.ceph:Cleaning ceph cluster...
2024-03-02T01:38:27.421 DEBUG:teuthology.orchestra.run.smithi104:> sudo rm -rf -- /etc/ceph/ceph.conf /etc/ceph/ceph.keyring /home/ubuntu/cephtest/ceph.data /home/ubuntu/cephtest/ceph.monmap /home/ubuntu/cephtest/../*.pid
2024-03-02T01:38:27.421 DEBUG:tasks.ceph:valgrind exception message: valgrind error: Leak_StillReachable
malloc
malloc
strdup

which caused the job to fail at:

raise valgrind_exception
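
For anyone triaging this, the error kind and the leaf frames (Leak_StillReachable, malloc, malloc, strdup) can be pulled out of the valgrind XML log directly instead of the teuthology log. The following is only a rough sketch: summarize_errors and SAMPLE_XML are made-up names, the sample document is hand-written in the general shape of valgrind's --xml=yes output and is not taken from this run, and this is not the code teuthology uses. It assumes the log is complete, well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample shaped like valgrind --xml=yes output.
# It is NOT the log from this run; the real file would be
# /var/log/ceph/valgrind/cephfs-mirror-client.mirror.log on the test node.
SAMPLE_XML = """<valgrindoutput>
  <error>
    <unique>0x1</unique>
    <kind>Leak_StillReachable</kind>
    <stack>
      <frame><ip>0x1</ip><fn>malloc</fn></frame>
      <frame><ip>0x2</ip><fn>strdup</fn></frame>
    </stack>
  </error>
</valgrindoutput>
"""


def summarize_errors(xml_text):
    """Return a list of (kind, [function names]) for each <error> element."""
    root = ET.fromstring(xml_text)
    summary = []
    for error in root.iter("error"):
        kind = error.findtext("kind", default="unknown")
        # Only the first <stack> is read; frames without <fn> show up as "?".
        stack = error.find("stack")
        frames = [] if stack is None else stack.findall("frame")
        funcs = [frame.findtext("fn", default="?") for frame in frames]
        summary.append((kind, funcs))
    return summary


if __name__ == "__main__":
    for kind, funcs in summarize_errors(SAMPLE_XML):
        print(kind, " <- ".join(funcs))
```

The full stack from the real log (the run uses --num-callers=50) should show which cephfs-mirror or library call path ends in the strdup/malloc allocation, which is what is needed to decide between fixing the leak and adding an entry to valgrind.supp.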
