Bug #55620
Ceph Pacific fails the fs/multifs test
Status:
Pending Backport
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
% Done:
0%
Source:
Tags:
backport_processed
Backport:
quincy, pacific
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
fs
Component(FS):
MDS, MDSMonitor
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
During execution of the integration tests (IBM Z, BE), the fs/multifs suite produces a set of errors related to segfaults.
teuthology.log:
2021-12-24T16:29:37.961 INFO:teuthology.orchestra.run.m1306035.stderr:2021-12-24T16:29:37.952+0100 3ff7f7fe900  1 -- 172.18.232.35:0/1713868722 <== mon.2 v2:172.18.232.30:3301/0 2 ==== config(0 keys) v1 ==== 4+0+0 (secure 0 0 0) 0x3ff8c003d90 con 0x3ff7806b290
2021-12-24T16:29:37.962 INFO:teuthology.orchestra.run.m1306035.stderr:2021-12-24T16:29:37.952+0100 3ff7f7fe900  1 -- 172.18.232.35:0/1713868722 <== mon.2 v2:172.18.232.30:3301/0 3 ==== mgrmap(e 4) v1 ==== 70370+0+0 (secure 0 0 0) 0x3ff8c0220c0 con 0x3ff7806b290
2021-12-24T16:29:38.298 DEBUG:teuthology.orchestra.run:got remote process result: 124
2021-12-24T16:29:38.299 INFO:tasks.cephfs_test_runner:test_mount_all_caps_absent (tasks.cephfs.test_multifs_auth.TestClientsWithoutAuth) ... ERROR
2021-12-24T16:29:38.300 INFO:tasks.cephfs_test_runner:
2021-12-24T16:29:38.300 INFO:tasks.cephfs_test_runner:======================================================================
2021-12-24T16:29:38.300 INFO:tasks.cephfs_test_runner:ERROR: test_mount_all_caps_absent (tasks.cephfs.test_multifs_auth.TestClientsWithoutAuth)
2021-12-24T16:29:38.300 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2021-12-24T16:29:38.300 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2021-12-24T16:29:38.300 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/cephfs/cephfs_test_case.py", line 212, in tearDown
2021-12-24T16:29:38.300 INFO:tasks.cephfs_test_runner:    self.mds_cluster.delete_all_filesystems()
2021-12-24T16:29:38.301 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/cephfs/filesystem.py", line 479, in delete_all_filesystems
2021-12-24T16:29:38.301 INFO:tasks.cephfs_test_runner:    for fs in self.status().get_filesystems():
2021-12-24T16:29:38.301 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/cephfs/filesystem.py", line 382, in status
2021-12-24T16:29:38.301 INFO:tasks.cephfs_test_runner:    return FSStatus(self.mon_manager, epoch)
2021-12-24T16:29:38.301 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/cephfs/filesystem.py", line 79, in __init__
2021-12-24T16:29:38.301 INFO:tasks.cephfs_test_runner:    self.map = json.loads(self.mon.raw_cluster_cmd(*cmd))
2021-12-24T16:29:38.301 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/ceph_manager.py", line 1581, in raw_cluster_cmd
2021-12-24T16:29:38.301 INFO:tasks.cephfs_test_runner:    p = self.run_cluster_cmd(args=args, stdout=stdout, **kwargs)
2021-12-24T16:29:38.302 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/ceph_manager.py", line 1574, in run_cluster_cmd
2021-12-24T16:29:38.302 INFO:tasks.cephfs_test_runner:    return self.controller.run(**kwargs)
2021-12-24T16:29:38.302 INFO:tasks.cephfs_test_runner:  File "/home/teuthology/src/teuthology_pacific/teuthology/orchestra/remote.py", line 509, in run
2021-12-24T16:29:38.302 INFO:tasks.cephfs_test_runner:    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
2021-12-24T16:29:38.302 INFO:tasks.cephfs_test_runner:  File "/home/teuthology/src/teuthology_pacific/teuthology/orchestra/run.py", line 455, in run
2021-12-24T16:29:38.302 INFO:tasks.cephfs_test_runner:    r.wait()
2021-12-24T16:29:38.302 INFO:tasks.cephfs_test_runner:  File "/home/teuthology/src/teuthology_pacific/teuthology/orchestra/run.py", line 161, in wait
2021-12-24T16:29:38.302 INFO:tasks.cephfs_test_runner:    self._raise_for_status()
2021-12-24T16:29:38.303 INFO:tasks.cephfs_test_runner:  File "/home/teuthology/src/teuthology_pacific/teuthology/orchestra/run.py", line 181, in _raise_for_status
2021-12-24T16:29:38.303 INFO:tasks.cephfs_test_runner:    raise CommandFailedError(
2021-12-24T16:29:38.303 INFO:tasks.cephfs_test_runner:teuthology.exceptions.CommandFailedError: Command failed on m1306035 with status 124: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph fs dump --format=json'
2021-12-24T16:29:38.303 INFO:tasks.cephfs_test_runner:
2021-12-24T16:29:38.303 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2021-12-24T16:29:38.303 INFO:tasks.cephfs_test_runner:Ran 1 test in 164.104s
2021-12-24T16:29:38.303 INFO:tasks.cephfs_test_runner:
2021-12-24T16:29:38.303 INFO:tasks.cephfs_test_runner:FAILED (errors=1)
2021-12-24T16:29:38.304 INFO:tasks.cephfs_test_runner:
2021-12-24T16:29:38.304 INFO:tasks.cephfs_test_runner:======================================================================
2021-12-24T16:29:38.304 INFO:tasks.cephfs_test_runner:ERROR: test_mount_all_caps_absent (tasks.cephfs.test_multifs_auth.TestClientsWithoutAuth)
2021-12-24T16:29:38.304 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2021-12-24T16:29:38.304 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2021-12-24T16:29:38.304 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/cephfs/cephfs_test_case.py", line 212, in tearDown
2021-12-24T16:29:38.304 INFO:tasks.cephfs_test_runner:    self.mds_cluster.delete_all_filesystems()
2021-12-24T16:29:38.304 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/cephfs/filesystem.py", line 479, in delete_all_filesystems
2021-12-24T16:29:38.305 INFO:tasks.cephfs_test_runner:    for fs in self.status().get_filesystems():
2021-12-24T16:29:38.305 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/cephfs/filesystem.py", line 382, in status
2021-12-24T16:29:38.305 INFO:tasks.cephfs_test_runner:    return FSStatus(self.mon_manager, epoch)
2021-12-24T16:29:38.305 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/cephfs/filesystem.py", line 79, in __init__
2021-12-24T16:29:38.305 INFO:tasks.cephfs_test_runner:    self.map = json.loads(self.mon.raw_cluster_cmd(*cmd))
2021-12-24T16:29:38.305 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/ceph_manager.py", line 1581, in raw_cluster_cmd
2021-12-24T16:29:38.305 INFO:tasks.cephfs_test_runner:    p = self.run_cluster_cmd(args=args, stdout=stdout, **kwargs)
2021-12-24T16:29:38.305 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/ceph_manager.py", line 1574, in run_cluster_cmd
2021-12-24T16:29:38.305 INFO:tasks.cephfs_test_runner:    return self.controller.run(**kwargs)
2021-12-24T16:29:38.306 INFO:tasks.cephfs_test_runner:  File "/home/teuthology/src/teuthology_pacific/teuthology/orchestra/remote.py", line 509, in run
2021-12-24T16:29:38.306 INFO:tasks.cephfs_test_runner:    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
2021-12-24T16:29:38.306 INFO:tasks.cephfs_test_runner:  File "/home/teuthology/src/teuthology_pacific/teuthology/orchestra/run.py", line 455, in run
2021-12-24T16:29:38.306 INFO:tasks.cephfs_test_runner:    r.wait()
2021-12-24T16:29:38.306 INFO:tasks.cephfs_test_runner:  File "/home/teuthology/src/teuthology_pacific/teuthology/orchestra/run.py", line 161, in wait
2021-12-24T16:29:38.306 INFO:tasks.cephfs_test_runner:    self._raise_for_status()
2021-12-24T16:29:38.306 INFO:tasks.cephfs_test_runner:  File "/home/teuthology/src/teuthology_pacific/teuthology/orchestra/run.py", line 181, in _raise_for_status
2021-12-24T16:29:38.306 INFO:tasks.cephfs_test_runner:    raise CommandFailedError(
2021-12-24T16:29:38.307 INFO:tasks.cephfs_test_runner:teuthology.exceptions.CommandFailedError: Command failed on m1306035 with status 124: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph fs dump --format=json'
2021-12-24T16:29:38.307 INFO:tasks.cephfs_test_runner:
2021-12-24T16:29:38.307 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthology/src/teuthology_pacific/teuthology/run_tasks.py", line 94, in run_tasks
    manager.__enter__()
  File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/teuthworker/src/github.com_ibm-s390-cloud_ceph_c39ba7d47040c91efe2793b55ab9465a9a4ec66b/qa/tasks/cephfs_test_runner.py", line 211, in task
    raise RuntimeError("Test failure: {0}".format(", ".join(bad_tests)))
RuntimeError: Test failure: test_mount_all_caps_absent (tasks.cephfs.test_multifs_auth.TestClientsWithoutAuth)
2021-12-24T16:29:38.307 DEBUG:teuthology.run_tasks:Unwinding manager cephfs_test_runner
2021-12-24T16:29:38.311 DEBUG:teuthology.run_tasks:Unwinding manager kclient
2021-12-24T16:29:38.315 INFO:tasks.kclient:Unmounting kernel clients...
2021-12-24T16:29:38.316 INFO:teuthology.orchestra.run:Running command with timeout 300
The status 124 above is the exit code of the `timeout 120` wrapper: `ceph fs dump --format=json` hangs because the monitor has already crashed. The segfault is recorded in ceph-mon.a.log.gz:
-15> 2021-12-24T16:27:34.503+0100 3ff937fe900 20 mon.a@1(peon).mgr e4 Sending map to subscriber 0x3ff6c065910 v2:172.18.232.30:6834/2136209486
-14> 2021-12-24T16:27:34.503+0100 3ff937fe900  1 -- [v2:172.18.232.35:3300/0,v1:172.18.232.35:6789/0] --> [v2:172.18.232.30:6834/2136209486,v1:172.18.232.30:6835/2136209486] -- mgrmap(e 4) v1 -- 0x3ff5c5d1d90 con 0x3ff6c065910
-13> 2021-12-24T16:27:34.503+0100 3ff937fe900 10 mon.a@1(peon).monmap v1 check_sub monmap next 2 have 1
-12> 2021-12-24T16:27:34.503+0100 3ff937fe900  1 -- [v2:172.18.232.35:3300/0,v1:172.18.232.35:6789/0] <== client.5551 v1:192.168.0.1:0/2933524942 3 ==== mon_subscribe({fsmap.user=0,monmap=2+,osdmap=0}) v2 ==== 65+0+0 (unknown 2575153023 0 0) 0x3ff5c4a9ad0 con 0x3ff6c01a3f0
-11> 2021-12-24T16:27:34.503+0100 3ff937fe900 20 mon.a@1(peon) e1 _ms_dispatch existing session 0x3ff6c0834c0 for client.5551
-10> 2021-12-24T16:27:34.503+0100 3ff937fe900 20 mon.a@1(peon) e1 entity_name client.testuser global_id 5551 (reclaim_ok) caps allow r fsname=cephfs
-9> 2021-12-24T16:27:34.503+0100 3ff937fe900 10 mon.a@1(peon) e1 handle_subscribe mon_subscribe({fsmap.user=0,monmap=2+,osdmap=0}) v2
-8> 2021-12-24T16:27:34.503+0100 3ff937fe900 20 is_capable service=mon command= read addr v1:192.168.0.1:0/2933524942 on cap allow r
-7> 2021-12-24T16:27:34.503+0100 3ff937fe900 20  allow so far , doing grant allow r
-6> 2021-12-24T16:27:34.503+0100 3ff937fe900 20  match
-5> 2021-12-24T16:27:34.503+0100 3ff937fe900 10 mon.a@1(peon) e1 handle_subscribe: MDS sub 'fsmap.user'
-4> 2021-12-24T16:27:34.503+0100 3ff937fe900 20 is_capable service=mds command= read addr v1:192.168.0.1:0/2933524942 on cap allow r
-3> 2021-12-24T16:27:34.503+0100 3ff937fe900 20  allow so far , doing grant allow r
-2> 2021-12-24T16:27:34.503+0100 3ff937fe900 20  match
-1> 2021-12-24T16:27:34.503+0100 3ff937fe900 20 mon.a@1(peon).mds e18 check_sub: fsmap.user
 0> 2021-12-24T16:27:34.513+0100 3ff937fe900 -1 *** Caught signal (Segmentation fault) **
 in thread 3ff937fe900 thread_name:ms_dispatch

 ceph version 16.2.6-710-geaff0ba3695 (eaff0ba3695f9a68cd1eda6939be4347a55bf703) pacific (stable)
 1: [0x3ff937fa35e]
 2: (MDSMonitor::check_sub(Subscription*)+0x2a0) [0x2aa2d55f008]
 3: (Monitor::handle_subscribe(boost::intrusive_ptr<MonOpRequest>)+0x13ec) [0x2aa2d2f5314]
 4: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x680) [0x2aa2d327d48]
 5: (Monitor::_ms_dispatch(Message*)+0x34e) [0x2aa2d328ebe]
 6: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x82) [0x2aa2d3647ca]
 7: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x484) [0x3ffa6835d14]
 8: (DispatchQueue::entry()+0x53c) [0x3ffa683332c]
 9: (DispatchQueue::DispatchThread::entry()+0x18) [0x3ffa68fbd98]
 10: /lib/s390x-linux-gnu/libpthread.so.0(+0x9986) [0x3ffa6089986]
 11: /lib/s390x-linux-gnu/libc.so.6(+0x103cc6) [0x3ffa5b03cc6]
 12: [(nil)]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
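The backtrace points at MDSMonitor::check_sub(Subscription*) while handling an fsmap.user subscription from a client whose caps are restricted to "allow r fsname=cephfs", just as the test teardown is deleting every filesystem. The snippet below is a minimal, hypothetical C++ sketch of that kind of hazard: it is NOT the actual Ceph/MDSMonitor source, and all type and function names in it are invented for illustration. It only models an unchecked lookup of a filesystem that no longer exists; whether the real crash follows exactly this pattern needs to be confirmed against the Pacific MDSMonitor code and a coredump.

// Hypothetical illustration only -- not the actual MDSMonitor code.
// Models a subscription handler that resolves the filesystem named in a
// client cap and dereferences the result without checking that the
// filesystem still exists after deletion.
#include <iostream>
#include <map>
#include <memory>
#include <string>

struct Filesystem {
  int epoch = 18;
};

struct FSMap {
  std::map<std::string, std::shared_ptr<Filesystem>> by_name;

  // Returns nullptr when the named filesystem has been removed.
  std::shared_ptr<Filesystem> get_filesystem(const std::string& name) const {
    auto it = by_name.find(name);
    return it == by_name.end() ? nullptr : it->second;
  }
};

// Unsafe pattern: assumes the filesystem named in the client's cap exists.
int check_sub_unsafe(const FSMap& fsmap, const std::string& cap_fsname) {
  auto fs = fsmap.get_filesystem(cap_fsname);
  return fs->epoch;  // null dereference (segfault) if the fs was deleted
}

// Defensive pattern: tolerate a filesystem that no longer exists.
int check_sub_safe(const FSMap& fsmap, const std::string& cap_fsname) {
  auto fs = fsmap.get_filesystem(cap_fsname);
  return fs ? fs->epoch : 0;
}

int main() {
  FSMap fsmap;  // all filesystems already deleted: the map is empty
  std::cout << check_sub_safe(fsmap, "cephfs") << "\n";  // prints 0
  // check_sub_unsafe(fsmap, "cephfs");                  // would crash
}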