Bug #57147
openqa: test_full_fsync (tasks.cephfs.test_full.TestClusterFull) failure
% Done:
0%
Regression:
No
Severity:
3 - minor
Description
Teuthology run: https://pulpito.ceph.com/yuriw-2022-08-11_16:57:01-fs-wip-yuri3-testing-2022-08-11-0809-pacific-distro-default-smithi/6968267
The MDS did not become healthy and the test timed out.
2022-08-11T23:44:41.536 INFO:tasks.cephfs_test_runner:======================================================================
2022-08-11T23:44:41.536 INFO:tasks.cephfs_test_runner:ERROR: test_full_fsync (tasks.cephfs.test_full.TestClusterFull)
2022-08-11T23:44:41.536 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2022-08-11T23:44:41.537 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2022-08-11T23:44:41.537 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_eb4319a2b19ca3fba01742173e97dd5b50b2f291/qa/tasks/cephfs/test_full.py", line 395, in setUp
2022-08-11T23:44:41.537 INFO:tasks.cephfs_test_runner:    super(TestClusterFull, self).setUp()
2022-08-11T23:44:41.537 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_eb4319a2b19ca3fba01742173e97dd5b50b2f291/qa/tasks/cephfs/test_full.py", line 32, in setUp
2022-08-11T23:44:41.538 INFO:tasks.cephfs_test_runner:    CephFSTestCase.setUp(self)
2022-08-11T23:44:41.538 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_eb4319a2b19ca3fba01742173e97dd5b50b2f291/qa/tasks/cephfs/cephfs_test_case.py", line 169, in setUp
2022-08-11T23:44:41.538 INFO:tasks.cephfs_test_runner:    self.fs.wait_for_daemons()
2022-08-11T23:44:41.539 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_eb4319a2b19ca3fba01742173e97dd5b50b2f291/qa/tasks/cephfs/filesystem.py", line 1108, in wait_for_daemons
2022-08-11T23:44:41.539 INFO:tasks.cephfs_test_runner:    raise RuntimeError("Timed out waiting for MDS daemons to become healthy")
2022-08-11T23:44:41.539 INFO:tasks.cephfs_test_runner:RuntimeError: Timed out waiting for MDS daemons to become healthy
2022-08-11T23:44:41.539 INFO:tasks.cephfs_test_runner:
2022-08-11T23:44:41.540 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
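The failure is raised by the poll-until-healthy loop in `qa/tasks/cephfs/filesystem.py` (`wait_for_daemons()`), which gives up after a fixed timeout. A minimal sketch of that pattern, with hypothetical names (`get_mds_states`, the injectable `clock`/`sleep`), not the actual QA-suite code:

```python
import time

def wait_for_daemons(get_mds_states, timeout=300, interval=1,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until every MDS reports a healthy state, or raise on timeout.

    get_mds_states: callable returning the current MDS state strings,
    e.g. ["up:active"].  All names here are hypothetical; the real loop
    lives in qa/tasks/cephfs/filesystem.py.
    """
    healthy = {"up:active", "up:standby", "up:standby-replay"}
    deadline = clock() + timeout
    while clock() < deadline:
        states = get_mds_states()
        if states and all(s in healthy for s in states):
            return states
        sleep(interval)
    # An MDS stuck in e.g. up:creating never satisfies the check above,
    # so the loop eventually raises -- matching the traceback in this report.
    raise RuntimeError("Timed out waiting for MDS daemons to become healthy")
```

With an MDS pinned in `up:creating` (as suspected below, because its backing OSD crashed), no amount of polling helps and the setup fails with the RuntimeError seen in the log.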
I think the OSD that backed the MDS crashed, causing the MDS to get stuck in the up:creating state.
ceph version 16.2.10-668-geb4319a2 (eb4319a2b19ca3fba01742173e97dd5b50b2f291) pacific (stable)
 1: /lib64/libpthread.so.0(+0x12b20) [0x7f19659eeb20]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1b6) [0x556d542ce711]
 5: (ReplicatedBackend::_do_push(boost::intrusive_ptr<OpRequest>)+0x198) [0x556d5477b758]
 6: (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2a8) [0x556d5477d8f8]
 7: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x52) [0x556d545ad242]
 8: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x5de) [0x556d545509fe]
 9: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x309) [0x556d543d7b39]
 10: (ceph::osd::scheduler::PGRecoveryMsg::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x68) [0x556d54637328]
 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xc28) [0x556d543f51b8]
 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) [0x556d54a74a64]
 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x556d54a77944]
 14: /lib64/libpthread.so.0(+0x814a) [0x7f19659e414a]
 15: clone()
The crash log can be found on teuthology at `/a/yuriw-2022-08-11_16:57:01-fs-wip-yuri3-testing-2022-08-11-0809-pacific-distro-default-smithi/6968267/remote/smithi163/crash/posted/2022-08-11T23:56:04.325332Z_e45de76b-08f9-4145-bc3c-5dd9acb3942d`