Bug #54271
mds/OpenFileTable.cc: 777: FAILED ceph_assert(omap_num_objs == num_objs)
% Done:
0%
Source:
Community (user)
Tags:
Backport:
quincy,pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
MDS Logs
---
2020-05-01 10:07:35.559 7eff10cc3700 1 mds.prdceph01 9: 'ceph'
2020-05-01 10:07:35.559 7eff10cc3700 1 mds.prdceph01 respawning with exe /usr/bin/ceph-mds
2020-05-01 10:07:35.559 7eff10cc3700 1 mds.prdceph01 exe_path /proc/self/exe
2020-05-01 10:07:50.785 7fbff66291c0 0 ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable), process ceph-mds, pid 9710
2020-05-01 10:07:50.787 7fbff66291c0 0 pidfile_write: ignore empty --pid-file
2020-05-01 10:07:50.817 7fbfe4408700 1 mds.prdceph01 Updating MDS map to version 1487238 from mon.2
2020-05-01 10:07:55.820 7fbfe4408700 1 mds.prdceph01 Updating MDS map to version 1487239 from mon.2
2020-05-01 10:07:55.820 7fbfe4408700 1 mds.prdceph01 Map has assigned me to become a standby
2020-05-01 10:11:07.369 7fbfe4408700 1 mds.prdceph01 Updating MDS map to version 1487282 from mon.2
2020-05-01 10:11:07.373 7fbfe4408700 1 mds.0.1487282 handle_mds_map i am now mds.0.1487282
2020-05-01 10:11:07.373 7fbfe4408700 1 mds.0.1487282 handle_mds_map state change up:boot --> up:replay
2020-05-01 10:11:07.373 7fbfe4408700 1 mds.0.1487282 replay_start
2020-05-01 10:11:07.374 7fbfe4408700 1 mds.0.1487282 recovery set is
2020-05-01 10:11:07.374 7fbfe4408700 1 mds.0.1487282 waiting for osdmap 2198096 (which blacklists prior instance)
2020-05-01 10:11:09.517 7fbfdd3fa700 0 mds.0.cache creating system inode with ino:0x100
2020-05-01 10:11:09.518 7fbfdd3fa700 0 mds.0.cache creating system inode with ino:0x1
2020-05-01 10:11:09.753 7fbfddbfb700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/OpenFileTable.cc: In function 'void OpenFileTable::_load_finish(int, int, int, unsigned int, bool, bool, ceph::bufferlist&, std::map<std::basic_string<char>, ceph::buffer::v14_2_0::list>&)' thread 7fbfddbfb700 time 2020-05-01 10:11:09.752945
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/OpenFileTable.cc: 777: FAILED ceph_assert(omap_num_objs == num_objs)
 ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x7fbfed691875]
 2: (()+0x253a3d) [0x7fbfed691a3d]
 3: (OpenFileTable::_load_finish(int, int, int, unsigned int, bool, bool, ceph::buffer::v14_2_0::list&, std::map<std::string, ceph::buffer::v14_2_0::list, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::v14_2_0::list> > ... &)+0x1bb0) [0x563c26b1e8b0]
 4: (C_IO_OFT_Load::finish(int)+0x3b) [0x563c26b2467b]
 5: (MDSContext::complete(int)+0x74) [0x563c26afbce4]
 6: (MDSIOContextBase::complete(int)+0x177) [0x563c26afbf47]
 7: (Finisher::finisher_thread_entry()+0x16f) [0x7fbfed71a44f]
 8: (()+0x7e65) [0x7fbfeb551e65]
 9: (clone()+0x6d) [0x7fbfea1ff88d]
Updated by Venky Shankar about 2 years ago
This seems to be happening in nautilus. We should check whether it can also be hit in master (quincy, etc.).
Updated by Venky Shankar about 2 years ago
Just FYI - the workaround is to remove the mds<>_openfiles objects from the metadata pool.
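A rough sketch of how the workaround above might be prepared, offered as an illustration rather than a verified procedure. It takes object names as listed by `rados -p <pool> ls`, picks out the open file table objects, and prints the matching `rados rm` commands without running them. The naming pattern `mds<rank>_openfiles.<hex index>`, the pool name `cephfs_metadata`, and the helper names `openfiles_objects` and `removal_commands` are assumptions made for this example and are not stated in this report. It is presumably safest to run any such removal only while the affected MDS is stopped.

```python
import re
import shlex

# ASSUMPTION: open file table objects are named mds<rank>_openfiles.<hex index>,
# for example mds0_openfiles.0. This pattern is not confirmed by the report.
OPENFILES_RE = re.compile(r"^mds(\d+)_openfiles\.[0-9a-f]+$")


def openfiles_objects(object_names, rank=None):
    """Return the sorted object names that look like open file table objects.

    If rank is given, keep only the objects belonging to that MDS rank.
    """
    selected = []
    for raw in object_names:
        name = raw.strip()
        match = OPENFILES_RE.match(name)
        if match and (rank is None or int(match.group(1)) == rank):
            selected.append(name)
    return sorted(selected)


def removal_commands(pool, object_names, rank=None):
    """Build (but do not run) one 'rados rm' command per matching object."""
    return [
        "rados -p {} rm {}".format(shlex.quote(pool), shlex.quote(name))
        for name in openfiles_objects(object_names, rank)
    ]


if __name__ == "__main__":
    # Hypothetical listing, as might be returned by `rados -p cephfs_metadata ls`.
    listing = ["mds0_openfiles.0", "mds0_inotable", "mds1_openfiles.0", "600.00000000"]
    for command in removal_commands("cephfs_metadata", listing, rank=0):
        print(command)
```

Printing the commands instead of executing them keeps this a dry run, so the list can be reviewed against the actual pool contents before anything is removed.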
Updated by Xiubo Li about 2 years ago
- Related to Bug #54253: Avoid OOM exceeding 10x MDS cache limit on restart after many files were opened added
Updated by Venky Shankar about 2 years ago
- Status changed from New to Triaged
- Assignee set to Kotresh Hiremath Ravishankar
Updated by Kotresh Hiremath Ravishankar over 1 year ago
- Description updated
Updated by Kotresh Hiremath Ravishankar over 1 year ago
- Priority changed from High to Low
Lowering the priority, as this has been seen only in nautilus and not in any supported version.
Updated by Kotresh Hiremath Ravishankar over 1 year ago
We will wait to see whether this reproduces in recent versions.