Bug #58878
mds: FAILED ceph_assert(trim_to > trimming_pos)
Status: New
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source:
Tags:
Backport: reef,quincy,pacific
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
One of the MDS daemons crashed with the following backtrace:
-24> 2023-02-10T03:40:41.099+0000 7f00b8981700 1 mds.ocs-storagecluster-cephfilesystem-b Updating MDS map to version 2590 from mon.1
-23> 2023-02-10T03:40:41.099+0000 7f00b8981700 4 mds.0.2313 set_osd_epoch_barrier: epoch=16688
-22> 2023-02-10T03:40:41.101+0000 7f00bc188700 5 mds.beacon.ocs-storagecluster-cephfilesystem-b received beacon reply up:active seq 785 rtt 0.612017
-21> 2023-02-10T03:40:41.674+0000 7f00b4979700 2 mds.0.cache Memory usage: total 998012, rss 335560, heap 356620, baseline 332044, 2411 / 32894 inodes have caps, 2412 caps, 0.0733264 caps per inode
-20> 2023-02-10T03:40:41.793+0000 7f00b8981700 3 mds.0.server handle_client_session client_session(request_renewcaps seq 46677) from client.24866268
-19> 2023-02-10T03:40:41.885+0000 7f00b797f700 10 monclient: tick
-18> 2023-02-10T03:40:41.885+0000 7f00b797f700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2023-02-10T03:40:11.886692+0000)
-17> 2023-02-10T03:40:42.068+0000 7f00b8981700 4 mds.0.2313 apply_blocklist: killed 0 blocklisted sessions (0 blocklist entries, 2)
-16> 2023-02-10T03:40:42.068+0000 7f00b8981700 10 monclient: _renew_subs
-15> 2023-02-10T03:40:42.068+0000 7f00b8981700 10 monclient: _send_mon_message to mon.d at v2:172.30.8.118:3300/0
-14> 2023-02-10T03:40:42.108+0000 7f00b8981700 1 mds.ocs-storagecluster-cephfilesystem-b Updating MDS map to version 2591 from mon.1
-13> 2023-02-10T03:40:42.108+0000 7f00b8981700 4 mds.0.2313 set_osd_epoch_barrier: epoch=16689
-12> 2023-02-10T03:40:42.674+0000 7f00b4979700 2 mds.0.cache Memory usage: total 998012, rss 335560, heap 356620, baseline 332044, 2411 / 32894 inodes have caps, 2412 caps, 0.0733264 caps per inode
-11> 2023-02-10T03:40:42.885+0000 7f00b797f700 10 monclient: tick
-10> 2023-02-10T03:40:42.885+0000 7f00b797f700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2023-02-10T03:40:12.886850+0000)
-9> 2023-02-10T03:40:43.002+0000 7f00b8981700 1 mds.ocs-storagecluster-cephfilesystem-b Updating MDS map to version 2592 from mon.1
-8> 2023-02-10T03:40:43.002+0000 7f00b8981700 4 mds.0.2313 set_osd_epoch_barrier: epoch=16689
-7> 2023-02-10T03:40:43.675+0000 7f00b4979700 2 mds.0.cache Memory usage: total 998012, rss 335560, heap 356620, baseline 332044, 2411 / 32894 inodes have caps, 2412 caps, 0.0733264 caps per inode
-6> 2023-02-10T03:40:43.885+0000 7f00b797f700 10 monclient: tick
-5> 2023-02-10T03:40:43.885+0000 7f00b797f700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2023-02-10T03:40:13.886984+0000)
-4> 2023-02-10T03:40:44.185+0000 7f00b8981700 4 mds.0.2313 apply_blocklist: killed 0 blocklisted sessions (0 blocklist entries, 2)
-3> 2023-02-10T03:40:44.222+0000 7f00b8981700 4 mds.0.2313 apply_blocklist: killed 0 blocklisted sessions (0 blocklist entries, 2)
-2> 2023-02-10T03:40:44.298+0000 7f00b2975700 -1 /builddir/build/BUILD/ceph-16.2.7/src/osdc/Journaler.cc: In function 'void Journaler::_trim()' thread 7f00b2975700 time 2023-02-10T03:40:44.298139+0000
/builddir/build/BUILD/ceph-16.2.7/src/osdc/Journaler.cc: 1358: FAILED ceph_assert(trim_to > trimming_pos)
ceph version 16.2.7-126.el8cp (fe0af61d104d48cb9d116cde6e593b5fc8c197e4) pacific (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f00c119acfe]
2: /usr/lib64/ceph/libceph-common.so.2(+0x276f18) [0x7f00c119af18]
3: (Journaler::_trim()+0x807) [0x55e95e8d0447]
4: (Journaler::_finish_write_head(int, Journaler::Header&, C_OnFinisher*)+0x208) [0x55e95e8d0ab8]
5: (Context::complete(int)+0xd) [0x55e95e56a57d]
6: (Finisher::finisher_thread_entry()+0x1a5) [0x7f00c123c1e5]
7: /lib64/libpthread.so.0(+0x81cf) [0x7f00c017e1cf]
8: clone()
No recovery steps were performed and no CephFS recovery tools were used on the file system. Logs are limited. This looks like a possible bug in mdlog.