Bug #58878


mds: FAILED ceph_assert(trim_to > trimming_pos)

Added by Venky Shankar about 1 year ago. Updated 3 months ago.

Status:
New
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
% Done:

0%

Source:
Tags:
Backport:
reef,quincy,pacific
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

One of the MDSs crashed with the following backtrace:

   -24> 2023-02-10T03:40:41.099+0000 7f00b8981700  1 mds.ocs-storagecluster-cephfilesystem-b Updating MDS map to version 2590 from mon.1
   -23> 2023-02-10T03:40:41.099+0000 7f00b8981700  4 mds.0.2313 set_osd_epoch_barrier: epoch=16688
   -22> 2023-02-10T03:40:41.101+0000 7f00bc188700  5 mds.beacon.ocs-storagecluster-cephfilesystem-b received beacon reply up:active seq 785 rtt 0.612017
   -21> 2023-02-10T03:40:41.674+0000 7f00b4979700  2 mds.0.cache Memory usage:  total 998012, rss 335560, heap 356620, baseline 332044, 2411 / 32894 inodes have caps, 2412 caps, 0.0733264 caps per inode
   -20> 2023-02-10T03:40:41.793+0000 7f00b8981700  3 mds.0.server handle_client_session client_session(request_renewcaps seq 46677) from client.24866268
   -19> 2023-02-10T03:40:41.885+0000 7f00b797f700 10 monclient: tick
   -18> 2023-02-10T03:40:41.885+0000 7f00b797f700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2023-02-10T03:40:11.886692+0000)
   -17> 2023-02-10T03:40:42.068+0000 7f00b8981700  4 mds.0.2313 apply_blocklist: killed 0 blocklisted sessions (0 blocklist entries, 2)
   -16> 2023-02-10T03:40:42.068+0000 7f00b8981700 10 monclient: _renew_subs
   -15> 2023-02-10T03:40:42.068+0000 7f00b8981700 10 monclient: _send_mon_message to mon.d at v2:172.30.8.118:3300/0
   -14> 2023-02-10T03:40:42.108+0000 7f00b8981700  1 mds.ocs-storagecluster-cephfilesystem-b Updating MDS map to version 2591 from mon.1
   -13> 2023-02-10T03:40:42.108+0000 7f00b8981700  4 mds.0.2313 set_osd_epoch_barrier: epoch=16689
   -12> 2023-02-10T03:40:42.674+0000 7f00b4979700  2 mds.0.cache Memory usage:  total 998012, rss 335560, heap 356620, baseline 332044, 2411 / 32894 inodes have caps, 2412 caps, 0.0733264 caps per inode
   -11> 2023-02-10T03:40:42.885+0000 7f00b797f700 10 monclient: tick
   -10> 2023-02-10T03:40:42.885+0000 7f00b797f700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2023-02-10T03:40:12.886850+0000)
    -9> 2023-02-10T03:40:43.002+0000 7f00b8981700  1 mds.ocs-storagecluster-cephfilesystem-b Updating MDS map to version 2592 from mon.1
    -8> 2023-02-10T03:40:43.002+0000 7f00b8981700  4 mds.0.2313 set_osd_epoch_barrier: epoch=16689
    -7> 2023-02-10T03:40:43.675+0000 7f00b4979700  2 mds.0.cache Memory usage:  total 998012, rss 335560, heap 356620, baseline 332044, 2411 / 32894 inodes have caps, 2412 caps, 0.0733264 caps per inode
    -6> 2023-02-10T03:40:43.885+0000 7f00b797f700 10 monclient: tick
    -5> 2023-02-10T03:40:43.885+0000 7f00b797f700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2023-02-10T03:40:13.886984+0000)
    -4> 2023-02-10T03:40:44.185+0000 7f00b8981700  4 mds.0.2313 apply_blocklist: killed 0 blocklisted sessions (0 blocklist entries, 2)
    -3> 2023-02-10T03:40:44.222+0000 7f00b8981700  4 mds.0.2313 apply_blocklist: killed 0 blocklisted sessions (0 blocklist entries, 2)
    -2> 2023-02-10T03:40:44.298+0000 7f00b2975700 -1 /builddir/build/BUILD/ceph-16.2.7/src/osdc/Journaler.cc: In function 'void Journaler::_trim()' thread 7f00b2975700 time 2023-02-10T03:40:44.298139+0000
/builddir/build/BUILD/ceph-16.2.7/src/osdc/Journaler.cc: 1358: FAILED ceph_assert(trim_to > trimming_pos)

 ceph version 16.2.7-126.el8cp (fe0af61d104d48cb9d116cde6e593b5fc8c197e4) pacific (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f00c119acfe]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x276f18) [0x7f00c119af18]
 3: (Journaler::_trim()+0x807) [0x55e95e8d0447]
 4: (Journaler::_finish_write_head(int, Journaler::Header&, C_OnFinisher*)+0x208) [0x55e95e8d0ab8]
 5: (Context::complete(int)+0xd) [0x55e95e56a57d]
 6: (Finisher::finisher_thread_entry()+0x1a5) [0x7f00c123c1e5]
 7: /lib64/libpthread.so.0(+0x81cf) [0x7f00c017e1cf]
 8: clone()

No recovery steps were performed and no cephfs recovery tools were used on the file system. Logs are limited. This looks like a possible bug in mdlog.
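
For context, the failing assert sits at the end of the precondition checks in Journaler::_trim(): the trim target is derived from the expire_pos in the last committed header, so it can only end up at or below trimming_pos if the persisted position went backwards. A rough paraphrase of that logic (not a verbatim quote of the 16.2.7 sources; names follow src/osdc/Journaler.cc):

// Paraphrased from Journaler::_trim() (src/osdc/Journaler.cc);
// the exact code in ceph 16.2.7 may differ slightly.
uint64_t trim_to = last_committed.expire_pos;  // from the persisted header
trim_to -= trim_to % get_layout_period();      // round down to a period boundary
if (trim_to == 0 || trim_to == trimming_pos)
  return;                                      // nothing new to trim
if (trimming_pos > trimmed_pos)
  return;                                      // a trim is already in flight
ceph_assert(trim_to > trimming_pos);           // fails if expire_pos moved backwards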


Related issues: 1 (1 open, 0 closed)

Related to CephFS - Feature #64101: tools/cephfs: toolify updating mdlog journal pointers to a sane value (status: New; assignee: Milind Changire)

Actions #1

Updated by Venky Shankar about 1 year ago

  • Status changed from New to Can't reproduce

This was suspected to be the result of various metadata inconsistencies, which probably surfaced because destructive tools were run inappropriately.

Please reopen if this is seen again.

Actions #2

Updated by Jos Collin 7 months ago

  • Status changed from Can't reproduce to New
  • Target version deleted (v18.0.0)
Actions #3

Updated by Venky Shankar 7 months ago

  • Target version set to v19.0.0
  • Backport changed from pacific,quincy to reef,quincy,pacific

Reopened for investigating a possible bug in the MDS that causes bogus values to be persisted in the journal header.

Actions #4

Updated by Venky Shankar 6 months ago

Greg pointed out that this tracker lacks details of why it got reopened, so here are some details from discussions that happened in various other channels (Slack, etc.):

I suspect Journaler::write_head() is being invoked concurrently, with the journal positions getting mangled based on which RADOS call returns first. Look at Journaler::_finish_write_head():

void Journaler::_finish_write_head(int r, Header &wrote,
                                   C_OnFinisher *oncommit)
{
  lock_guard l(lock);

  if (r < 0) {
    lderr(cct) << "_finish_write_head got " << cpp_strerror(r) << dendl;
    handle_write_error(r);
    return;
  }
  ceph_assert(!readonly);
  ldout(cct, 10) << "_finish_write_head " << wrote << dendl;
  last_committed = wrote;
  if (oncommit) {
    oncommit->complete(r);
  }

  _trim();  // trim?
}

Maybe just to be safe and catch this particular case (out-of-order reply), _finish_write_head() can (derr) log when wrote < last_committed.
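
Something like this, perhaps (a hypothetical sketch; Journaler::Header has no comparison operator, so the field-wise check on write_pos/expire_pos/trimmed_pos below is an assumption):

// Hypothetical check near the top of Journaler::_finish_write_head(),
// after the r < 0 error path: flag a reply that would move the
// committed header backwards instead of silently overwriting it.
if (wrote.write_pos < last_committed.write_pos ||
    wrote.expire_pos < last_committed.expire_pos ||
    wrote.trimmed_pos < last_committed.trimmed_pos) {
  lderr(cct) << "_finish_write_head: out-of-order write_head reply: wrote "
             << wrote << " is behind last_committed " << last_committed
             << dendl;
}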

Greg mentioned that the journal ordering has been audited a few times in the past, especially for cases like this, to be certain that it's not broken; however, it's possible that some rules were broken when it got moved to its own thread.

Also, just FYI: the case which prompted reopening this tracker had Patrick suggesting fixing up the journal positions manually, but that's a bit tricky since the read position should point to a SUBTREEMAP event rather than the beginning of the object. We should have tools to do this rather than doing it by hand (a rough sketch of the manual procedure is below).
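
For illustration, the manual fixup would look roughly like this with cephfs-journal-tool (a sketch, not a recommendation; <fs_name> and <offset> are placeholders, and the --type filter value is assumed to match the event name):

# Dump the current journal header for rank 0.
cephfs-journal-tool --rank=<fs_name>:0 header get

# List SUBTREEMAP events to find a sane offset for the read position.
cephfs-journal-tool --rank=<fs_name>:0 event get --type=SUBTREEMAP list

# Rewrite the journal pointer by hand (dangerous; see the caveat above).
cephfs-journal-tool --rank=<fs_name>:0 header set expire_pos <offset>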

Actions #5

Updated by Venky Shankar 3 months ago

  • Related to Feature #64101: tools/cephfs: toolify updating mdlog journal pointers to a sane value added
Actions #6

Updated by Venky Shankar 3 months ago

Debug change to possibly catch an out-of-order journal pointer update: https://github.com/ceph/ceph/pull/55265
