Bug #53155

MDSMonitor: assertion during upgrade to v16.2.5+

Added by Patrick Donnelly about 1 year ago. Updated 9 months ago.

Status: Resolved
Priority: Urgent
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDSMonitor
Labels (FS): crash, qa, qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-11-04T03:09:45.523 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/mds/FSMap.cc: In function 'void FSMap::sanity() const' thread 7f3d20499700 time 2021-11-04T03:09:45.179041+0000
2021-11-04T03:09:45.524 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/mds/FSMap.cc: 845: FAILED ceph_assert(fs->mds_map.compat.compare(compat) == 0)
2021-11-04T03:09:45.524 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  ceph version 16.2.4 (3cbe25cde3cfa028984618ad32de9edc4c1eaed0) pacific (stable)
2021-11-04T03:09:45.524 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f3d2e01259c]
2021-11-04T03:09:45.524 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  2: /usr/lib64/ceph/libceph-common.so.2(+0x2767b6) [0x7f3d2e0127b6]
2021-11-04T03:09:45.525 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  3: (FSMap::sanity() const+0xcd) [0x7f3d2e552bed]
2021-11-04T03:09:45.525 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  4: (MDSMonitor::update_from_paxos(bool*)+0x378) [0x561a8d0c8c28]
2021-11-04T03:09:45.525 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  5: (PaxosService::refresh(bool*)+0x10e) [0x561a8cfea64e]
2021-11-04T03:09:45.525 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  6: (Monitor::refresh_from_paxos(bool*)+0x18c) [0x561a8ce9dd1c]
2021-11-04T03:09:45.526 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  7: (Paxos::do_refresh()+0x57) [0x561a8cfdcbb7]
2021-11-04T03:09:45.526 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  8: (Paxos::handle_commit(boost::intrusive_ptr<MonOpRequest>)+0x309) [0x561a8cfdd049]
2021-11-04T03:09:45.526 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  9: (Paxos::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x457) [0x561a8cfe54f7]
2021-11-04T03:09:45.526 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  10: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x1324) [0x561a8ced8e84]
2021-11-04T03:09:45.527 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  11: (Monitor::_ms_dispatch(Message*)+0x670) [0x561a8ced9910]
2021-11-04T03:09:45.527 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  12: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x5c) [0x561a8cf07fdc]
2021-11-04T03:09:45.527 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  13: (DispatchQueue::entry()+0x126a) [0x7f3d2e24cb1a]
2021-11-04T03:09:45.527 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  14: (DispatchQueue::DispatchThread::entry()+0x11) [0x7f3d2e2fcb71]
2021-11-04T03:09:45.528 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  15: /lib64/libpthread.so.0(+0x814a) [0x7f3d2bb0214a]
2021-11-04T03:09:45.528 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]:  16: clone()
2021-11-04T03:09:45.528 INFO:journalctl@ceph.mon.smithi148.smithi148.stdout:Nov 04 03:09:45 smithi148 conmon[33926]: debug *** Caught signal (Aborted) **

From: /ceph/teuthology-archive/pdonnell-2021-11-04_02:42:02-fs:upgrade-wip-pdonnell-testing-20211103.235257-distro-basic-smithi/6483729/teuthology.log

This will need a change so that the sanity checks are no longer run when upgrading. It is no longer the case that FSMap::compat == MDSMap::compat for every file system.
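To illustrate why the equality check trips, here is a minimal, self-contained sketch. It is not the Ceph implementation: ToyCompat, ToyFilesystem, ToyFSMap, strict_sanity(), sanity(bool) and make_diverged_map() are all hypothetical names invented for this example, and the compare() here only distinguishes "equal" from "not equal". It shows a strict check that rejects any file system whose compat set differs from the map-wide one, next to a relaxed check that can be told to skip the comparison (for instance while an upgrade is in progress).

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a compat set: only the feature ids matter here.
struct ToyCompat {
  std::set<std::uint64_t> features;

  // Returns 0 when both sets hold exactly the same features, 1 otherwise.
  // (Simplified on purpose; this is not Ceph's CompatSet::compare.)
  int compare(const ToyCompat& other) const {
    return features == other.features ? 0 : 1;
  }
};

// Hypothetical per-file-system record carrying its own compat set.
struct ToyFilesystem {
  std::string name;
  ToyCompat compat;
};

// Hypothetical map-wide structure with one default compat set.
struct ToyFSMap {
  ToyCompat compat;
  std::vector<ToyFilesystem> filesystems;

  // Mirrors the failing invariant: every file system must match the
  // map-wide compat set. Returns false where the real code would assert.
  bool strict_sanity() const {
    for (const auto& fs : filesystems) {
      if (fs.compat.compare(compat) != 0) {
        return false;
      }
    }
    return true;
  }

  // Relaxed variant: when check_compat is false (e.g. while an upgrade is
  // in flight), per-file-system compat sets are allowed to diverge.
  bool sanity(bool check_compat) const {
    return check_compat ? strict_sanity() : true;
  }
};

// Builds a map in which one file system has gained an extra feature id.
inline ToyFSMap make_diverged_map() {
  ToyFSMap m;
  m.compat.features = {1, 2, 3};
  m.filesystems.push_back({"a", ToyCompat{{1, 2, 3}}});
  m.filesystems.push_back({"b", ToyCompat{{1, 2, 3, 4}}});
  return m;
}
```

With the map from make_diverged_map(), strict_sanity() reports a failure because file system "b" carries a feature the map-wide set lacks, while sanity(false) accepts the same map. Whether the actual fix skips the check entirely or narrows it is not stated in this report.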


Related issues

Copied to CephFS - Backport #53231: pacific: MDSMonitor: assertion during upgrade to v16.2.5+ Resolved

History

#1 Updated by Patrick Donnelly about 1 year ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 43800

#2 Updated by Patrick Donnelly about 1 year ago

  • Status changed from Fix Under Review to Pending Backport

#3 Updated by Backport Bot about 1 year ago

  • Copied to Backport #53231: pacific: MDSMonitor: assertion during upgrade to v16.2.5+ added

#4 Updated by Loïc Dachary about 1 year ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
