Bug #63743

open

MDS crash on ceph_assert(in->is_auth())

Added by Andrea Bolzonella 5 months ago. Updated 4 months ago.

Status: New
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The MDS crashed and keeps crashing on the same assert.

Nov 15 21:59:44 r07s08 bash[2466508]: debug 2023-11-15T21:59:44.161+0000 7f009da88700  1 mds.fs_name.r07s08.rgbvub Updating MDS map to version 5369671 from mon.3
Nov 15 21:59:44 r07s08 bash[2466508]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.5/rpm/el8/BUILD/ceph-17.2.5/src/mds/Locker.cc: In function 'bool Locker::check_inode_max_size(CInode*, bool, uint64_t, uint64_t, utime_t)' thread 7f009da88700 time 2023-11-15T21:59:44.969703+0000
Nov 15 21:59:44 r07s08 bash[2466508]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.5/rpm/el8/BUILD/ceph-17.2.5/src/mds/Locker.cc: 2807: FAILED ceph_assert(in->is_auth())
Nov 15 21:59:44 r07s08 bash[2466508]:  ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)
Nov 15 21:59:44 r07s08 bash[2466508]:  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f00a66bf43f]
Nov 15 21:59:44 r07s08 bash[2466508]:  2: /usr/lib64/ceph/libceph-common.so.2(+0x269605) [0x7f00a66bf605]
Nov 15 21:59:44 r07s08 bash[2466508]:  3: (Locker::check_inode_max_size(CInode*, bool, unsigned long, unsigned long, utime_t)+0x1328) [0x559940a11178]
Nov 15 21:59:44 r07s08 bash[2466508]:  4: (Server::handle_client_open(boost::intrusive_ptr<MDRequestImpl>&)+0xe51) [0x55994089cbe1]
Nov 15 21:59:44 r07s08 bash[2466508]:  5: (Server::handle_client_openc(boost::intrusive_ptr<MDRequestImpl>&)+0x43f) [0x55994089d89f]
Nov 15 21:59:44 r07s08 bash[2466508]:  6: (MDSContext::complete(int)+0x5f) [0x559940b5f6df]
Nov 15 21:59:44 r07s08 bash[2466508]:  7: (C_MDC_OpenRemoteDentry::finish(int)+0x3e) [0x5599409db96e]
Nov 15 21:59:44 r07s08 bash[2466508]:  8: (MDSContext::complete(int)+0x5f) [0x559940b5f6df]
Nov 15 21:59:44 r07s08 bash[2466508]:  9: (void finish_contexts<std::vector<MDSContext*, std::allocator<MDSContext*> > >(ceph::common::CephContext*, std::vector<MDSContext*, std::allocator<MDSContext*> >&, int)+0x8d) [0x55994080c85d]
Nov 15 21:59:44 r07s08 bash[2466508]:  10: (MDCache::open_ino_finish(inodeno_t, MDCache::open_ino_info_t&, int)+0x138) [0x559940988018]
Nov 15 21:59:44 r07s08 bash[2466508]:  11: (MDCache::_open_ino_traverse_dir(inodeno_t, MDCache::open_ino_info_t&, int)+0xbb) [0x55994098836b]
Nov 15 21:59:44 r07s08 bash[2466508]:  12: (MDSContext::complete(int)+0x5f) [0x559940b5f6df]
Nov 15 21:59:44 r07s08 bash[2466508]:  13: (MDSRank::_advance_queues()+0xaa) [0x55994081a29a]
Nov 15 21:59:44 r07s08 bash[2466508]:  14: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x1d8) [0x55994081adc8]
Nov 15 21:59:44 r07s08 bash[2466508]:  15: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x5c) [0x55994081b80c]
Nov 15 21:59:44 r07s08 bash[2466508]:  16: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x1bf) [0x5599408055ff]
Nov 15 21:59:44 r07s08 bash[2466508]:  17: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x478) [0x7f00a6939c88]
Nov 15 21:59:44 r07s08 bash[2466508]:  18: (DispatchQueue::entry()+0x50f) [0x7f00a69370cf]
Nov 15 21:59:44 r07s08 bash[2466508]:  19: (DispatchQueue::DispatchThread::entry()+0x11) [0x7f00a69fe8f1]
Nov 15 21:59:44 r07s08 bash[2466508]:  20: /lib64/libpthread.so.0(+0x81ca) [0x7f00a56af1ca]
Nov 15 21:59:44 r07s08 bash[2466508]:  21: clone()

The issue was fixed by evicting clients.
Please note that most of the clients are behind a floating IP.
The entire dump is attached.
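
For triage, the numbered frames of a trace like the one above can be pulled out of the journal output with a short script. This is only a sketch: the `parse_frames` helper and both regular expressions are hypothetical and assume the exact line layout shown in this report (a `bash[PID]:` journald prefix followed by `N: symbol`); other log formats would need different patterns.

```python
import re

# Assumed layout: "<journald prefix> bash[PID]:  N: <frame text>"
FRAME_RE = re.compile(r"bash\[\d+\]:\s+(\d+): (.*)$")
# Assumed layout of a resolved frame: "(<symbol>+0xOFFSET) [0xADDRESS]"
SYMBOL_RE = re.compile(r"^\((.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")


def parse_frames(lines):
    """Return (frame_number, symbol_or_raw_text) for each backtrace line."""
    frames = []
    for line in lines:
        m = FRAME_RE.search(line)
        if not m:
            continue  # not a numbered backtrace frame
        index, rest = int(m.group(1)), m.group(2).strip()
        sym = SYMBOL_RE.match(rest)
        # Keep the raw text when the frame has no "(symbol+offset) [addr]" form.
        frames.append((index, sym.group(1) if sym else rest))
    return frames


# Lines copied from the trace in this report.
sample = [
    "Nov 15 21:59:44 r07s08 bash[2466508]:  ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)",
    "Nov 15 21:59:44 r07s08 bash[2466508]:  3: (Locker::check_inode_max_size(CInode*, bool, unsigned long, unsigned long, utime_t)+0x1328) [0x559940a11178]",
    "Nov 15 21:59:44 r07s08 bash[2466508]:  4: (Server::handle_client_open(boost::intrusive_ptr<MDRequestImpl>&)+0xe51) [0x55994089cbe1]",
    "Nov 15 21:59:44 r07s08 bash[2466508]:  21: clone()",
]

frames = parse_frames(sample)
for number, symbol in frames:
    print(number, symbol)
```

Lines that are not numbered frames, such as the version banner, are skipped, so the same function can be run over the whole attached log.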


Files

mds_crush_dump_ob.log.gz (572 KB) Andrea Bolzonella, 12/06/2023 02:57 PM
fs_dump.json (9.58 KB) Andrea Bolzonella, 01/09/2024 12:15 PM
Actions #1

Updated by Milind Changire 5 months ago

  • Assignee set to Milind Changire
Actions #2

Updated by Milind Changire 4 months ago

Andrea,
could you attach the output of 'ceph fs dump --format=json'?

Actions #3

Updated by Andrea Bolzonella 4 months ago

Hi,
I have attached the dump.
It may be slightly different from what it was on November 15th.

Andrea
