Bug #58489 (closed)

mds stuck in 'up:replay' and crashed.

Added by Kotresh Hiremath Ravishankar over 1 year ago. Updated 10 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Community (user)
Tags: backport_processed
Backport: reef, pacific, quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This issue was reported by an upstream community user.

The cluster had two filesystems, and the active MDS of each filesystem was stuck in 'up:replay'.
This persisted for around two days. Later, one of the active MDSs (still stuck in up:replay) crashed
with the stack trace below.

/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.5/rpm/el8/BUILD/ceph-17.2.5/src/mds/journal.cc:
In function 'void EMetaBlob::replay(MDSRank*, LogSegment*,
MDPeerUpdate*)' thread 7fccc7153700 time 2023-01-17T10:05:15.420191+0000
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.5/rpm/el8/BUILD/ceph-17.2.5/src/mds/journal.cc:
1625: FAILED ceph_assert(g_conf()->mds_wipe_sessions)

  ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy
(stable)
  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x135) [0x7fccd759943f]
  2: /usr/lib64/ceph/libceph-common.so.2(+0x269605) [0x7fccd7599605]
  3: (EMetaBlob::replay(MDSRank*, LogSegment*, MDPeerUpdate*)+0x5e5c)
[0x55fb2b98e89c]
  4: (EUpdate::replay(MDSRank*)+0x40) [0x55fb2b98f5a0]
  5: (MDLog::_replay_thread()+0x9b3) [0x55fb2b915443]
  6: (MDLog::ReplayThread::entry()+0x11) [0x55fb2b5d1e31]
  7: /lib64/libpthread.so.0(+0x81ca) [0x7fccd65891ca]
  8: clone()

The upstream mailing-list discussion can be found at https://www.spinics.net/lists/ceph-users/msg75472.html
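
For context, the failed assertion ceph_assert(g_conf()->mds_wipe_sessions) fires inside EMetaBlob::replay() during journal replay. Judging by the option name, it appears to guard a session-map consistency check: if the session map version recorded in the journal entry cannot be reconciled with the table the MDS loaded from disk, replay can only continue when the operator has explicitly allowed the session table to be wiped (mds_wipe_sessions); otherwise the daemon aborts. The following is a minimal, self-contained C++ sketch of that pattern under that assumption; the types, names, and version arithmetic are simplified stand-ins, not the actual Ceph source.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

// Simplified stand-ins for the MDS session map and config
// (hypothetical names, not the real Ceph types).
struct SessionMap {
    uint64_t version = 0;
    void wipe() { version = 0; }
};

struct Config {
    // Plays the role of the mds_wipe_sessions option: allow replay to
    // discard a session map that is inconsistent with the journal.
    bool mds_wipe_sessions = false;
};

// Sketch of the replay-time consistency check the crash report points at:
// if the journaled session-map version cannot be reconciled with the
// on-disk one, replay either wipes the table (when allowed) or asserts.
void replay_sessionmap(SessionMap& sm, uint64_t journaled_v, const Config& conf) {
    if (journaled_v == sm.version + 1) {
        // Normal case: the journal entry advances the table by one version.
        sm.version = journaled_v;
    } else if (journaled_v <= sm.version) {
        // Already reflected in the table; nothing to do.
    } else {
        std::cerr << "journal replay sessionmap v " << journaled_v
                  << " > table v " << sm.version << "\n";
        // Analogue of FAILED ceph_assert(g_conf()->mds_wipe_sessions):
        // without the escape hatch, replay aborts here.
        assert(conf.mds_wipe_sessions);
        sm.wipe();
        sm.version = journaled_v;
    }
}

int main() {
    SessionMap sm;   // on-disk table is at version 0
    Config conf;     // mds_wipe_sessions left at its default (false)
    replay_sessionmap(sm, 1, conf);   // consistent entry: version advances
    replay_sessionmap(sm, 5, conf);   // inconsistent entry: assert fires, like the reported crash
}
```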


Files

mds01.ceph04.logaa.bz2 (879 KB) - Thomas Widhalm, 01/19/2023 01:15 PM
mds01.ceph04.logab.bz2 (756 KB) - Thomas Widhalm, 01/19/2023 01:15 PM
mds01.ceph06.log.bz2 (681 KB) - Thomas Widhalm, 01/19/2023 01:15 PM

Related issues 6 (1 open, 5 closed)

Related to CephFS - Bug #59768: crash: void EMetaBlob::replay(MDSRank*, LogSegment*, MDPeerUpdate*): assert(g_conf()->mds_wipe_sessions) (Duplicate, Neeraj Pratap Singh)

Related to CephFS - Bug #61009: crash: void interval_set<T, C>::erase(T, T, std::function<bool(T, T)>) [with T = inodeno_t; C = std::map]: assert(p->first <= start) (Fix Under Review, Venky Shankar)

Related to CephFS - Bug #63103: mds: disable delegating inode ranges to clients (Rejected, Venky Shankar)

Copied to CephFS - Backport #59006: quincy: mds stuck in 'up:replay' and crashed. (Resolved, Xiubo Li)
Copied to CephFS - Backport #59007: pacific: mds stuck in 'up:replay' and crashed. (Resolved, Xiubo Li)
Copied to CephFS - Backport #59404: reef: mds stuck in 'up:replay' and crashed. (Resolved, Xiubo Li)