Bug #52094 (closed)

Tried out Quincy: All MDS Standby

Added by Joshua West over 2 years ago. Updated over 2 years ago.

Status: Duplicate
Priority: Normal
Category: Testing
Target version:
% Done: 0%
Source: Development
Tags: mds standby, quincy
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS, MDSMonitor, mgr/mds_autoscaler
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I'm running Ceph on Proxmox and have been suffering from #51445 (https://tracker.ceph.com/issues/51445).
Being the sort of person who knows just enough to be dangerous, I attempted to apply https://github.com/ceph/ceph/pull/42345 myself, and in the process learned a lesson about accidentally upgrading oneself to the dev branch.


ceph version 17.0.0-6673-g313be835f7a (313be835f7a5eb5b2e43365d044bf20fd3fd1b2d) quincy (dev)

For the most part things went smoothly, but I believe I may have identified a bug: none of my four MDS daemons come out of standby.
In a perfect world I would either get past the bug(?) or figure out how to revert to Pacific without the mons refusing to start due to "changes to the on disk structure".
health: HEALTH_ERR
        1 filesystem is degraded
        1 filesystem has a failed mds daemon
        2 large omap objects
        1 filesystem is offline
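In case it helps with reproduction, the summary above is what I'd expect the standard status commands to show (nothing custom on my end; listing them here as a rough sketch rather than a verbatim capture):

    ceph -s               # cluster summary, including the HEALTH_ERR lines above
    ceph health detail    # expands each health warning/error individually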

/var/log/ceph/ceph-mon.server.log:flags 32 joinable allow_snaps allow_multimds_snaps allow_standby_replay
/var/log/ceph/ceph-mon.server.log:compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
/var/log/ceph/ceph-mon.server.log:max_mds 1
/var/log/ceph/ceph-mon.server.log:[mds.rd240{-1:136865003} state up:standby seq 1 addr [v2:192.168.2.20:6864/2702233770,v1:192.168.2.20:6865/2702233770] compat {c=[1],r=[1],i=[7ff]}]
/var/log/ceph/ceph-mon.server.log:[mds.server{ffffffff:8337f9f} state up:standby seq 1 addr [v2:192.168.2.2:1a90/7b467d7,v1:192.168.2.2:1a91/7b467d7] compat {c=[1],r=[1],i=[7ff]}]
/var/log/ceph/ceph-mon.server.log:[mds.dl380g7{ffffffff:8338b3b} state up:standby seq 1 addr [v2:192.168.2.4:1a90/514c93ee,v1:192.168.2.4:1a91/514c93ee] compat {c=[1],r=[1],i=[7ff]}]
/var/log/ceph/ceph-mon.server.log:[mds.rog{ffffffff:8338ebf} state up:standby seq 1 addr [v2:192.168.2.6:1a90/5b0fbd32,v1:192.168.2.6:1a91/5b0fbd32] compat {c=[1],r=[1],i=[7ff]}]
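Those lines came from grepping the mon log; the exact pattern below is approximate (reconstructed from memory), but something like this should reproduce the filename-prefixed output:

    grep -H -E 'flags |compat |max_mds |up:standby' /var/log/ceph/ceph-mon.server.log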

MDS logs are attached for review. This is the first issue I've opened at any tracker, so please let me know if I've missed key details or anything else I should have included (my apologies if that's the case).
I've marked this as minor severity since it's Quincy, where issues are expected during development.
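For completeness, these are the standard commands I'd use to capture the filesystem and MDS state, in case the attached logs aren't enough on their own:

    ceph fs status    # per-filesystem summary with ranks and standby counts
    ceph mds stat     # one-line fsmap summary
    ceph fs dump      # full fsmap, including the standby daemon list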


Files

ceph-mds.rog.log (854 KB), Joshua West, 08/07/2021 12:05 AM
ceph-mds.rog.log2.gz (101 KB), Joshua West, 08/07/2021 10:41 AM

Related issues: 1 (0 open, 1 closed)

Is duplicate of CephFS - Bug #52975: MDSMonitor: no active MDS after cluster deployment (Resolved, Patrick Donnelly)

Actions #1

Updated by Joshua West over 2 years ago


e1095528
enable_multiple, ever_enabled_multiple: 1,1
default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 7

Filesystem 'cephfs' (7)
fs_name    cephfs
epoch    1095528
flags    32 joinable allow_snaps allow_multimds_snaps allow_standby_replay
created    2020-10-27T10:52:01.629171-0600
modified    2021-08-08T09:34:02.509077-0600
tableserver    0
root    0
session_timeout    60
session_autoclose    300
max_file_size    5000000000000
required_client_features    {8=mimic}
last_failure    0
last_failure_osd_epoch    1093846
compat    compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds    2
in    0,1
up    {}
failed    0,1
damaged    
stopped    2
data_pools    [21]
metadata_pool    22
inline_data    disabled
balancer    
standby_count_wanted    1

Standby daemons:

[mds.server{-1:138437405} state up:standby seq 1 join_fscid=7 addr [v2:192.168.2.2:6808/2491038724,v1:192.168.2.2:6809/2491038724] compat {c=[1],r=[1],i=[7ff]}]
[mds.rog{ffffffff:84087bb} state up:standby seq 1 join_fscid=7 addr [v2:192.168.2.6:1a90/d2617aac,v1:192.168.2.6:1a91/d2617aac] compat {c=[1],r=[1],i=[7ff]}]
[mds.dl380g7{ffffffff:84087ef} state up:standby seq 1 join_fscid=7 addr [v2:192.168.2.4:1ac8/abd490c,v1:192.168.2.4:1ac9/abd490c] compat {c=[1],r=[1],i=[7ff]}]
[mds.rd240{ffffffff:840883b} state up:standby seq 1 join_fscid=7 addr [v2:192.168.2.20:1a90/2cf0e9ea,v1:192.168.2.20:1a91/2cf0e9ea] compat {c=[1],r=[1],i=[7ff]}]
dumped fsmap epoch 1095528
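For what it's worth, the dump above (which should match plain ceph fs dump output) shows ranks 0 and 1 both in the failed set with up {} empty, while all four daemons sit in up:standby with join_fscid=7 for this filesystem, so none of them is being promoted. If the compat sets turn out to be relevant, a quick way to pull just those fields out for comparison would be something like:

    ceph fs dump | grep -i compat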

Actions #2

Updated by Joshua West over 2 years ago

Hmm, I'm not sure what typical turnaround times are, but neither the mailing list, a Proxmox forum post, nor this ticket has received a response.

I suspect I may be coming off poorly, and admit I've made some mistakes leading to this point.

Is there a better way to figure this out than the methods I've already tried?

Joshua West wrote:

[...]
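If more detail from the daemons would help, I'm happy to bump debug levels and re-capture logs; I assume something along these lines would do it (standard debug options, exact levels open to suggestion):

    ceph config set mds debug_mds 20
    ceph config set mds debug_ms 1
    ceph config set mon debug_mon 20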

Actions #3

Updated by Patrick Donnelly over 2 years ago

  • Assignee set to Patrick Donnelly
Actions #4

Updated by Patrick Donnelly over 2 years ago

  • Is duplicate of Bug #52975: MDSMonitor: no active MDS after cluster deployment added
Actions #5

Updated by Patrick Donnelly over 2 years ago

  • Status changed from New to Duplicate
