Bug #52094 (closed)

Tried out Quincy: All MDS Standby

Added by Joshua West almost 3 years ago. Updated over 2 years ago.

Status: Duplicate
Priority: Normal
Category: Testing
% Done: 0%
Source: Development
Tags: mds standby, quincy
Severity: 3 - minor
Component(FS): MDS, MDSMonitor, mgr/mds_autoscaler

Description

Running on Proxmox and suffering from #51445 (https://tracker.ceph.com/issues/51445), I did what any good "knows enough to be dangerous" sort of person would do: I attempted to apply https://github.com/ceph/ceph/pull/42345, and foolishly learned a lesson about accidentally upgrading oneself to the dev branch.


ceph version 17.0.0-6673-g313be835f7a (313be835f7a5eb5b2e43365d044bf20fd3fd1b2d) quincy (dev)
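
(For anyone retracing this: the version string above is what the standard CLI reports. A minimal sketch of the commands, assuming a working admin keyring:)

    # Report the version of the local ceph binary
    ceph -v
    # Report the versions of every running daemon across the cluster,
    # useful for spotting a partially upgraded cluster
    ceph versions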

For the most part, things went smoothly, but I believe I may have identified a bug: none of my four MDS daemons come out of standby.
In a perfect world, I would get past the bug(?), or figure out how to revert to Pacific without the mons refusing to start due to "changes to the on disk structure".
health: HEALTH_ERR
        1 filesystem is degraded
        1 filesystem has a failed mds daemon
        2 large omap objects
        1 filesystem is offline
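
(That summary is the standard status output. For reference, a sketch of the commands that show it and expand on it:)

    # Cluster-wide status summary, including the health block above
    ceph -s
    # Expanded explanation of each failing health check
    ceph health detail
    # Filesystem-level view; here every rank is failed and all MDS sit in up:standby
    ceph fs status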

/var/log/ceph/ceph-mon.server.log:flags 32 joinable allow_snaps allow_multimds_snaps allow_standby_replay
/var/log/ceph/ceph-mon.server.log:compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
/var/log/ceph/ceph-mon.server.log:max_mds 1
/var/log/ceph/ceph-mon.server.log:[mds.rd240{-1:136865003} state up:standby seq 1 addr [v2:192.168.2.20:6864/2702233770,v1:192.168.2.20:6865/2702233770] compat {c=[1],r=[1],i=[7ff]}]
/var/log/ceph/ceph-mon.server.log:[mds.server{ffffffff:8337f9f} state up:standby seq 1 addr [v2:192.168.2.2:1a90/7b467d7,v1:192.168.2.2:1a91/7b467d7] compat {c=[1],r=[1],i=[7ff]}]
/var/log/ceph/ceph-mon.server.log:[mds.dl380g7{ffffffff:8338b3b} state up:standby seq 1 addr [v2:192.168.2.4:1a90/514c93ee,v1:192.168.2.4:1a91/514c93ee] compat {c=[1],r=[1],i=[7ff]}]
/var/log/ceph/ceph-mon.server.log:[mds.rog{ffffffff:8338ebf} state up:standby seq 1 addr [v2:192.168.2.6:1a90/5b0fbd32,v1:192.168.2.6:1a91/5b0fbd32] compat {c=[1],r=[1],i=[7ff]}]
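
(Those lines look grepped from the mon log; the same FSMap information can also be pulled live. A sketch, where the filesystem name "cephfs" is an assumption, not taken from this report:)

    # Dump the full FSMap: fs flags, max_mds, required compat,
    # and the compat set advertised by each standby MDS
    ceph fs dump
    # One-line summary of MDS states (here: 4 up:standby, 0 active)
    ceph mds stat
    # Confirm the fs is flagged joinable so standbys may take ranks
    # ("cephfs" is assumed; substitute the real filesystem name)
    ceph fs get cephfs | grep -E 'joinable|max_mds'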

MDS logs are attached for review. This is the first issue I've opened (at any tracker), so please let me know if I've missed key details or anything I should have included (my apologies if that's the case).
Marked as minor because this is Quincy, where issues are expected during development.
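
(In case more verbose logs are wanted: a sketch of how MDS debug logging is usually raised at runtime; the levels are illustrative, not necessarily what was used for the attached files:)

    # Raise MDS and messenger debug levels cluster-wide (very chatty)
    ceph config set mds debug_mds 20
    ceph config set mds debug_ms 1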


Files

ceph-mds.rog.log (854 KB), Joshua West, 08/07/2021 12:05 AM
ceph-mds.rog.log2.gz (101 KB), Joshua West, 08/07/2021 10:41 AM

Related issues: 1 (0 open, 1 closed)

Is duplicate of CephFS - Bug #52975: MDSMonitor: no active MDS after cluster deployment (Resolved, Patrick Donnelly)
