Bug #893: no filesystem created if all mdses are configured for standby-replay - Ceph - Ceph

Bug #893

closed

no filesystem created if all mdses are configured for standby-replay

Added by Alexandre Oliva about 13 years ago. Updated about 13 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

Greg Farnum

Description

If the [mds] section contains:

mds standby replay = true

then, once nodes are started after mkcephfs, all of the mdses will log that they failed to create '/' because it already exists and go into standby, never moving to “creating”. Dropping this setting from a single mds will get the filesystem created properly, and then the setting can be re-enabled.
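As a rough illustration of that workaround, a ceph.conf fragment might look like the sketch below. The daemon names `mds.a` and `mds.b` are hypothetical and not from the report; only the `mds standby replay` option comes from this thread. It assumes the setting is applied per daemon rather than in the shared [mds] section.

```ini
; Sketch only: daemon names are made up for illustration.
[mds.a]
        ; "mds standby replay" is temporarily left out here so that this
        ; daemon is free to go active and create the filesystem.

[mds.b]
        mds standby replay = true
```

Once the filesystem exists, the setting can be added back to the first daemon, as described above.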

Ideally, it should not fail to create the filesystem just because all mdses are configured for standby-replay, or this should be documented and the failure mode should be a bit less confusing. It looked like a major regression to me, and I almost went back to 0.24.3 to re-create the filesystem.

Updated by Greg Farnum about 13 years ago

Hmm. Is http://ceph.newdream.net/wiki/Standby-replay_modes not clear enough?

mds standby replay

If this is set, then on startup the MDS will ask the monitor to make it a standby-replay for an active MDS. You can set this flag independently of specifying an MDS to follow; if you do so, the monitor will try to assign it to follow an MDS which has no standby-replay followers. If the monitor can't find an MDS without a follower, an MDS in this mode will remain in standby mode until the monitor finds one.

Updated by Alexandre Oliva about 13 years ago

That is quite clear. What gave me incorrect expectations is that IIRC it started the cluster successfully from a full-stop scenario, picking one of the standby-replay nodes to become active, except when the filesystem hadn't been created yet. Whether or not it's a bug, IMHO it would be desirable for a standby-replay node to take over the creation of the filesystem if no other mdses are available.

Updated by Sage Weil about 13 years ago

It sounds like the monitor needs to mark the mds as up:creating or up:starting (or up:replay) if the cluster isn't yet complete or is failed. Only if the cluster is complete should it leave the mds in standby...

Updated by Greg Farnum about 13 years ago

Well, the intention was that if you specified standby-replay that meant you didn't want it going active unless the MDS it was following died. This way you can set your most powerful machine as the main MDS, etc.

If you have multiple MDSes and want one to be active and the others to be in standby/standby-replay, you can pick one of them to be the startup node and set it as "mds standby for rank = 0" (without an "mds standby replay" setting). It will start up the cluster, and the other MDSes will be standby or standby-replay. If your initial MDS then crashes, on restart it will go into standby-replay for the new leader.
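A minimal sketch of the layout described in this comment, assuming three daemons with the hypothetical names `mds.a`, `mds.b` and `mds.c` (the names are not from the thread; the two option spellings are quoted from it):

```ini
; Sketch only: daemon names are made up for illustration.
[mds.a]
        ; Startup node: standby for rank 0, with no "mds standby replay" line.
        mds standby for rank = 0

[mds.b]
        mds standby replay = true

[mds.c]
        mds standby replay = true
```

With a layout like this, which daemon starts the cluster is decided by the config file rather than by boot order, which is the concern raised in the next paragraph.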

We could set up some kind of timeout-based thing for making a standby-replay MDS an active MDS, I suppose, but I don't want to just take the first MDS to report to the monitor as an active MDS because then the assignment of roles depends on boot order and overrides the config file...

Updated by Greg Farnum about 13 years ago

Assignee set to Greg Farnum

Updated by Greg Farnum about 13 years ago

Status changed from New to Resolved

Didn't test it, but I think 227ff6e37a2a905ebf3ded3cf6d3744d68e3f0c3 should take care of this. :)

Updated by Alexandre Oliva about 13 years ago

'fraid this didn't quite work. Just tried creating a new filesystem with 0.26. An MDS marked as standby-replay does indeed get promoted to “creating”, but it crashes before completing the job, and other MDSes don't get another shot at creating the filesystem.

Updated by Greg Farnum about 13 years ago

Can you post the backtrace? We fixed a few bugs with standby-replay in the master branch already.
