Bug #4894

mds: standby shut itself down due to not having any data

Added by Greg Farnum almost 11 years ago. Updated over 7 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2013-05-02 02:38:31.094956 7f0496fce700  1 mds.0.2 rejoin_done
2013-05-02 02:38:31.094958 7f0496fce700 10 mds.0.cache show_subtrees - no subtrees
2013-05-02 02:38:31.094963 7f0496fce700  7 mds.0.cache show_cache
2013-05-02 02:38:31.094965 7f0496fce700  7 mds.0.cache  unlinked [inode 1 [...2,head] / auth v1 snaprealm=0x2f07000 f(v0 1=0+1) n(v0 1=0+1) (iversion lock) 0x2f14860]
2013-05-02 02:38:31.094971 7f0496fce700  7 mds.0.cache  unlinked [inode 100 [...2,head] ~mds0/ auth v1 snaprealm=0x2f076c0 f(v0 11=1+10) n(v0 11=1+10) (iversion lock) 0x2f14000]
2013-05-02 02:38:31.094978 7f0496fce700  1 mds.0.2  empty cache, no subtrees, leaving cluster
2013-05-02 02:38:31.094979 7f0496fce700  3 mds.0.2 request_state down:stopped
2013-05-02 02:38:31.094981 7f0496fce700 10 mds.0.2 beacon_send down:stopped seq 13 (currently up:rejoin)

I'm in the process of pulling logs into /a/teuthology-2013-05-02_01:00:52-fs-next-testing-basic/5452, but they are sadly going to be incomplete — I have the standby, the OSDs (dunno what's on them), and the monitor logs, but the active MDS log is totally empty.

Associated revisions

Revision 4fd34bef (diff)
Added by Sage Weil over 10 years ago

mds: create only one ESubtreeMap during fs creation

Previously we would create an empty ESubtreeMap when we opened the log
segment and then immediately journal a second one that created the root
and mdsdir. More importantly, for the second ESubtreeMap, we would not
wait for it to commit before requesting the ACTIVE state, leading to
#4894.

Instead, break start_new_segment() into two steps: one that creates the
in-memory LogSegment tracking structure, and one that journals the
ESubtreeMap. Open things early and write the (one) ESubtreeMap at the
end of boot_create(), and then wait for it.

Fixes: #4894
Signed-off-by: Sage Weil <>
Reviewed-by: Yan, Zheng <>
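
For readers following the commit message, here is a rough toy model of the ordering problem it describes. It is a sketch under stated assumptions, not the Ceph code: ToyJournal, boot_create_old, boot_create_new and standby_view are hypothetical names, "flush" stands in for waiting on a journal commit, and whether the first (empty) ESubtreeMap was durable in the old path is an assumption made only for illustration.

```python
class ToyJournal:
    """Hypothetical journal: entries are pending until flushed to 'disk'."""

    def __init__(self):
        self.pending = []   # submitted but not yet committed
        self.durable = []   # committed; all a standby can replay

    def submit(self, entry):
        self.pending.append(entry)

    def flush(self):
        # Stand-in for "wait for the journal write to commit".
        self.durable.extend(self.pending)
        self.pending.clear()


def boot_create_old(journal):
    # Old behaviour as described: an empty ESubtreeMap when the segment opens
    # (assumed durable here), then a second one that is not waited for.
    journal.submit(("ESubtreeMap", ()))
    journal.flush()
    journal.submit(("ESubtreeMap", ("/", "~mds0/")))
    return "active"  # ACTIVE requested before the second map commits


def boot_create_new(journal):
    # New behaviour as described: step 1 only sets up in-memory tracking,
    # step 2 journals the single ESubtreeMap at the end and waits for it.
    segment = {"events": []}          # in-memory LogSegment stand-in
    entry = ("ESubtreeMap", ("/", "~mds0/"))
    segment["events"].append(entry)
    journal.submit(entry)
    journal.flush()
    return "active"


def standby_view(journal):
    """Subtrees a standby would recover by replaying only durable entries."""
    subtrees = ()
    for kind, payload in journal.durable:
        if kind == "ESubtreeMap":
            subtrees = payload
    return subtrees
```

In this model, a standby taking over right after boot_create_old() replays only the empty map and ends up with no subtrees, which matches the "empty cache, no subtrees, leaving cluster" line in the description; after boot_create_new() it recovers both the root and the mdsdir.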

History

#1 Updated by Zheng Yan almost 11 years ago

I think MDS::boot_create() should start a new log segment after creating the fs hierarchy.

#2 Updated by Anonymous almost 11 years ago

  • Priority changed from Normal to High

#3 Updated by Greg Farnum almost 11 years ago

You must be racing ahead of me here, Yan — what's your theory? Just that the first active MDS failed to write any log out to disk prior to reporting as active to the monitors?

#4 Updated by Zheng Yan almost 11 years ago

MDS::boot_create() first starts a new log segment (its ESubtreeMap is empty), then uses MDCache::create_empty_hierarchy() to create the subtree dir fragment /. The problem is that we don't have a way to express subtree creation in the MDS log, so the MDS has no subtrees after replaying the log. I think the simplest fix is to start another log segment after creating the dir fragment /. The second log segment's ESubtreeMap will record the newly created subtrees.
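
To make the theory above concrete, here is a minimal sketch of the replay behaviour being described. It is a toy model, not MDS code: boot_create and replay here are hypothetical Python functions, the journal is a plain list, and the subtree names "/" and "~mds0/" are taken from the log excerpt in the description.

```python
def boot_create(journal_second_map):
    """Return the toy journal a freshly created MDS would leave behind."""
    journal = []
    subtrees = set()  # in-memory subtree roots

    # First log segment: its ESubtreeMap is written while nothing exists yet.
    journal.append(("ESubtreeMap", frozenset(subtrees)))

    # Stand-in for create_empty_hierarchy(): changes memory only,
    # no journal event records the new subtrees.
    subtrees.update({"/", "~mds0/"})

    if journal_second_map:
        # Proposed fix: a second segment whose ESubtreeMap records them.
        journal.append(("ESubtreeMap", frozenset(subtrees)))
    return journal


def replay(journal):
    """Rebuild the subtree set the way a replaying MDS would in this model."""
    subtrees = frozenset()
    for kind, payload in journal:
        if kind == "ESubtreeMap":
            subtrees = payload  # the latest map wins
    return subtrees
```

With journal_second_map=False, replay() returns an empty set, which corresponds to the "no subtrees" state in the log; with journal_second_map=True it returns both subtrees.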

#5 Updated by Sage Weil over 10 years ago

  • Status changed from New to Fix Under Review

wip-4894

saw this again in ubuntu@teuthology:/a/teuthology-2013-08-15_20:01:04-fs-cuttlefish-testing-basic-plana/108749

#6 Updated by Sage Weil over 10 years ago

  • Status changed from Fix Under Review to Resolved

#7 Updated by Greg Farnum over 7 years ago

  • Component(FS) MDS added
