Bug #1435

closed

mds: loss of layout policies upon mds restart

Added by Alexandre Oliva over 12 years ago. Updated about 11 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version: -
% Done: 0%


Description

Cluster running ceph 0.33 + patch to add support for “ceph mds add_data_pool”.

I set up layout policies for various directories (renamed for privacy) on an otherwise empty cluster:

cephfs a set_layout -u 4194304 -s 4194304 -c 1 -o -1 -p 3
cephfs b set_layout -u 4194304 -s 4194304 -c 1 -o -1 -p 3
cephfs l/c set_layout -u 4194304 -s 4194304 -c 1 -o -1 -p 3
cephfs l/d set_layout -u 4194304 -s 4194304 -c 1 -o -1 -p 4
cephfs l/e set_layout -u 4194304 -s 4194304 -c 1 -o -1 -p 5
cephfs l/f set_layout -u 4194304 -s 4194304 -c 1 -o -1 -p 6
cephfs g set_layout -u 4194304 -s 4194304 -c 1 -o -1 -p 7
cephfs h set_layout -u 4194304 -s 4194304 -c 1 -o -1 -p 8
cephfs h/i set_layout -u 4194304 -s 4194304 -c 1 -o -1 -p 9
cephfs l/j set_layout -u 4194304 -s 4194304 -c 1 -o -1 -p 10
cephfs l/k set_layout -u 4194304 -s 4194304 -c 1 -o -1 -p 10
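
The commands above differ only in directory and pool id, so the whole set can be regenerated from a small table, which helps when the policies have to be reinstated. A minimal sketch, assuming a POSIX shell and directory names without spaces; `POLICIES`, `layout_cmd`, `apply_policies`, and the `DRY_RUN` switch are hypothetical names introduced for illustration, and only the `cephfs ... set_layout` invocation itself is taken from the commands above:

```shell
#!/bin/sh
# Directory:pool pairs copied from the set_layout commands in this report.
POLICIES="a:3 b:3 l/c:3 l/d:4 l/e:5 l/f:6 g:7 h:8 h/i:9 l/j:10 l/k:10"

# Hypothetical helper: print the set_layout command for one directory.
# The striping parameters are the ones used in the report; only the
# directory ($1) and the pool id ($2) vary.
layout_cmd() {
    echo "cephfs $1 set_layout -u 4194304 -s 4194304 -c 1 -o -1 -p $2"
}

# Print every command; with DRY_RUN=0 also execute it.
# The default (DRY_RUN unset or 1) only prints, so nothing is changed by accident.
apply_policies() {
    for entry in $POLICIES; do
        dir=${entry%:*}      # text before the last ':'
        pool=${entry##*:}    # text after the last ':'
        cmd=$(layout_cmd "$dir" "$pool")
        echo "$cmd"
        if [ "${DRY_RUN:-1}" = "0" ]; then
            $cmd             # relies on word splitting; assumes no spaces in paths
        fi
    done
}

apply_policies
```

Run as-is, this only prints the eleven commands; setting `DRY_RUN=0` from within the mounted file system would execute them.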

The cluster has 3 mons, 3 mdses, and 3 osds, but one osd is out and only one mds is enabled.

After setting up the crushmap and the policies as above, I ran show_layout for each of the directories, and the reported layouts were correct.

I started rsyncing, and each file got placed in the correct pool. However, when rsync started uploading to one of the directories above, the files went to the “data” pool (number 0) instead. At that point, show_layout showed that most (all? not sure) directories had lost their policies.

After I reinstated the policies and cleaned up the misplaced files, new uploads went to the right place again.

At some point I had to restart one of the servers, which hosted the active mds and one of the active osds. So I turned on another mds, and when it took over, the policies were (partially?) gone. Switching back to the originally active mds after its restart also caused (partial?) loss of policies. It seems that at every mds change, the layout policies of directories are lost on all clients.
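
One way to pin down whether the loss is partial or total is to record the show_layout output for every directory before an mds change and compare it afterwards. A minimal sketch, assuming a POSIX shell run from the root of the mounted file system; `DIRS`, `CEPHFS`, `snapshot_layouts`, and `layouts_unchanged` are hypothetical names introduced here, and the sketch deliberately does not parse the show_layout output, it only compares it verbatim:

```shell
#!/bin/sh
# Directories that carry a layout policy in this report.
DIRS="a b l/c l/d l/e l/f g h h/i l/j l/k"

# Command used to query layouts; overridable so the sketch can be tried
# without a cluster (an assumption for illustration, not a cephfs feature).
CEPHFS=${CEPHFS:-cephfs}

# Print a "== dir" header followed by that directory's show_layout output.
snapshot_layouts() {
    for dir in $DIRS; do
        echo "== $dir"
        $CEPHFS "$dir" show_layout 2>&1
    done
}

# Compare two snapshot files; exit status 0 means nothing changed.
layouts_unchanged() {
    diff "$1" "$2" >/dev/null
}
```

For example, run `snapshot_layouts > before.txt` ahead of the mds switch and `snapshot_layouts > after.txt` once the new mds is active; `diff before.txt after.txt` then lists exactly which directories changed.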


Files

LOCAL-verbose-track-layout.patch (13.6 KB): patch that logs layout statuses and changes. Alexandre Oliva, 02/16/2013 12:17 PM
cdir-fetch-recover-dir-layout.patch (925 Bytes): patch that seems to fix the bug. Alexandre Oliva, 02/16/2013 06:31 PM