Bug #26966 (closed): nfs-ganesha: epochs out of sync in cluster

Added by Jeff Layton over 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Normal
% Done: 0%
Source: Development
Regression: No
Severity: 3 - minor
Component(FS): Client, Ganesha FSAL

Description

After starting a brand new nfs-ganesha cluster, I found the following set of recovery dbs:

$ rados -p nfs-ganesha ls
rec-0000000000000002:a
rec-0000000000000003:c
rec-0000000000000003:b
grace

$ ganesha-rados-grace dump
cur=3 rec=0
======================================================
a      
b      
c      

Two of them are at epoch 3, and the other is at epoch 2. I believe this probably happened because these hosts had no active clients, and so the last node needing a grace period cleared its flag and lifted the grace period before the first node (a) had noticed that a new grace period had started.
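For reference, the recovery object names appear to encode a 16-digit hex epoch followed by the node id, which is what makes the skew visible in the listing. A minimal sketch (not ganesha code), assuming that "rec-<hex epoch>:<nodeid>" naming scheme:

/*
 * Parse each recovery object name to recover the per-node epoch.
 * Run against the listing above, this shows node "a" still on
 * epoch 2 while "b" and "c" are on epoch 3.
 */
#include <stdio.h>
#include <inttypes.h>

int main(void)
{
    const char *names[] = {
        "rec-0000000000000002:a",
        "rec-0000000000000003:c",
        "rec-0000000000000003:b",
    };

    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        uint64_t epoch;
        char nodeid[64];

        /* epoch is a fixed-width hex field; nodeid follows the colon */
        if (sscanf(names[i], "rec-%16" SCNx64 ":%63s",
                   &epoch, nodeid) == 2)
            printf("node %s: epoch %" PRIu64 "\n", nodeid, epoch);
    }
    return 0;
}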

We may need to ensure that we can't lift the grace period until all nodes are enforcing.
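Something like the following check could express that rule. This is a minimal sketch with made-up flag names and a hypothetical in-memory node table, not the actual grace DB code: lifting the grace period would require both that no node still needs a grace period and that every node has acknowledged it is enforcing one.

#include <stdbool.h>
#include <stdint.h>

#define FLAG_NEED_GRACE 0x1  /* node still has clients to recover */
#define FLAG_ENFORCING  0x2  /* node is enforcing the current grace period */

struct node {
    const char *id;
    uint8_t flags;
};

/*
 * Only lift the grace period if no node needs it any more AND every
 * node has started enforcing it. The second condition is the missing
 * one: without it, a node with no clients can clear its flag and lift
 * grace before a slower node has joined the new epoch.
 */
static bool can_lift_grace(const struct node *nodes, int count)
{
    for (int i = 0; i < count; i++) {
        if (nodes[i].flags & FLAG_NEED_GRACE)
            return false;
        if (!(nodes[i].flags & FLAG_ENFORCING))
            return false;
    }
    return true;
}

int main(void)
{
    /* node "a" hasn't started enforcing yet, so grace must stay up */
    struct node nodes[] = {
        { "a", 0 },
        { "b", FLAG_ENFORCING },
        { "c", FLAG_ENFORCING },
    };

    return can_lift_grace(nodes, 3) ? 0 : 1;
}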
