Bug #4619: mds: anchortable hangs on new cluster - CephFS - Ceph

Added by Sage Weil about 11 years ago. Updated about 11 years ago.

Status: Resolved
Priority: High
Source: Q/A
Severity: 3 - minor

Updated by Sage Weil about 11 years ago

Status changed from New to Fix Under Review

wip-4619

Updated by Greg Farnum about 11 years ago

Code looks good, assuming the tests run.

Sorry about that! :(

Updated by Zheng Yan about 11 years ago

I think this isn't correct. If we restart the table server MDS, it will send two ready messages to the table client. One by MDS::handle_mds_recovery(), one by MDS::recovery_done(). I think it's better to call MDS::recovery_done() when bringing up a fresh cluster.

Updated by Sage Weil about 11 years ago

Project changed from Ceph to CephFS
Category deleted (1)
Status changed from Fix Under Review to Resolved

commit:968c6c0c9408b33904041e5ddbd9ea738e831713

Updated by Greg Farnum about 11 years ago

Status changed from Resolved to In Progress

Sage said he'd look at the double-send as well.

Updated by Greg Farnum about 11 years ago

Priority changed from Urgent to High

Updated by Sage Weil about 11 years ago

Status changed from In Progress to Fix Under Review

recovery_done() breaks on a fresh machine because of the populate_mydir() ordering. The problem is that both recovery_done() and handle_mds_recovery(who) will catch this case, since recovery_done() sends to everyone who is active.

I think a simpler fix is to handle the create/start case separately in boot_create(), where there is also handling for the tableservers. See wip-4619.
