Bug #11783
closed
protocol: flushing caps on MDS restart can go bad
Description
Not consistent, not happening on master.
http://pulpito.ceph.com/teuthology-2015-05-16_23:04:02-fs-next-testing-basic-multi/896420/
2015-05-17T15:48:28.604 INFO:tasks.workunit.client.0.plana46.stdout:-------------------
2015-05-17T15:48:28.604 INFO:tasks.workunit.client.0.plana46.stdout:../pjd-fstest-20090130-RC/tests/rename/09.t (Wstat: 0 Tests: 56 Failed: 3)
2015-05-17T15:48:28.605 INFO:tasks.workunit.client.0.plana46.stdout: Failed tests: 12-13, 15
2015-05-17T15:48:28.605 INFO:tasks.workunit.client.0.plana46.stdout:Files=191, Tests=1964, 315 wallclock secs ( 3.70 usr 3.48 sys + 5.08 cusr 7.35 csys = 19.61 CPU)
2015-05-17T15:48:28.605 INFO:tasks.workunit.client.0.plana46.stdout:Result: FAIL
Updated by Greg Farnum almost 9 years ago
Yep, this one looks unfamiliar to me. :( Do we have client logs from when it happened that we can reference?
Updated by Zheng Yan almost 9 years ago
- Status changed from New to In Progress
This is a message-ordering issue during MDS failover:
chown marks Ax dirty
client flushes and releases the Ax cap
chown sends a setattr request to the MDS
MDS fails over
client re-sends the setattr request
client sends cap_reconnect
MDS recovers
client re-sends the cap message to flush the Ax cap
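
A toy sketch of why that ordering can go bad, under one reading of the steps above: the cap flush carries the older attribute value and the setattr request the newer one, so replaying them in the opposite order after failover leaves the stale value in place. Everything here (ToyMDS, replay, the message tuples, the uid values) is hypothetical and for illustration only; it is not Ceph code and does not model the real MDS or client.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMDS:
    """Hypothetical stand-in for an MDS that applies messages in arrival order."""
    uid: int = 0
    seen: list = field(default_factory=list)

    def handle(self, msg):
        kind, uid = msg
        # Messages that carry a uid overwrite the inode's uid unconditionally.
        if uid is not None:
            self.uid = uid
        self.seen.append(kind)


def replay(messages):
    """Feed messages to a fresh ToyMDS and return it."""
    mds = ToyMDS()
    for msg in messages:
        mds.handle(msg)
    return mds


# Illustrative values only: the first chown (uid 1000) is done locally under Ax
# and flushed; the second chown (uid 2000) goes out as a setattr request.
ORIGINAL_ORDER = [("cap_flush", 1000), ("setattr", 2000)]

# Order the client re-sends things in after the MDS fails over.
AFTER_FAILOVER = [("setattr", 2000), ("cap_reconnect", None), ("cap_flush", 1000)]

if __name__ == "__main__":
    print("original order  ->", replay(ORIGINAL_ORDER).uid)
    print("after failover  ->", replay(AFTER_FAILOVER).uid)
```

In this sketch the original order ends with the newer uid, while the post-failover order ends with the older one because the re-sent cap flush arrives last.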
Updated by Greg Farnum almost 9 years ago
- Subject changed from cfuse_workunit_suites_pjd failure on next to protocol: flushing caps on MDS restart can go bad
- Assignee set to Zheng Yan
Updated by Zheng Yan over 8 years ago
Updated by Greg Farnum over 8 years ago
- Status changed from In Progress to Fix Under Review
Updated by Greg Farnum over 8 years ago
- Status changed from Fix Under Review to 7
- Priority changed from Normal to High
I merged this after checking manually that it works, but the test isn't behaving properly so I haven't merged that yet. In http://pulpito.ceph.com/ubuntu-2015-09-22_15:56:36-fs-greg-fs-testing---basic-multi/1064720/ we have an example of the failure injection either not being set or not being picked up by the client.
However, I'm not super-comfortable without that test coverage in-tree, so let's try and figure it out quickly.
Updated by Greg Farnum over 8 years ago
I guess I should note that I only saw this once, and that run also included a (slightly outdated) version of the vstart runner branch, so there could be some interplay going on there if it broke conf file handling.