Bug #1063
dbench breaks if MDS and client times aren't synced
Status: Closed
% Done: 0%
Description
http://autotest.ceph.newdream.net/afe/#tab_id=view_job&object_id=554
one mds, one osd, cfuse
dbench never completes; it just keeps printing "cleanup NNN sec" with increasing numbers, and manually poking at the node hangs as soon as you touch the mountpoint.
"ceph health" and "ceph -s" looked normal.
Feel free to clone the above autotest job to get a running instance of this, it seems to fail reliably.
Updated by Sage Weil almost 13 years ago
- Assignee set to Sage Weil
- Priority changed from Normal to High
this is probably a kclient thing.. testing against latest for-linus
Updated by Anonymous almost 13 years ago
Note that the test I ran was on cfuse (most likely because I had kclient trouble, and wanted to isolate that out).
I just cloned the job to http://autotest.ceph.newdream.net/afe/#tab_id=view_job&object_id=555 and that one uses kclient. Results in ~20 minutes.
Updated by Anonymous almost 13 years ago
Job 555 broke, here's a re-run: http://autotest.ceph.newdream.net/afe/#tab_id=view_job&object_id=556
And that confirms that kclient works just fine.
Though it's interesting to see that the "cleanup" phase of dbench took 606 seconds ≈ 10 minutes. On cfuse, I aborted after 744 seconds of "cleanup", after seeing an ls hang. So maybe both clients are problematic and cfuse is just way slower. I'll rerun cfuse and let it sit for a long time.
kclient dbench log:
http://autotest.ceph.newdream.net/results/556-tv/group0/sepia60.ceph.dreamhost.com/debug/client.0.DEBUG
cfuse dbench log:
http://autotest.ceph.newdream.net/results/554-tv/group0/sepia63.ceph.dreamhost.com/ceph_dbench.cluster0/debug/ceph_dbench.cluster0.DEBUG
Updated by Anonymous almost 13 years ago
My bad, the cleanup phase starts at 600 seconds, so kclient only had a few seconds of cleanup.
The cfuse re-run is at http://autotest.ceph.newdream.net/afe/#tab_id=view_job&object_id=560
Updated by Sage Weil almost 13 years ago
- Assignee changed from Sage Weil to Greg Farnum
- Target version set to v0.27.1
Updated by Anonymous almost 13 years ago
Job 560 has spent 1.5 hours in cleanup now, aborting.
14:40:26 DEBUG| [stdout] 2 19 0.00 MB/sec cleanup 6288 sec
Updated by Greg Farnum almost 13 years ago
I'm unable to reproduce this on my own machine, and after looking through the mds logs from autotest everything looks good. I'll need a debug log from the client to diagnose this, but it's not clear to me where or how to turn that on with our autotest system.
Updated by Greg Farnum almost 13 years ago
- Status changed from New to In Progress
Ran this with client debugging enabled (job 573). Not certain this is the problem, but it looks like the problem is the client is marking a few inodes' caps dirty and then never sending them to the MDS. I certainly can't find anything else wrong! (no hanging requests, no waiting for max_size to get updated...)
Though I don't know why keeping some caps dirty would break anything either. :/
Updated by Sage Weil almost 13 years ago
- Target version changed from v0.27.1 to v0.29
- Position set to 4
Updated by Sage Weil almost 13 years ago
- Story points set to 2
- Position deleted (4)
- Position set to 4
Updated by Greg Farnum almost 13 years ago
Okay, hopefully we can rerun this with a time-synced cluster soon and see if that's what is causing the breakage.
I'm seeing a few problems here as well, but not a root cause that would explain what's starting the client down the wrong path in the first place. Most specifically:
It's possible to enter check_caps with an inode popped off the delayed_caps list and then not send the cap update to the MDS. This effectively "loses" the dirtied cap.
There are other things about those checks that scare me. Nothing looks at dirty_caps until you're already in the process of sending, and while the checks may be valid if all invariants hold, they're obviously fragile.
Updated by Greg Farnum almost 13 years ago
- Subject changed from ceph_dbench autotest never completes to dbench breaks if MDS and client times aren't synced
Well, job 576 completed successfully after TV time-synced the cluster. It looks like bad mtimes are somehow causing the problem, and the uclient is sensitive to them.
I've got that possible fix in my tree but I've discovered a regression in fsstress that I'd like to locate the cause of before I push it, in case they're related.
Updated by Greg Farnum almost 13 years ago
On the other hand, adding a clock skew option and setting the MDS into the future doesn't let me reproduce the brokenness when running locally.
Updated by Greg Farnum almost 13 years ago
Scratch that, I did manage to reproduce locally. It just took a bit longer.
Updated by Greg Farnum almost 13 years ago
- Status changed from In Progress to Can't reproduce
I won't be surprised if this comes back again, but I can't reproduce it, and there have been several fixes for client caps, etc. in the meantime.
Updated by John Spray over 7 years ago
- Project changed from Ceph to CephFS
- Category deleted (11)
- Target version deleted (v0.29)
Bulk updating project=ceph category=ceph-fuse issues to move to fs project so that we can remove the ceph-fuse category from the ceph project