Bug #718
Closed
sync hangs
% Done:
0%
Description
There were a few reports, and I'm also seeing it on current unstable. Haven't checked other branches yet.
In one case, reproducing it was as simple as echo foo > mnt/foo ; sync, although the same sequence later worked. This time it was a partial kernel untar, control-z, then sync.
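A rough way to script the simple reproducer and notice the hang without blocking the shell. This is only a sketch: sync_completes, timeout_s and sync_cmd are made-up names, and the timeout is an arbitrary assumption about how long a healthy sync should take. The child is deliberately not killed on timeout, since a sync stuck in the kernel may not respond to signals.

```python
import os
import subprocess


def sync_completes(mount_dir, timeout_s=60.0, sync_cmd=("sync",)):
    """Write a small file under mount_dir, then run sync with a deadline.

    Returns True if the sync command exits successfully within timeout_s
    seconds, False if it is still running (a likely hang) or fails.
    """
    # Same step as "echo foo > mnt/foo" in the report.
    path = os.path.join(mount_dir, "foo")
    with open(path, "w") as f:
        f.write("foo\n")

    # Run sync as a child so a hang does not block this process.
    proc = subprocess.Popen(list(sync_cmd))
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Leave the child alone; it may be stuck in uninterruptible sleep.
        return False
    return proc.returncode == 0
```

With the client mounted at mnt as in the report, sync_completes("mnt") returning False would suggest the hang was hit. Since the simple case did not reproduce reliably, a True result does not prove the bug is absent.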
Updated by Sage Weil over 13 years ago
on v2.6.37,
[ 802.320000] libceph: osd_client.c:1421 : sync done (thru tid 7008)
[ 802.320000] ceph: mds_client.c:3098 : sync
[ 802.320000] ceph: mds_client.c:3103 : sync want tid 16370 flush_seq 8705
[ 802.320000] ceph: caps.c:2886 : flush_dirty_caps
[ 802.320000] ceph: mds_client.c:3055 : wait_unsafe_requests want 16370
[ 802.320000] ceph: mds_client.c:3088 : wait_unsafe_requests done
[ 802.320000] ceph: mds_client.c:1249 : check_cap_flush want 8705
[ 802.320000] ceph: mds_client.c:310 : mdsc get_session 000000008076a7f0 1 -> 2
[ 802.320000] ceph: mds_client.c:1272 : check_cap_flush still flushing 000000007b0d3d40 seq 2415 <= 8705 to mds0
[ 802.320000] ceph: mds_client.c:321 : mdsc put_session 000000008076a7f0 2 -> 1
[ 802.320000] ceph: mds_client.c:1249 : check_cap_flush want 8705
[ 802.320000] ceph: mds_client.c:310 : mdsc get_session 000000008076a7f0 1 -> 2
[ 802.320000] ceph: mds_client.c:1272 : check_cap_flush still flushing 000000007b0d3d40 seq 2415 <= 8705 to mds0
... and then no further mention of check_cap_flush.
So the problem appears to be with the mds cap flushing... and is either server-side or present in 2.6.37.
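For anyone reading the log, here is a toy model of the condition that check_cap_flush appears to be testing, inferred only from the "still flushing ... seq 2415 <= 8705" lines and not taken from mds_client.c. Session, flushing_seqs and cap_flush_done are hypothetical names for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Session:
    """Hypothetical stand-in for one MDS session on the client."""
    mds: int
    # Flush sequence numbers of caps whose flush is still outstanding.
    flushing_seqs: List[int] = field(default_factory=list)


def cap_flush_done(sessions, want_flush_seq):
    """Return True once no session has a flush at or below want_flush_seq."""
    for s in sessions:
        if s.flushing_seqs and min(s.flushing_seqs) <= want_flush_seq:
            # Matches the log: "still flushing ... seq 2415 <= 8705 to mds0"
            return False
    return True


# Values from the log: sync wants flush_seq 8705, mds0 still has seq 2415.
mds0 = Session(mds=0, flushing_seqs=[2415])
print(cap_flush_done([mds0], 8705))  # False: sync keeps waiting

# If the outstanding flush is ever completed, the wait can finish.
mds0.flushing_seqs.remove(2415)
print(cap_flush_done([mds0], 8705))  # True
```

Under this model, sync would wait indefinitely if the entry with seq 2415 is never completed and removed, which would be consistent with the hang. Why that entry stays outstanding is not visible in the log.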
Updated by Sage Weil over 13 years ago
- Category set to Clustered MDS
- Status changed from New to Resolved
- Assignee set to Sage Weil
Fixed in the unstable branch, commit:55ee8fe37598475ed363d078cb50d19e0524c69f 'ceph: fix flushing of caps vs cap import'