Bug #1243

inest lock blocks dir create for a long time

Added by Greg Farnum almost 13 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
% Done: 0%

Description

From the mailing list:
Steps to reproduce:
We mount Ceph on /mnt/test/, then create the directory "/mnt/test/a/b/".
1) In dir "b", use "seq 3000 | xargs -i mkdir {}" to create 3000 directories.
2) While 1) is still running, make a directory "c" in "a".
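
A consolidated reproduction of the steps above, as a minimal C++ sketch (the mount point and directory names come from the report; the fork/sleep timing is an assumption to make the two steps race):

#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>
#include <string>

int main() {
  // prelude: CephFS is assumed mounted on /mnt/test/
  mkdir("/mnt/test/a", 0755);
  mkdir("/mnt/test/a/b", 0755);

  pid_t pid = fork();
  if (pid == 0) {
    // step 1: 3000 dirs under "b" (shell equivalent: seq 3000 | xargs -i mkdir {})
    for (int i = 1; i <= 3000; ++i)
      mkdir(("/mnt/test/a/b/" + std::to_string(i)).c_str(), 0755);
    _exit(0);
  }

  sleep(1);  // assumption: give step 1 time to dirty the inest lock on "a"

  // step 2: this is the mkdir that stalls (~53s in the report) on "a"'s inest lock
  if (mkdir("/mnt/test/a/c", 0755) == 0)
    std::printf("created /mnt/test/a/c\n");

  waitpid(pid, nullptr, 0);
  return 0;
}
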
From the MDS debug log:

2011-06-29 05:44:19.368961 7f7a0b421700 mds0.locker wrlock_start
waiting on (inest lock->sync w=1 dirty flushing) on [inode 10000000000
[...2,head] /a/ auth v18 pv20 ap=312 f(v0 m2011-06-29 05:44:15.550665
2=0+2) n(v0 rc2011-06-29 05:44:15.550665 1934=0+1934) (iauth sync r=1)
(isnap sync r=1) (inest lock->sync w=1 dirty flushing) (ifile excl
w=1) (ixattr excl) (iversion lock) caps={4099=pAsLsXsxFsx/-@10},l=4099 | dirtyscattered lock dirfrag caps dirty authpin 0x14c97e0]

We find:
The dir "a" was locked while we were creating dirs under "b".
In predirty_journal_parents() (in MDCache.cc), the flag "stop" was set to true, so we got the message "predirty_journal_parents stop. marking nestlock on".
Step 1) took a lock on dir "a": its type is CEPH_LOCK_INEST, its name is "sync", and its value is "inest lock->sync w=1 dirty flushing".
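
For context, the quoted message comes from the parent-stat propagation in MDCache::predirty_journal_parents(). Below is a self-contained toy model of the control flow as described in this report, not the actual Ceph code; the types are stand-ins and the "flushing blocks the wrlock" condition is an assumption:

#include <cstdio>

// Toy types standing in for Ceph's CInode/ScatterLock; names are illustrative.
struct ScatterLock { bool flushing = false; bool dirty = false; };
struct CInode {
  const char* name;
  CInode* parent;
  ScatterLock nestlock;
};

// Walk up the parents updating nested stats; if an ancestor's nestlock cannot
// be write-locked (here: because it is flushing), mark it dirty so a later
// flush carries the update, and stop propagating.
void predirty_journal_parents(CInode* in) {
  bool stop = false;
  for (CInode* pin = in->parent; pin && !stop; pin = pin->parent) {
    if (pin->nestlock.flushing) {    // assumption: flushing blocks the wrlock
      std::printf("predirty_journal_parents stop. marking nestlock on %s\n", pin->name);
      pin->nestlock.dirty = true;    // deferred to a later flush
      stop = true;                   // the "stop" flag from the report
    } else {
      std::printf("predirty nested stats on %s\n", pin->name);
    }
  }
}

int main() {
  CInode root{"/", nullptr}, a{"a", &root}, b{"b", &a}, d{"b/42", &b};
  a.nestlock.flushing = true;        // state from the quoted log: "dirty flushing"
  predirty_journal_parents(&d);
}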

I reproduced this locally on the kernel client but have been unable to get it on cfuse. The delay is on the MDS side, though: it takes some 53 seconds after the request comes in for the reply to go out, and the only recorded wait is on the inest lock while it is in the "lock->sync" state and flushing.
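
One way to picture that wait (a conceptual sketch only, not the real Locker state machine): a wrlock request arriving while the scatterlock is mid-transition and flushing gets parked on a waiter list and is only woken once the flush finishes and the lock stabilizes, so the mkdir reply is delayed for however long the flush takes:

#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Conceptual scatterlock: "lock->sync" is a transition state; "dirty flushing"
// means the nested stats are being written back. All names are illustrative.
struct ScatterLock {
  std::string state = "lock->sync";
  bool dirty = true, flushing = true;
  std::vector<std::function<void()>> waiters;

  bool wrlock_start(const std::string& who, std::function<void()> retry) {
    if (!flushing && state == "lock")   // assumption: stable, writable state
      return true;
    std::printf("%s waiting on (inest %s%s%s)\n", who.c_str(), state.c_str(),
                dirty ? " dirty" : "", flushing ? " flushing" : "");
    waiters.push_back(std::move(retry));
    return false;
  }

  void finish_flush() {                 // in the bug, ~53 seconds pass before this
    flushing = false; dirty = false; state = "lock";
    for (auto& w : waiters) w();        // wake the parked requests
    waiters.clear();
  }
};

int main() {
  ScatterLock inest;
  inest.wrlock_start("mkdir a/c", [] { std::printf("mkdir a/c retried; reply sent\n"); });
  inest.finish_flush();
}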

Caps problem? Order of wakeup problem? Flushing problem?
