Bug #3939
Status: Closed
krbd: circular locking report in sysfs code
Description
I intended to write this up before but don't think I did.
I'm getting a "possible circular locking dependency detected"
lockdep report (below). It is either from the rbd/map-unmap.sh
or the rbd/kernel.sh teuthology workunit. (I suspect it's the
former because of the area of code in question.)
I am running tests using the new request code. This new code
has not changed the sysfs stuff, so I don't believe this is
a new problem.
This particular bug should be tracked down, but I've thought for
a while that the locking surrounding the sysfs code deserves a
comprehensive review.
Note that fixing http://tracker.newdream.net/3427 might also
affect whether this problem can occur, though it would be best
to understand the problem before assuming so.
[ 521.833110]
[ 521.851538] ======================================================
[ 521.876458] [ INFO: possible circular locking dependency detected ]
[ 521.901795] 3.6.0-ceph-00224-g1370216 #1 Not tainted
[ 521.925849] -------------------------------------------------------
[ 521.952098] tee/20164 is trying to acquire lock:
[ 521.976916]  (ctl_mutex/1){+.+.+.}, at: [<ffffffffa02e89b7>] rbd_dev_refresh+0x47/0x2f0 [rbd]
[ 522.028191]
[ 522.028191] but task is already holding lock:
[ 522.075418]  (s_active#92){++++.+}, at: [<ffffffff811ec66d>] sysfs_write_file+0xcd/0x170
[ 522.128236]
[ 522.128236] which lock already depends on the new lock.
[ 522.128236]
[ 522.203569]
[ 522.203569] the existing dependency chain (in reverse order) is:
[ 522.254452] -> #1 (s_active#92){++++.+}:
[ 522.302613]        [<ffffffff810b2e22>] lock_acquire+0xa2/0x140
[ 522.332279]        [<ffffffff811ed896>] sysfs_deactivate+0x116/0x160
[ 522.362442]        [<ffffffff811ee37b>] sysfs_addrm_finish+0x3b/0x70
[ 522.392407]        [<ffffffff811ec3cb>] sysfs_hash_and_remove+0x5b/0xb0
[ 522.423260]        [<ffffffff811f01c1>] remove_files.isra.1+0x31/0x40
[ 522.454303]        [<ffffffff811f05bd>] sysfs_remove_group+0x4d/0x100
[ 522.485679]        [<ffffffff813f67eb>] device_remove_groups+0x3b/0x60
[ 522.517943]        [<ffffffff813f6ae4>] device_remove_attrs+0x44/0x80
[ 522.550492]        [<ffffffff813f73d5>] device_del+0x125/0x1c0
[ 522.582747]        [<ffffffff813f7492>] device_unregister+0x22/0x60
[ 522.615507]        [<ffffffffa02e6d6f>] rbd_remove+0x1bf/0x1d0 [rbd]
[ 522.648312]        [<ffffffff813f8b97>] bus_attr_store+0x27/0x30
[ 522.680342]        [<ffffffff811ec686>] sysfs_write_file+0xe6/0x170
[ 522.712192]        [<ffffffff8117bee3>] vfs_write+0xb3/0x180
[ 522.743032]        [<ffffffff8117c20a>] sys_write+0x4a/0x90
[ 522.772503]        [<ffffffff8163e1e9>] system_call_fastpath+0x16/0x1b
[ 522.802689] -> #0 (ctl_mutex/1){+.+.+.}:
[ 522.850774]        [<ffffffff810b2788>] __lock_acquire+0x1ac8/0x1b90
[ 522.881017]        [<ffffffff810b2e22>] lock_acquire+0xa2/0x140
[ 522.910390]        [<ffffffff8163253b>] mutex_lock_nested+0x4b/0x320
[ 522.939702]        [<ffffffffa02e89b7>] rbd_dev_refresh+0x47/0x2f0 [rbd]
[ 522.968812]        [<ffffffffa02e8c7f>] rbd_image_refresh+0x1f/0x40 [rbd]
[ 522.998452]        [<ffffffff813f6128>] dev_attr_store+0x18/0x30
[ 523.027243]        [<ffffffff811ec686>] sysfs_write_file+0xe6/0x170
[ 523.056409]        [<ffffffff8117bee3>] vfs_write+0xb3/0x180
[ 523.085407]        [<ffffffff8117c20a>] sys_write+0x4a/0x90
[ 523.113700]        [<ffffffff8163e1e9>] system_call_fastpath+0x16/0x1b
[ 523.142912]
[ 523.142912] other info that might help us debug this:
[ 523.142912]
[ 523.216639]  Possible unsafe locking scenario:
[ 523.216639]
[ 523.266500]        CPU0                    CPU1
[ 523.292750]        ----                    ----
[ 523.318539]   lock(s_active#92);
[ 523.342655]                                lock(ctl_mutex/1);
[ 523.370166]                                lock(s_active#92);
[ 523.397380]   lock(ctl_mutex/1);
[ 523.421779]
[ 523.421779]  *** DEADLOCK ***
[ 523.421779]
[ 523.487594] 2 locks held by tee/20164:
[ 523.511118]  #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811ec5e4>] sysfs_write_file+0x44/0x170
[ 523.562758]  #1:  (s_active#92){++++.+}, at: [<ffffffff811ec66d>] sysfs_write_file+0xcd/0x170
[ 523.616541]
[ 523.616541] stack backtrace:
[ 523.663438] Pid: 20164, comm: tee Not tainted 3.6.0-ceph-00224-g1370216 #1
[ 523.693567] Call Trace:
[ 523.717950]  [<ffffffff8162b274>] print_circular_bug+0x1fb/0x20c
[ 523.747238]  [<ffffffff81085fec>] ? ttwu_stat+0x4c/0x140
[ 523.775497]  [<ffffffff810b2788>] __lock_acquire+0x1ac8/0x1b90
[ 523.804350]  [<ffffffff810b393d>] ? trace_hardirqs_on+0xd/0x10
[ 523.833220]  [<ffffffffa02e89b7>] ? rbd_dev_refresh+0x47/0x2f0 [rbd]
[ 523.863191]  [<ffffffff810b2e22>] lock_acquire+0xa2/0x140
[ 523.892078]  [<ffffffffa02e89b7>] ? rbd_dev_refresh+0x47/0x2f0 [rbd]
[ 523.921875]  [<ffffffff8163253b>] mutex_lock_nested+0x4b/0x320
[ 523.951096]  [<ffffffffa02e89b7>] ? rbd_dev_refresh+0x47/0x2f0 [rbd]
[ 523.980991]  [<ffffffff810b28f0>] ? lock_release_non_nested+0xa0/0x310
[ 524.011289]  [<ffffffffa02e89b7>] rbd_dev_refresh+0x47/0x2f0 [rbd]
[ 524.041564]  [<ffffffffa02e8c7f>] rbd_image_refresh+0x1f/0x40 [rbd]
[ 524.072407]  [<ffffffff811ec66d>] ? sysfs_write_file+0xcd/0x170
[ 524.103471]  [<ffffffff813f6128>] dev_attr_store+0x18/0x30
[ 524.134233]  [<ffffffff811ec686>] sysfs_write_file+0xe6/0x170
[ 524.165228]  [<ffffffff8117bee3>] vfs_write+0xb3/0x180
[ 524.195866]  [<ffffffff8117c20a>] sys_write+0x4a/0x90
[ 524.225908]  [<ffffffff8163e1e9>] system_call_fastpath+0x16/0x1b
Updated by Alex Elder about 11 years ago
- Project changed from Linux kernel client to rbd
Updated by Alex Elder about 11 years ago
- Status changed from New to Duplicate
Duplicate of 3925. I did write it up before.