Tasks #1112
Closed
check all igrab calls in ceph-client; remove the deadlock: spin_lock(&inode->i_lock) + igrab(inode)
Description
Hi,
The igrab() function already takes spin_lock(&inode->i_lock) internally. So code like this:
spin_lock(&inode->i_lock);
...................
igrab(inode);
.................
spin_unlock(&inode->i_lock);
is a self-deadlock: igrab() spins on the i_lock that its caller already holds.
Jeff Wu
linux-2.6.39:
http://lxr.linux.no/#linux+v2.6.39/fs/inode.c#L1144
struct inode *igrab(struct inode *inode)
{
	spin_lock(&inode->i_lock);
	if (!(inode->i_state & (I_FREEING|I_WILL_FREE))) {
		__iget(inode);
		spin_unlock(&inode->i_lock);
	} else {
		spin_unlock(&inode->i_lock);
		/*
		 * Handle the case where s_op->clear_inode is not been
		 * called yet, and somebody is calling igrab
		 * while the inode is getting freed.
		 */
		inode = NULL;
	}
	return inode;
}
EXPORT_SYMBOL(igrab);
Files
Updated by Sage Weil almost 13 years ago
- Target version set to v3.0
Hi Jeff-
Are there actual cases of this that you're seeing? I've fixed several of these, but I'm not aware currently of any others. Or are you just proposing a thorough audit?
Updated by changping Wu almost 13 years ago
- File ceph-client__fsstress_log1 ceph-client__fsstress_log1 added
- File ceph-client_fsstress_log2 ceph-client_fsstress_log2 added
- File ceph-client_fsstress_log3 ceph-client_fsstress_log3 added
Hi,
I attached some of the logs to bug #1096 (http://tracker.newdream.net/issues/1096): ceph-client fsstress logs 1, 2, and 3.
I checked out ceph-client master:
commit 35b0ed997b1a49ff73a6110cbd04681467dbe217
Author: Sage Weil <sage@newdream.net>
Date: Wed May 25 14:56:12 2011 -0700
to verify bug #1096, with the kernel hacking spinlock/mutex debugging options enabled.
But it printed logs like this (some of the detailed logs are attached):
..........................................
[ 2714.810007] NMI backtrace for cpu 1
[ 2714.810007] CPU 1
[ 2714.810007] Modules linked in: ceph libceph nfsd lockd nfs_acl auth_rpcgss sunrpc snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep c
[ 2714.810007]
[ 2714.810007] Pid: 3051, comm: fsstress Not tainted 2.6.39+ #1 Dell Inc. OptiPlex 740/0UT225
[ 2714.810007] RIP: 0010:[<ffffffff812e720f>] [<ffffffff812e720f>] delay_tsc+0x1f/0x80
[ 2714.810007] RSP: 0018:ffff880100261d98 EFLAGS: 00000206
[ 2714.810007] RAX: 0000000000000000 RBX: 00000000012479db RCX: 000000001806526e
[ 2714.810007] RDX: 0000000000002c00 RSI: 0000000000000001 RDI: 0000000000000001
[ 2714.810007] RBP: ffff880100261db8 R08: 0000000000000002 R09: 0000000000000000
[ 2714.810007] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800d4a2a168
[ 2714.810007] R13: 0000000000000001 R14: 0000000000000001 R15: ffff88010fd9a300
[ 2714.810007] FS: 00007f074f487700(0000) GS:ffff88011f400000(0000) knlGS:0000000000000000
[ 2714.810007] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2714.810007] CR2: 00007f12ffcd1000 CR3: 0000000100369000 CR4: 00000000000006e0
[ 2714.810007] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2714.810007] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2714.810007] Process fsstress (pid: 3051, threadinfo ffff880100260000, task ffff88010fd9a300)
[ 2714.810007] Stack:
[ 2714.810007] 00000000012479db ffff8800d4a2a168 00000000717c75b0 0000000000000000
[ 2714.810007] ffff880100261dc8 ffffffff812e71af ffff880100261e28 ffffffff812f9afc
[ 2714.810007] ffff88010fd9a300 ffff88010fd9a300 ffff88010fd9a9c0 ffff88010fd9a728
[ 2714.810007] Call Trace:
[ 2714.810007] [<ffffffff812e71af>] __delay+0xf/0x20
[ 2714.810007] [<ffffffff812f9afc>] do_raw_spin_lock+0x11c/0x150
[ 2714.810007] [<ffffffff815f61b6>] _raw_spin_lock+0x56/0x70
[ 2714.810007] [<ffffffff811880b5>] ? igrab+0x25/0x60
[ 2714.810007] [<ffffffff811880b5>] igrab+0x25/0x60
[ 2714.810007] [<ffffffffa030f063>] ceph_flush_dirty_caps+0x53/0xb0 [ceph]
[ 2714.810007] [<ffffffff8119b4f0>] ? __sync_filesystem+0x90/0x90
[ 2714.810007] [<ffffffffa03017c9>] ceph_sync_fs+0x49/0x60 [ceph]
[ 2714.810007] [<ffffffff8119b4be>] __sync_filesystem+0x5e/0x90
[ 2714.810007] [<ffffffff8119b50f>] sync_one_sb+0x1f/0x30
[ 2714.810007] [<ffffffff81171317>] iterate_supers+0x77/0xe0
[ 2714.810007] [<ffffffff8119b54f>] sys_sync+0x2f/0x70
[ 2714.810007] [<ffffffff815fee42>] system_call_fastpath+0x16/0x1b
[ 2714.810007] Code: 48 8d 7a 01 ff 15 3a be 97 00 c9 c3 55 48 89 e5 41 56 41 55 41 54 53 66 66 66 66 90 65 44 8b 2c 25 78 dc 00 00 49 89 fe
[ 2714.810007] 66 90 e8 c9 b7 d2 ff 66 90 4c 63 e0 eb 11 66 90 f3 90 65 8b
[ 2714.810007] Call Trace:
[ 2714.810007] [<ffffffff812e71af>] __delay+0xf/0x20
[ 2714.810007] [<ffffffff812f9afc>] do_raw_spin_lock+0x11c/0x150
[ 2714.810007] [<ffffffff815f61b6>] _raw_spin_lock+0x56/0x70
[ 2714.810007] [<ffffffff811880b5>] ? igrab+0x25/0x60
[ 2714.810007] [<ffffffff811880b5>] igrab+0x25/0x60
[ 2714.810007] [<ffffffffa030f063>] ceph_flush_dirty_caps+0x53/0xb0 [ceph]
[ 2714.810007] [<ffffffff8119b4f0>] ? __sync_filesystem+0x90/0x90
[ 2714.810007] [<ffffffffa03017c9>] ceph_sync_fs+0x49/0x60 [ceph]
[ 2714.810007] [<ffffffff8119b4be>] __sync_filesystem+0x5e/0x90
[ 2714.810007] [<ffffffff8119b50f>] sync_one_sb+0x1f/0x30
[ 2714.810007] [<ffffffff81171317>] iterate_supers+0x77/0xe0
[ 2714.810007] [<ffffffff8119b54f>] sys_sync+0x2f/0x70
[ 2714.810007] [<ffffffff815fee42>] system_call_fastpath+0x16/0x1b
.............................................
Then I tried removing the spin_lock/spin_unlock in ceph_flush_dirty_caps to debug this issue and ran the fsstress test,
but it then printed a deadlock at ceph_set_page_dirty. Next Monday I will continue to reproduce it and attach logs.
Then I added a spin_unlock/spin_lock pair around igrab() in ceph_set_page_dirty, ran fsstress, and this deadlock issue no longer occurred:
spin_unlock(&inode->i_lock);
igrab(inode);
spin_lock(&inode->i_lock);
..........................
I note that ceph-client contains many igrab() calls, but I'm not sure whether other igrab call sites also need this modification, so I'm adding a task.
Updated by changping Wu almost 13 years ago
static int ceph_set_page_dirty(struct page *page)
{
...............................
/* dirty the head */
spin_lock(&inode->i_lock);
if (ci->i_head_snapc == NULL)
ci->i_head_snapc = ceph_get_snap_context(snapc);
++ci->i_wrbuffer_ref_head;
+spin_unlock(&inode->i_lock);
if (ci->i_wrbuffer_ref == 0)
igrab(inode);
+spin_lock(&inode->i_lock);
++ci->i_wrbuffer_ref;
dout("%p set_page_dirty %p idx %lu head %d/%d -> %d/%d "
     "snapc %p seq %lld (%d snaps)\n",
     mapping->host, page, page->index,
     ci->i_wrbuffer_ref-1, ci->i_wrbuffer_ref_head-1,
     ci->i_wrbuffer_ref, ci->i_wrbuffer_ref_head,
     snapc, snapc->seq, snapc->num_snaps);
spin_unlock(&inode->i_lock);
....................
}
Updated by Sage Weil almost 13 years ago
- Category set to fs/ceph
- Assignee set to Sage Weil
Jeff Wu wrote:
static int ceph_set_page_dirty(struct page *page) {
...............................
/* dirty the head */
spin_lock(&inode->i_lock);
if (ci->i_head_snapc == NULL)
ci->i_head_snapc = ceph_get_snap_context(snapc);
++ci->i_wrbuffer_ref_head;
+spin_unlock(&inode->i_lock);
if (ci->i_wrbuffer_ref == 0)
igrab(inode);
+spin_lock(&inode->i_lock);
This one is fixed in the latest master branch already. The correct way to do this, btw, is to use ihold() instead of igrab().
I also pushed another patch that changes almost all igrab calls to ihold, which is more appropriate when we have a stable reference. This also avoids creating any new and interesting locking dependencies.
Can you see if any of these still come up with the latest code?
Updated by changping Wu almost 13 years ago
Hi ,
I'm verifying with the fsstress test against the ceph-client master branch:
commit 98cc99822dac96710a8b64bdc2be4eccffc78956
Author: Sage Weil <sage@newdream.net>
Date: Fri May 27 09:24:26 2011 -0700
ceph: use ihold when we already have an inode ref
We should use ihold whenever we already have a stable inode ref, even
when we aren't holding i_lock. This avoids adding new and unnecessary
locking dependencies.
Signed-off-by: Sage Weil <sage@newdream.net>
But I see that four "igrab" call sites still exist in the ceph client:
---- igrab Matches (4 in 3 files) ----
Addr.c (client-source\ceph): igrab(inode);
Mds_client.c (client-source\ceph): inode = igrab(&cap->ci->vfs_inode);
Snap.c (client-source\ceph): igrab(inode);
Snap.c (client-source\ceph): struct inode *inode = igrab(&ci->vfs_inode);
Should these igrab calls also be replaced with ihold?
Updated by changping Wu almost 13 years ago
Hi,
ceph_set_page_dirty still contains an igrab call. I merged the patch into ceph-client-standalone, ran fsstress, and still hit a deadlock issue:
static int ceph_set_page_dirty(struct page *page)
...............................
if (ci->i_wrbuffer_ref == 0)
igrab(inode);
..................................