Project

General

Profile

Actions

Bug #1795

closed

break d_lock > s_cap_lock ordering

Added by Sage Weil over 12 years ago. Updated over 12 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
fs/ceph
Target version:
% Done:

0%

Source:
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Dec  7 15:06:27 tgpro1 kernel: =======================================================
Dec  7 15:06:27 tgpro1 kernel: [ INFO: possible circular locking dependency detected ]
Dec  7 15:06:27 tgpro1 kernel: -------------------------------------------------------
Dec  7 15:06:27 tgpro1 kernel: kworker/4:2/3143 is trying to acquire lock:
Dec  7 15:06:27 tgpro1 kernel:  (&sb->s_type->i_lock_key#16){+.+...}, at: [<000ff721>] igrab+0x11/0x41
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel: but task is already holding lock:
Dec  7 15:06:27 tgpro1 kernel:  (&(&s->s_cap_lock)->rlock){+.+...}, at: [<006d0dd0>] iterate_session_caps+0x40/0x17d [ceph]
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel: which lock already depends on the new lock.
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel: the existing dependency chain (in reverse order) is:
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel: -> #2 (&(&s->s_cap_lock)->rlock){+.+...}:
Dec  7 15:06:27 tgpro1 kernel:        [<00091456>] lock_acquire+0x42/0x59
Dec  7 15:06:27 tgpro1 kernel:        [<0053f187>] _raw_spin_lock+0x24/0x33
Dec  7 15:06:27 tgpro1 kernel:        [<006c180f>] ceph_d_revalidate+0xf4/0x29e [ceph]
Dec  7 15:06:27 tgpro1 kernel:        [<000f3339>] do_lookup+0x223/0x2a6
Dec  7 15:06:27 tgpro1 kernel:        [<000f4d99>] path_lookupat+0x118/0x6e9
Dec  7 15:06:27 tgpro1 kernel:        [<000f5386>] do_path_lookup+0x1c/0x4e
Dec  7 15:06:27 tgpro1 kernel:        [<000f544c>] user_path_at_empty+0x3e/0x69
Dec  7 15:06:27 tgpro1 kernel:        [<000f5484>] user_path_at+0xd/0xf
Dec  7 15:06:27 tgpro1 kernel:        [<000ecf98>] vfs_fstatat+0x40/0x67
Dec  7 15:06:27 tgpro1 kernel:        [<000ed003>] vfs_lstat+0x16/0x18
Dec  7 15:06:27 tgpro1 kernel:        [<000ed019>] sys_lstat64+0x14/0x28
Dec  7 15:06:27 tgpro1 kernel:        [<0053f9ca>] syscall_call+0x7/0xb
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel: -> #1 (&(&dentry->d_lock)->rlock){+.+.-.}:
Dec  7 15:06:27 tgpro1 kernel:        [<00091456>] lock_acquire+0x42/0x59
Dec  7 15:06:27 tgpro1 kernel:        [<0053f187>] _raw_spin_lock+0x24/0x33
Dec  7 15:06:27 tgpro1 kernel:        [<000fca43>] __d_instantiate+0x14/0xbc
Dec  7 15:06:27 tgpro1 kernel:        [<000fcf4e>] d_instantiate+0x30/0x40
Dec  7 15:06:27 tgpro1 kernel:        [<000fe357>] d_alloc_root+0x27/0x2d
Dec  7 15:06:27 tgpro1 kernel:        [<006bd2e2>] open_root_dentry+0xfd/0x143 [ceph]
Dec  7 15:06:27 tgpro1 kernel:        [<006bde68>] ceph_mount+0x338/0x4c4 [ceph]
Dec  7 15:06:27 tgpro1 kernel:        [<000eaac4>] mount_fs+0xe/0x95
Dec  7 15:06:27 tgpro1 kernel:        [<00103266>] vfs_kern_mount+0x4c/0x80
Dec  7 15:06:27 tgpro1 kernel:        [<001046aa>] do_kern_mount+0x2f/0xaf
Dec  7 15:06:27 tgpro1 kernel:        [<001051f0>] do_mount+0x26b/0x2aa
Dec  7 15:06:27 tgpro1 kernel:        [<00105290>] sys_mount+0x61/0x94
Dec  7 15:06:27 tgpro1 kernel:        [<0053f9ca>] syscall_call+0x7/0xb
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel: -> #0 (&sb->s_type->i_lock_key#16){+.+...}:
Dec  7 15:06:27 tgpro1 kernel:        [<00090b06>] __lock_acquire+0xe7b/0x1428
Dec  7 15:06:27 tgpro1 kernel:        [<00091456>] lock_acquire+0x42/0x59
Dec  7 15:06:27 tgpro1 kernel:        [<0053f187>] _raw_spin_lock+0x24/0x33
Dec  7 15:06:27 tgpro1 kernel:        [<000ff721>] igrab+0x11/0x41
Dec  7 15:06:27 tgpro1 kernel:        [<006d0e00>] iterate_session_caps+0x70/0x17d [ceph]
Dec  7 15:06:27 tgpro1 kernel:        [<006d3bb4>] send_mds_reconnect+0x319/0x46a [ceph]
Dec  7 15:06:27 tgpro1 kernel:        [<006d401f>] ceph_mdsc_handle_map+0x2f2/0x42d [ceph]
Dec  7 15:06:27 tgpro1 kernel:        [<006be00c>] extra_mon_dispatch+0x18/0x1c [ceph]
Dec  7 15:06:27 tgpro1 kernel:        [<006963ed>] dispatch+0x4f7/0x52a [libceph]
Dec  7 15:06:27 tgpro1 kernel:        [<00694459>] con_work+0x14f4/0x16c3 [libceph]
Dec  7 15:06:27 tgpro1 kernel:        [<0007b331>] process_one_work+0x229/0x397
Dec  7 15:06:27 tgpro1 kernel:        [<0007b80e>] worker_thread+0x182/0x2d8
Dec  7 15:06:27 tgpro1 kernel:        [<0007e843>] kthread+0x62/0x67
Dec  7 15:06:27 tgpro1 kernel:        [<00540c86>] kernel_thread_helper+0x6/0xd
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel: other info that might help us debug this:
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel: Chain exists of:
Dec  7 15:06:27 tgpro1 kernel:   &sb->s_type->i_lock_key --> &(&dentry->d_lock)->rlock --> &(&s->s_cap_lock)->rlock
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel:  Possible unsafe locking scenario:
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel:        CPU0                    CPU1
Dec  7 15:06:27 tgpro1 kernel:        ----                    ----
Dec  7 15:06:27 tgpro1 kernel:   lock(&(&s->s_cap_lock)->rlock);
Dec  7 15:06:27 tgpro1 kernel:                                lock(&(&dentry->d_lock)->rlock);
Dec  7 15:06:27 tgpro1 kernel:                                lock(&(&s->s_cap_lock)->rlock);
Dec  7 15:06:27 tgpro1 kernel:   lock(&sb->s_type->i_lock_key);
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel:  *** DEADLOCK ***
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel: 5 locks held by kworker/4:2/3143:
Dec  7 15:06:27 tgpro1 kernel:  #0:  (ceph-msgr){++++.+}, at: [<0007b2e5>] process_one_work+0x1dd/0x397
Dec  7 15:06:27 tgpro1 kernel:  #1:  ((&(&con->work)->work)){+.+.+.}, at: [<0007b2e5>] process_one_work+0x1dd/0x397
Dec  7 15:06:27 tgpro1 kernel:  #2:  (&s->s_mutex){+.+.+.}, at: [<006d393e>] send_mds_reconnect+0xa3/0x46a [ceph]
Dec  7 15:06:27 tgpro1 kernel:  #3:  (&mdsc->snap_rwsem){+++++.}, at: [<006d3a36>] send_mds_reconnect+0x19b/0x46a [ceph]
Dec  7 15:06:27 tgpro1 kernel:  #4:  (&(&s->s_cap_lock)->rlock){+.+...}, at: [<006d0dd0>] iterate_session_caps+0x40/0x17d [ceph]
Dec  7 15:06:27 tgpro1 kernel: 
Dec  7 15:06:27 tgpro1 kernel: stack backtrace:

Dec  7 15:06:27 tgpro1 kernel: Call Trace:
Dec  7 15:06:27 tgpro1 kernel:  [<0008f840>] print_circular_bug+0x219/0x226
Dec  7 15:06:27 tgpro1 kernel:  [<00090b06>] __lock_acquire+0xe7b/0x1428
Dec  7 15:06:27 tgpro1 kernel:  [<00091456>] lock_acquire+0x42/0x59
Dec  7 15:06:27 tgpro1 kernel:  [<000ff721>] ? igrab+0x11/0x41
Dec  7 15:06:27 tgpro1 kernel:  [<0053f187>] _raw_spin_lock+0x24/0x33
Dec  7 15:06:27 tgpro1 kernel:  [<000ff721>] ? igrab+0x11/0x41
Dec  7 15:06:27 tgpro1 kernel:  [<000ff721>] igrab+0x11/0x41
Dec  7 15:06:27 tgpro1 kernel:  [<006d0e00>] iterate_session_caps+0x70/0x17d [ceph]
Dec  7 15:06:27 tgpro1 kernel:  [<006d5475>] ? ceph_mdsc_submit_request+0x55/0x55 [ceph]
Dec  7 15:06:27 tgpro1 kernel:  [<00694d13>] ? ceph_pagelist_append+0xbc/0xf9 [libceph]
Dec  7 15:06:27 tgpro1 kernel:  [<006d3bb4>] send_mds_reconnect+0x319/0x46a [ceph]
Dec  7 15:06:27 tgpro1 kernel:  [<0008eb6f>] ? trace_hardirqs_on_caller+0x10b/0x13c
Dec  7 15:06:27 tgpro1 kernel:  [<0008ebab>] ? trace_hardirqs_on+0xb/0xd
Dec  7 15:06:27 tgpro1 kernel:  [<006d401f>] ceph_mdsc_handle_map+0x2f2/0x42d [ceph]
Dec  7 15:06:27 tgpro1 kernel:  [<00c40015>] ? 0xc40014
Dec  7 15:06:27 tgpro1 kernel:  [<006be00c>] extra_mon_dispatch+0x18/0x1c [ceph]
Dec  7 15:06:27 tgpro1 kernel:  [<006963ed>] dispatch+0x4f7/0x52a [libceph]
Dec  7 15:06:27 tgpro1 kernel:  [<0008ebab>] ? trace_hardirqs_on+0xb/0xd
Dec  7 15:06:27 tgpro1 kernel:  [<00694459>] con_work+0x14f4/0x16c3 [libceph]
Dec  7 15:06:27 tgpro1 kernel:  [<0007b331>] process_one_work+0x229/0x397
Dec  7 15:06:27 tgpro1 kernel:  [<0007b2e5>] ? process_one_work+0x1dd/0x397
Dec  7 15:06:27 tgpro1 kernel:  [<00692f65>] ? ceph_fault+0x262/0x262 [libceph]
Dec  7 15:06:27 tgpro1 kernel:  [<0007b80e>] worker_thread+0x182/0x2d8
Dec  7 15:06:27 tgpro1 kernel:  [<00002b40>] ? setup_sigcontext+0x24/0x24
Dec  7 15:06:27 tgpro1 kernel:  [<00002b40>] ? setup_sigcontext+0x24/0x24
Dec  7 15:06:27 tgpro1 kernel:  [<0007b68c>] ? rescuer_thread+0x1ed/0x1ed
Dec  7 15:06:27 tgpro1 kernel:  [<0007e843>] kthread+0x62/0x67
Dec  7 15:06:27 tgpro1 kernel:  [<0007e7e1>] ? __init_kthread_worker+0x42/0x42
Dec  7 15:06:27 tgpro1 kernel:  [<00540c86>] kernel_thread_helper+0x6/0xd
Actions

Also available in: Atom PDF