Project

General

Profile

Actions

Bug #147

closed

lockdep: possible irq lock inversion dependency w/ osdc->request_mutex and con->mutex

Added by Sage Weil almost 14 years ago. Updated about 5 years ago.

Status:
Resolved
Priority:
Low
Assignee:
Category:
libceph
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

[44161.467025] =========================================================
[44161.470334] [ INFO: possible irq lock inversion dependency detected ]
[44161.470334] 2.6.34 #29
[44161.470334] ---------------------------------------------------------
[44161.470334] kswapd0/356 just changed the state of lock:
[44161.470334]  (&osdc->request_mutex){+.+.-.}, at: [<ffffffffa0071364>] ceph_osdc_start_request+0x62/0x293 [ceph]
[44161.470334] but this lock took another, RECLAIM_FS-unsafe lock in the past:
[44161.470334]  (&con->mutex){+.+.+.}
[44161.470334] 
[44161.470334] and interrupts could create inverse lock ordering between them.
[44161.470334] 
[44161.470334] 
[44161.470334] other info that might help us debug this:
[44161.470334] no locks held by kswapd0/356.
[44161.470334] 
[44161.470334] the shortest dependencies between 2nd lock and 1st lock:
[44161.470334]  -> (&con->mutex){+.+.+.} ops: 42297456 {
[44161.470334]     HARDIRQ-ON-W at:
[44161.470334]                                          [<ffffffff8105aaed>] __lock_acquire+0x64e/0x851
[44161.470334]                                          [<ffffffff8105ad78>] lock_acquire+0x88/0xa5
[44161.470334]                                          [<ffffffff81426abf>] mutex_lock_nested+0x62/0x32c
[44161.470334]                                          [<ffffffffa0065d56>] ceph_con_revoke+0x22/0x165 [ceph]
[44161.470334]                                          [<ffffffffa006e2c0>] __send_prepared_auth_request+0x41/0x6b [ceph]
[44161.470334]                                          [<ffffffffa006e557>] __open_session+0x12f/0x202 [ceph]
[44161.470334]                                          [<ffffffffa006eeae>] ceph_monc_open_session+0x7c/0x93 [ceph]
[44161.470334]                                          [<ffffffffa00472a3>] __ceph_open_session+0x92/0x211 [ceph]
[44161.470334]                                          [<ffffffffa0048ce1>] ceph_get_sb+0x1d4/0x555 [ceph]
[44161.470334]                                          [<ffffffff810b1dc6>] vfs_kern_mount+0xaf/0x183
[44161.470334]                                          [<ffffffff810b1ef7>] do_kern_mount+0x47/0xee
[44161.470334]                                          [<ffffffff810c6e5e>] do_mount+0x74d/0x7c8
[44161.470334]                                          [<ffffffff810c6f58>] sys_mount+0x7f/0xbf
[44161.470334]                                          [<ffffffff81002a2b>] system_call_fastpath+0x16/0x1b
[44161.470334]     SOFTIRQ-ON-W at:
[44161.470334]                                          [<ffffffff8105ab0f>] __lock_acquire+0x670/0x851
[44161.470334]                                          [<ffffffff8105ad78>] lock_acquire+0x88/0xa5
[44161.470334]                                          [<ffffffff81426abf>] mutex_lock_nested+0x62/0x32c
[44161.470334]                                          [<ffffffffa0065d56>] ceph_con_revoke+0x22/0x165 [ceph]
[44161.734219]                                          [<ffffffffa006e2c0>] __send_prepared_auth_request+0x41/0x6b [ceph]
[44161.734219]                                          [<ffffffffa006e557>] __open_session+0x12f/0x202 [ceph]
[44161.734219]                                          [<ffffffffa006eeae>] ceph_monc_open_session+0x7c/0x93 [ceph]
[44161.734219]                                          [<ffffffffa00472a3>] __ceph_open_session+0x92/0x211 [ceph]
[44161.734219]                                          [<ffffffffa0048ce1>] ceph_get_sb+0x1d4/0x555 [ceph]
[44161.734219]                                          [<ffffffff810b1dc6>] vfs_kern_mount+0xaf/0x183
[44161.734219]                                          [<ffffffff810b1ef7>] do_kern_mount+0x47/0xee
[44161.734219]                                          [<ffffffff810c6e5e>] do_mount+0x74d/0x7c8
[44161.734219]                                          [<ffffffff810c6f58>] sys_mount+0x7f/0xbf
[44161.734219]                                          [<ffffffff81002a2b>] system_call_fastpath+0x16/0x1b
[44161.734219]     RECLAIM_FS-ON-W at:
[44161.734219]                                             [<ffffffff81057f1c>] mark_held_locks+0x49/0x64
[44161.734219]                                             [<ffffffff81058625>] lockdep_trace_alloc+0xa5/0xc3
[44161.734219]                                             [<ffffffff810a90c9>] kmem_cache_alloc+0x2a/0x144
[44161.734219]                                             [<ffffffff81387009>] sock_alloc_inode+0x1a/0x6d
[44161.734219]                                             [<ffffffff810c2a83>] alloc_inode+0x18/0x76
[44161.734219]                                             [<ffffffff810c30fc>] new_inode+0x21/0x99
[44161.734219]                                             [<ffffffff81389499>] sock_alloc+0x14/0x58
[44161.734219]                                             [<ffffffff81389c46>] __sock_create+0xa0/0x2d3
[44161.734219]                                             [<ffffffff81389e98>] sock_create_kern+0x1f/0x21
[44161.734219]                                             [<ffffffffa0064705>] try_write+0x1d4/0x10eb [ceph]
[44161.734219]                                             [<ffffffffa0066135>] con_work+0x135/0x6b2 [ceph]
[44161.734219]                                             [<ffffffff8104786b>] worker_thread+0x1e8/0x2fa
[44161.734219]                                             [<ffffffff8104a65e>] kthread+0x7d/0x85
[44161.734219]                                             [<ffffffff810037d4>] kernel_thread_helper+0x4/0x10
[44161.734219]     INITIAL USE at:
[44161.734219]                                         [<ffffffff8105ab5a>] __lock_acquire+0x6bb/0x851
[44161.734219]                                         [<ffffffff8105ad78>] lock_acquire+0x88/0xa5
[44161.734219]                                         [<ffffffff81426abf>] mutex_lock_nested+0x62/0x32c
[44161.734219]                                         [<ffffffffa0065d56>] ceph_con_revoke+0x22/0x165 [ceph]
[44161.734219]                                         [<ffffffffa006e2c0>] __send_prepared_auth_request+0x41/0x6b [ceph]
[44161.734219]                                         [<ffffffffa006e557>] __open_session+0x12f/0x202 [ceph]
[44161.734219]                                         [<ffffffffa006eeae>] ceph_monc_open_session+0x7c/0x93 [ceph]
[44161.734219]                                         [<ffffffffa00472a3>] __ceph_open_session+0x92/0x211 [ceph]
[44161.734219]                                         [<ffffffffa0048ce1>] ceph_get_sb+0x1d4/0x555 [ceph]
[44161.734219]                                         [<ffffffff810b1dc6>] vfs_kern_mount+0xaf/0x183
[44161.734219]                                         [<ffffffff810b1ef7>] do_kern_mount+0x47/0xee
[44161.734219]                                         [<ffffffff810c6e5e>] do_mount+0x74d/0x7c8
[44161.734219]                                         [<ffffffff810c6f58>] sys_mount+0x7f/0xbf
[44161.734219]                                         [<ffffffff81002a2b>] system_call_fastpath+0x16/0x1b
[44161.734219]   }
[44161.734219]   ... key      at: [<ffffffffa009f740>] __key.50466+0x0/0xfffffffffffdc726 [ceph]
[44161.734219]   ... acquired at:
[44161.734219]    [<ffffffff8105a1c4>/0x1a5 [ceph]
[4 [<ffffffffa0051cde>] ceph_aio_read+0x684/0x9c4 [ceph]
[44161.734219]    [<ffffffff810af91b>] do_sync_read+0xcb/0x112
[44161.734219]    [<ffffffff810b00c8>] vfs_read+0xab/0x147
[44161.734219]    [<ffffffff810b041d>] sys_read+0x47/0x70
[44161.734219]    [<ffffffff81002a2b>] system_call_fastpath+0x16/0x1b
[44161.734219] 
[44161.734219] -> (&osdc->request_mutex){+.+.-.} ops: 5330322 {
[44161.734219]    HARDIRQ-ON-W at:
[44161.734219]                                        [<ffffffff8105aaed>] __lock_acquire+0x64e/0x851
[44161.734219]                                        [<ffffffff8105ad78>] lock_acquire+0x88/0xa5
[44161.734219]                                        [<ffffffff81426abf>] mutex_lock_nested+0x62/0x32c
[44161.734219]                                        [<ffffffffa0071aaa>] kick_requests+0x24/0x9d [ceph]
[44161.734219]                                        [<ffffffffa0072030>] ceph_osdc_handle_map+0x36e/0x53c [ceph]
[44161.734219]                                        [<ffffffffa006f783>] dispatch+0x4c7/0x5f7 [ceph]
[44161.734219]                                        [<ffffffffa0063ef2>] try_read+0xd98/0x13d7 [ceph]
[44161.734219]                                        [<ffffffffa0066129>] con_work+0x129/0x6b2 [ceph]
[44161.734219]                                        [<ffffffff8104786b>] worker_thread+0x1e8/0x2fa
[44161.734219]                                        [<ffffffff8104a65e>] kthread+0x7d/0x85
[44161.734219]                                        [<ffffffff810037d4>] kernel_thread_helper+0x4/0x10
[44161.734219]    SOFTIRQ-ON-W at:
[44161.734219]                                        [<ffffffff8105ab0f>] __lock_acquire+0x670/0x851
[44161.734219]                                        [<ffffffff8105ad78>] lock_acquire+0x88/0xa5
[44161.734219]                                        [<ffffffff81426abf>] mutex_lock_nested+0x62/0x32c
[44161.734219]                                        [<ffffffffa0071aaa>] kick_requests+0x24/0x9d [ceph]
[44161.734219]                                        [<ffffffffa0072030>] ceph_osdc_handle_map+0x36e/0x53c [ceph]
[44161.734219]                                        [<ffffffffa006f783>] dispatch+0x4c7/0x5f7 [ceph]
[44161.734219]                                        [<ffffffffa0063ef2>] try_read+0xd98/0x13d7 [ceph]
[44161.734219]                                        [<ffffffffa0066129>] con_work+0x129/0x6b2 [ceph]
[44161.734219]                                        [<ffffffff8104786b>] worker_thread+0x1e8/0x2fa
[44161.734219]                                        [<ffffffff8104a65e>] kthread+0x7d/0x85
[44161.734219]                                        [<ffffffff810037d4>] kernel_thread_helper+0x4/0x10
[44161.734219]    IN-RECLAIM_FS-W at:
[44161.734219]                                           [<ffffffff8105ab42>] __lock_acquire+0x6a3/0x851
[44161.734219]                                           [<ffffffff8105ad78>] lock_acquire+0x88/0xa5
[44161.734219]                                           [<ffffffff81426abf>] mutex_lock_nested+0x62/0x32c
[44161.734219]                                           [<ffffffffa0071364>] ceph_osdc_start_request+0x62/0x293 [ceph]
[44161.734219]                                           [<ffffffffa0072be1>] ceph_osdc_writepages+0x102/0x1bf [ceph]
[44161.734219]                                           [<ffffffffa00536e2>] writepage_nounlock+0x293/0x4ff [ceph]
[44161.734219]                                           [<ffffffffa005430e>] ceph_writepage+0x36/0x52 [ceph]
[44161.734219]                                           [<ffffffff81088d25>] shrink_page_list+0x37f/0x654
[44161.734219]                                           [<ffffffff81089602>] shrink_zone+0x608/0x979
[44161.734219]                                           [<ffffffff8108a5d9>] kswapd+0x477/0x682
[44161.734219]                                           [<ffffffff8104a65e>] kthread+0x7d/0x85
[44161.734219]                                           [<ffffffff810037d4>] kernel_thread_helper+0x4/0x10
[44161.734219]    INITIAL USE at:
[44161.734219]                                       [<ffffffff8105ab5a>] __lock_acquire+0x6bb/0x851
[44161.734219]                                       [<ffffffff8105ad78>] lock_acquire+0x88/0xa5
[44161.734219]                                       [<ffffffff81426abf>] mutex_lock_nested+0x62/0x32c
[44161.734219]                                       [<ffffffffa0071aaa>] kick_requests+0x24/0x9d [ceph]
[44161.734219]                                       [<ffffffffa0072030>] ceph_osdc_handle_map+0x36e/0x53c [ceph]
[44161.734219]                                       [<ffffffffa006f783>] dispatch+0x4c7/0x5f7 [ceph]
[44161.734219]                                       [<ffffffffa0063ef2>] try_read+0xd98/0x13d7 [ceph]
[44161.734219]                                       [<ffffffffa0066129>] con_work+0x129/0x6b2 [ceph]
[44161.734219]                                       [<ffffffff8104786b>] worker_thread+0x1e8/0x2fa
[44161.734219]                                       [<ffffffff8104a65e>] kthread+0x7d/0x85
[44161.734219]                                       [<ffffffff810037d4>] kernel_thread_helper+0x4/0x10
[44161.734219]  }
[44161.734219]  ... key      at: [<ffffffffa009ff38>] __key.31276+0x0/0xfffffffffffdbf2e [ceph]
[44161.734219]  ... acquired at:
[44161.734219]    [<ffffffff8105b888>] check_usage_forwards+0xbb/0xca
[44161.734219]    [<ffffffff81057c98>] mark_lock+0x312/0x54d
[44161.734219]    [<ffffffff8105ab42>] __lock_acquire+0x6a3/0x851
[44161.734219]    [<ffffffff8105ad78>] lock_acquire+0x88/0xa5
[44161.734219]    [<ffffffff81426abf>] mutex_lock_nested+0x62/0x32c
[44161.734219]    [<ffffffffa0071364>] ceph_osdc_start_request+0x62/0x293 [ceph]
[44161.734219]    [<ffffffffa0072be1>] ceph_osdc_writepages+0x102/0x1bf [ceph]
[44161.734219]    [<ffffffffa00536e2>] writepage_nounlock+0x293/0x4ff [ceph]
[44161.734219]    [<ffffffffa005430e>] ceph_writepage+0x36/0x52 [ceph]
[44161.734219]    [<ffffffff81088d25>] shrink_page_list+0x37f/0x654
[44161.734219]    [<ffffffff81089602>] shrink_zone+0x608/0x979
[44161.734219]    [<ffffffff8108a5d9>] kswapd+0x477/0x682
[44161.734219]    [<ffffffff8104a65e>] kthread+0x7d/0x85
[44161.734219]    [<ffffffff810037d4>] kernel_thread_helper+0x4/0x10
[44161.734219] 
[44161.734219] 
[44161.734219] stack backtrace:
[44161.734219] Pid: 356, comm: kswapd0 Not tainted 2.6.34 #29
[44161.734219] Call Trace:
[44161.734219]  [<ffffffff810577b4>] print_irq_inversion_bug+0x121/0x12f
[44161.734219]  [<ffffffff8105b888>] check_usage_forwards+0xbb/0xca
[44161.734219]  [<ffffffff810a7b12>] ? cache_alloc_debugcheck_after+0x40/0x13b
[44161.734219]  [<ffffffff8105b7cd>] ? check_usage_forwards+0x0/0xca
[44161.734219]  [<ffffffff81057c98>] mark_lock+0x312/0x54d
[44161.734219]  [<ffffffff8105ab42>] __lock_acquire+0x6a3/0x851
[44161.734219]  [<ffffffff812242ae>] ? sprintf+0x68/0x6a
[44161.734219]  [<ffffffff8105ad78>] lock_acquire+0x88/0xa5
[44161.734219]  [<ffffffffa0071364>] ? ceph_osdc_start_request+0x62/0x293 [ceph]
[44161.734219]  [<ffffffffa0071364>] ? ceph_osdc_start_request+0x62/0x293 [ceph]
[44161.734219]  [<ffffffff81426abf>] mutex_lock_nested+0x62/0x32c
[44161.734219]  [<ffffffffa0071364>] ? ceph_osdc_start_request+0x62/0x293 [ceph]
[44161.734219]  [<ffffffffa0072acd>] ? ceph_osdc_new_request+0x111/0x123 [ceph]
[44161.734219]  [<ffffffffa0071364>] ceph_osdc_start_request+0x62/0x293 [ceph]
[44161.734219]  [<ffffffffa0072be1>] ceph_osdc_writepages+0x102/0x1bf [ceph]
[44161.734219]  [<ffffffffa00536e2>] writepage_nounlock+0x293/0x4ff [ceph]
[44161.734219]  [<ffffffffa005430e>] ceph_writepage+0x36/0x52 [ceph]
[44161.734219]  [<ffffffff81088d25>] shrink_page_list+0x37f/0x654
[44161.734219]  [<ffffffff8104f3a2>] ? sched_clock_local+0x11/0x73
[44161.734219]  [<ffffffff8104f4ca>] ? sched_clock_cpu+0xc6/0xd4
[44161.734219]  [<ffffffff810567b8>] ? get_lock_stats+0x19/0x4c
[44161.734219]  [<ffffffff81089602>] shrink_zone+0x608/0x979
[44161.734219]  [<ffffffff81009a23>] ? native_sched_clock+0x37/0x71
[44161.734219]  [<ffffffff8104dfb7>] ? up_read+0x1e/0x37
[44161.734219]  [<ffffffff8104a990>] ? autoremove_wake_function+0x0/0x38
[44161.734219]  [<ffffffff8108a5d9>] kswapd+0x477/0x682
[44161.734219]  [<ffffffff81087c5c>] ? isolate_pages_global+0x0/0x209
[44161.734219]  [<ffffffff8104a990>] ? autoremove_wake_function+0x0/0x38
[44161.734219]  [<ffffffff8108a162>] ? kswapd+0x0/0x682
[44161.734219]  [<ffffffff8104a65e>] kthread+0x7d/0x85
[44161.734219]  [<ffffffff810037d4>] kernel_thread_helper+0x4/0x10
[44161.734219]  [<ffffffff81429380>] ? restore_args+0x0/0x30
[44161.734219]  [<ffffffff8104a5e1>] ? kthread+0x0/0x85
[44161.734219]  [<ffffffff810037d0>] ? kernel_thread_helper+0x0/0x10

Related issues 1 (1 open0 closed)

Related to Linux kernel client - Bug #2590: possible irq lock inversion dependency with con->mutex and osdc->request_mutexNew

Actions
Actions #1

Updated by Yehuda Sadeh almost 14 years ago

What it actually means is that sock_alloc_inode is being called under the kswapd context and it does an allocation with GFP_KERNEL. This can deadlock, and we really need it to do it using GFP_NOFS. However, currently there's no clean way to do it.

Actions #2

Updated by Sage Weil almost 14 years ago

We could have a pool of preallocated sockets.. but that could be exhausted.

Or duplicate a bunch of socket creation code to pass the GFP_NOFS.

What does nfs do?

Actions #3

Updated by Yehuda Sadeh almost 14 years ago

nfs uses the rpc code, which, if I understand it correctly initializes a work queue for socket allocation and connect. So it doesn't really fix a possible deadlock but it allows backing off on such scenario after a timeout.

Actions #4

Updated by Sage Weil almost 14 years ago

  • Priority changed from Normal to Low
Actions #5

Updated by Sage Weil almost 14 years ago

  • Target version changed from v2.6.35 to v2.6.36
Actions #6

Updated by Sage Weil over 13 years ago

  • Target version changed from v2.6.36 to v2.6.37
Actions #7

Updated by Sage Weil over 13 years ago

  • Target version deleted (v2.6.37)
Actions #8

Updated by Sage Weil over 12 years ago

  • Translation missing: en.field_position deleted (420)
  • Translation missing: en.field_position set to 412
Actions #9

Updated by Josh Durgin about 12 years ago

This happened in a teuthology run of iozone on rbd. From teuthology:~teuthworker/archive/master-2012-01-25_14:30:/syslog/kern.log:

2012-01-25T16:38:49.916889-08:00 sepia36 kernel: [ 2254.842388] 
2012-01-25T16:38:49.916979-08:00 sepia36 kernel: [ 2254.843906] =========================================================
2012-01-25T16:38:49.916987-08:00 sepia36 kernel: [ 2254.850010] [ INFO: possible irq lock inversion dependency detected ]
2012-01-25T16:38:49.916993-08:00 sepia36 kernel: [ 2254.850010] 3.2.0-ceph-00037-gc68bdb1 #1
2012-01-25T16:38:49.916999-08:00 sepia36 kernel: [ 2254.850010] ---------------------------------------------------------
2012-01-25T16:38:49.917005-08:00 sepia36 kernel: [ 2254.850010] kswapd0/41 just changed the state of lock:
2012-01-25T16:38:49.917014-08:00 sepia36 kernel: [ 2254.850010]  (&osdc->request_mutex){+.+.-.}, at: [<ffffffffa0213efb>] ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-01-25T16:38:49.917020-08:00 sepia36 kernel: [ 2254.850010] but this lock took another, RECLAIM_FS-unsafe lock in the past:
2012-01-25T16:38:49.917025-08:00 sepia36 kernel: [ 2254.850010]  (&con->mutex){+.+.+.}
2012-01-25T16:38:49.917030-08:00 sepia36 kernel: [ 2254.850010] 
2012-01-25T16:38:49.917037-08:00 sepia36 kernel: [ 2254.850010] and interrupts could create inverse lock ordering between them.
2012-01-25T16:38:49.917042-08:00 sepia36 kernel: [ 2254.850010] 
2012-01-25T16:38:49.917047-08:00 sepia36 kernel: [ 2254.850010] 
2012-01-25T16:38:49.917059-08:00 sepia36 kernel: [ 2254.850010] other info that might help us debug this:
2012-01-25T16:38:49.917065-08:00 sepia36 kernel: [ 2254.850010]  Possible interrupt unsafe locking scenario:
2012-01-25T16:38:49.917070-08:00 sepia36 kernel: [ 2254.850010] 
2012-01-25T16:38:49.917075-08:00 sepia36 kernel: [ 2254.850010]        CPU0                    CPU1
2012-01-25T16:38:49.917096-08:00 sepia36 kernel: [ 2254.850010]        ----                    ----
2012-01-25T16:38:49.917107-08:00 sepia36 kernel: [ 2254.850010]   lock(&con->mutex);
2012-01-25T16:38:49.917113-08:00 sepia36 kernel: [ 2254.850010]                                local_irq_disable();
2012-01-25T16:38:49.917119-08:00 sepia36 kernel: [ 2254.850010]                                lock(&osdc->request_mutex);
2012-01-25T16:38:49.917124-08:00 sepia36 kernel: [ 2254.936000]                                lock(&con->mutex);
2012-01-25T16:38:49.917129-08:00 sepia36 kernel: [ 2254.936000]   <Interrupt>
2012-01-25T16:38:49.917139-08:00 sepia36 kernel: [ 2254.936000]     lock(&osdc->request_mutex);
2012-01-25T16:38:49.917144-08:00 sepia36 kernel: [ 2254.936000] 
2012-01-25T16:38:49.917148-08:00 sepia36 kernel: [ 2254.936000]  *** DEADLOCK ***
2012-01-25T16:38:49.917153-08:00 sepia36 kernel: [ 2254.936000] 
2012-01-25T16:38:49.917163-08:00 sepia36 kernel: [ 2254.936000] no locks held by kswapd0/41.
2012-01-25T16:38:49.917168-08:00 sepia36 kernel: [ 2254.936000] 
2012-01-25T16:38:49.917174-08:00 sepia36 kernel: [ 2254.936000] the shortest dependencies between 2nd lock and 1st lock:
2012-01-25T16:38:49.917179-08:00 sepia36 kernel: [ 2254.936000]  -> (&con->mutex){+.+.+.} ops: 595120 {
2012-01-25T16:38:49.917184-08:00 sepia36 kernel: [ 2254.936000]     HARDIRQ-ON-W at:
2012-01-25T16:38:49.917195-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff8109facb>] __lock_acquire+0x5ab/0x15d0
2012-01-25T16:38:49.917202-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-25T16:38:49.917209-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-01-25T16:38:49.917216-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-01-25T16:38:49.917227-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffffa020d546>] con_work+0x46/0x1b00 [libceph]
2012-01-25T16:38:49.917234-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-01-25T16:38:49.917241-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-01-25T16:38:49.917247-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-25T16:38:49.917263-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-25T16:38:49.917269-08:00 sepia36 kernel: [ 2254.936000]     SOFTIRQ-ON-W at:
2012-01-25T16:38:49.917275-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff8109fafd>] __lock_acquire+0x5dd/0x15d0
2012-01-25T16:38:49.917281-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-25T16:38:49.917293-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-01-25T16:38:49.917300-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-01-25T16:38:49.917307-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffffa020d546>] con_work+0x46/0x1b00 [libceph]
2012-01-25T16:38:49.917314-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-01-25T16:38:49.917325-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-01-25T16:38:49.917332-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-25T16:38:49.917339-08:00 sepia36 kernel: [ 2254.936000]                                          [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-25T16:38:49.917344-08:00 sepia36 kernel: [ 2254.936000]     RECLAIM_FS-ON-W at:
2012-01-25T16:38:49.917355-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff8109ee2d>] mark_held_locks+0x7d/0x120
2012-01-25T16:38:49.917362-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff8109ef71>] lockdep_trace_alloc+0xa1/0xe0
2012-01-25T16:38:49.917370-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff81168e4a>] kmem_cache_alloc+0x3a/0x130
2012-01-25T16:38:49.917377-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff814e8d6d>] sock_alloc_inode+0x1d/0xc0
2012-01-25T16:38:49.918116-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff81190c07>] alloc_inode+0x27/0xa0
2012-01-25T16:38:49.918125-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff81190c9b>] new_inode_pseudo+0x1b/0x70
2012-01-25T16:38:49.918161-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff814e9bce>] sock_alloc+0x1e/0x70
2012-01-25T16:38:49.918169-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff814ed376>] __sock_create+0x96/0x360
2012-01-25T16:38:49.918181-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff814ed664>] sock_create_kern+0x24/0x30
2012-01-25T16:38:49.918188-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffffa020d12b>] try_write+0xc4b/0x1020 [libceph]
2012-01-25T16:38:49.918218-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffffa020e110>] con_work+0xc10/0x1b00 [libceph]
2012-01-25T16:38:49.918227-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-01-25T16:38:49.918239-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-01-25T16:38:49.918246-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-25T16:38:49.918253-08:00 sepia36 kernel: [ 2254.936000]                                             [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-25T16:38:49.918273-08:00 sepia36 kernel: [ 2254.936000]     INITIAL USE at:
2012-01-25T16:38:49.918318-08:00 sepia36 kernel: [ 2254.936000]                                         [<ffffffff8109f730>] __lock_acquire+0x210/0x15d0
2012-01-25T16:38:49.918326-08:00 sepia36 kernel: [ 2254.936000]                                         [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-25T16:38:49.918333-08:00 sepia36 kernel: [ 2254.936000]                                         [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-01-25T16:38:49.918340-08:00 sepia36 kernel: [ 2254.936000]                                         [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-01-25T16:38:49.918352-08:00 sepia36 kernel: [ 2254.936000]                                         [<ffffffffa020d546>] con_work+0x46/0x1b00 [libceph]
2012-01-25T16:38:49.918395-08:00 sepia36 kernel: [ 2254.936000]                                         [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-01-25T16:38:49.918402-08:00 sepia36 kernel: [ 2254.936000]                                         [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-01-25T16:38:49.918408-08:00 sepia36 kernel: [ 2254.936000]                                         [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-25T16:38:49.918431-08:00 sepia36 kernel: [ 2254.936000]                                         [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-25T16:38:49.918436-08:00 sepia36 kernel: [ 2254.936000]   }
2012-01-25T16:38:49.918443-08:00 sepia36 kernel: [ 2254.936000]   ... key      at: [<ffffffffa022db00>] __key.62739+0x0/0xffffffffffff21f7 [libceph]
2012-01-25T16:38:49.918448-08:00 sepia36 kernel: [ 2254.936000]   ... acquired at:
2012-01-25T16:38:49.918458-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-25T16:38:49.918465-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-01-25T16:38:49.918471-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-01-25T16:38:49.918477-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffffa020b137>] ceph_con_send+0xa7/0x1c0 [libceph]
2012-01-25T16:38:49.918504-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffffa0212b2c>] __send_request+0xbc/0x100 [libceph]
2012-01-25T16:38:49.918511-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffffa0213f83>] ceph_osdc_start_request+0x113/0x190 [libceph]
2012-01-25T16:38:49.918517-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffffa024ad42>] rbd_do_request+0x232/0x3a0 [rbd]
2012-01-25T16:38:49.918523-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffffa024b011>] T.954+0xe1/0x1b0 [rbd]
2012-01-25T16:38:49.918529-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffffa024b17c>] rbd_read_header+0x9c/0x320 [rbd]
2012-01-25T16:38:49.918539-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffffa024cf20>] rbd_add+0x6d0/0xd30 [rbd]
2012-01-25T16:38:49.918573-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff813e0487>] bus_attr_store+0x27/0x30
2012-01-25T16:38:49.918580-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff811eb616>] sysfs_write_file+0xe6/0x170
2012-01-25T16:38:49.918585-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff811763f8>] vfs_write+0xc8/0x190
2012-01-25T16:38:49.918609-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff811765b1>] sys_write+0x51/0x90
2012-01-25T16:38:49.918616-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff8160e1c2>] system_call_fastpath+0x16/0x1b
2012-01-25T16:38:49.918622-08:00 sepia36 kernel: [ 2254.936000] 
2012-01-25T16:38:49.918659-08:00 sepia36 kernel: [ 2254.936000] -> (&osdc->request_mutex){+.+.-.} ops: 182310 {
2012-01-25T16:38:49.918664-08:00 sepia36 kernel: [ 2254.936000]    HARDIRQ-ON-W at:
2012-01-25T16:38:49.918675-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff8109facb>] __lock_acquire+0x5ab/0x15d0
2012-01-25T16:38:49.918682-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-25T16:38:49.918713-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-01-25T16:38:49.918720-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-01-25T16:38:49.918732-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffffa021461e>] kick_requests+0x3e/0x2b0 [libceph]
2012-01-25T16:38:49.919392-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffffa0214d98>] ceph_osdc_handle_map+0x508/0x5f0 [libceph]
2012-01-25T16:38:49.919400-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffffa0210cd7>] dispatch+0x117/0x6a0 [libceph]
2012-01-25T16:38:49.919407-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffffa020eade>] con_work+0x15de/0x1b00 [libceph]
2012-01-25T16:38:49.920437-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-01-25T16:38:49.920444-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-01-25T16:38:49.920450-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-25T16:38:49.920457-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-25T16:38:49.920510-08:00 sepia36 kernel: [ 2254.936000]    SOFTIRQ-ON-W at:
2012-01-25T16:38:49.920518-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff8109fafd>] __lock_acquire+0x5dd/0x15d0
2012-01-25T16:38:49.920524-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-25T16:38:49.920543-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-01-25T16:38:49.920555-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-01-25T16:38:49.920562-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffffa021461e>] kick_requests+0x3e/0x2b0 [libceph]
2012-01-25T16:38:49.920568-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffffa0214d98>] ceph_osdc_handle_map+0x508/0x5f0 [libceph]
2012-01-25T16:38:49.920588-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffffa0210cd7>] dispatch+0x117/0x6a0 [libceph]
2012-01-25T16:38:49.920603-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffffa020eade>] con_work+0x15de/0x1b00 [libceph]
2012-01-25T16:38:49.920609-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-01-25T16:38:49.920654-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-01-25T16:38:49.920661-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-25T16:38:49.920730-08:00 sepia36 kernel: [ 2254.936000]                                        [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-25T16:38:49.920736-08:00 sepia36 kernel: [ 2254.936000]    IN-RECLAIM_FS-W at:
2012-01-25T16:38:49.920766-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff8109f951>] __lock_acquire+0x431/0x15d0
2012-01-25T16:38:49.920773-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-25T16:38:49.920785-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-01-25T16:38:49.920791-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-01-25T16:38:49.920798-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffffa0213efb>] ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-01-25T16:38:49.920859-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffffa024ad42>] rbd_do_request+0x232/0x3a0 [rbd]
2012-01-25T16:38:49.920890-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffffa024b56f>] T.952+0x16f/0x1b0 [rbd]
2012-01-25T16:38:49.920898-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffffa024b9f8>] rbd_rq_fn+0x448/0x580 [rbd]
2012-01-25T16:38:49.920904-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff812ef49e>] __blk_run_queue+0x1e/0x20
2012-01-25T16:38:49.920940-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff812f288e>] queue_unplugged+0x4e/0xc0
2012-01-25T16:38:49.920948-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff812f2af5>] blk_flush_plug_list+0x1f5/0x280
2012-01-25T16:38:49.920955-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff812f2b98>] blk_finish_plug+0x18/0x50
2012-01-25T16:38:49.920961-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff8112f766>] shrink_zone+0x366/0x5d0
2012-01-25T16:38:49.921016-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff81130d85>] balance_pgdat+0x6c5/0x980
2012-01-25T16:38:49.921024-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff8113122d>] kswapd+0x1ed/0x400
2012-01-25T16:38:49.921042-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-25T16:38:49.921050-08:00 sepia36 kernel: [ 2254.936000]                                           [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-25T16:38:49.921061-08:00 sepia36 kernel: [ 2254.936000]    INITIAL USE at:
2012-01-25T16:38:49.921786-08:00 sepia36 kernel: [ 2254.936000]                                       [<ffffffff8109f730>] __lock_acquire+0x210/0x15d0
2012-01-25T16:38:49.921794-08:00 sepia36 kernel: [ 2254.936000]                                       [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-25T16:38:49.921800-08:00 sepia36 kernel: [ 2254.936000]                                       [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-01-25T16:38:49.921838-08:00 sepia36 kernel: [ 2254.936000]                                       [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-01-25T16:38:49.921845-08:00 sepia36 kernel: [ 2254.936000]                                       [<ffffffffa021461e>] kick_requests+0x3e/0x2b0 [libceph]
2012-01-25T16:38:49.921852-08:00 sepia36 kernel: [ 2254.936000]                                       [<ffffffffa0214d98>] ceph_osdc_handle_map+0x508/0x5f0 [libceph]
2012-01-25T16:38:49.921860-08:00 sepia36 kernel: [ 2254.936000]                                       [<ffffffffa0210cd7>] dispatch+0x117/0x6a0 [libceph]
2012-01-25T16:38:49.921896-08:00 sepia36 kernel: [ 2254.936000]                                       [<ffffffffa020eade>] con_work+0x15de/0x1b00 [libceph]
2012-01-25T16:38:49.921904-08:00 sepia36 kernel: [ 2254.936000]                                       [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-01-25T16:38:49.921910-08:00 sepia36 kernel: [ 2254.936000]                                       [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-01-25T16:38:49.921916-08:00 sepia36 kernel: [ 2254.936000]                                       [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-25T16:38:49.921944-08:00 sepia36 kernel: [ 2254.936000]                                       [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-25T16:38:49.921948-08:00 sepia36 kernel: [ 2254.936000]  }
2012-01-25T16:38:49.921955-08:00 sepia36 kernel: [ 2254.936000]  ... key      at: [<ffffffffa022e7a0>] __key.38096+0x0/0xffffffffffff1557 [libceph]
2012-01-25T16:38:49.921959-08:00 sepia36 kernel: [ 2254.936000]  ... acquired at:
2012-01-25T16:38:49.921965-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff8109dfe0>] check_usage_forwards+0x90/0x100
2012-01-25T16:38:49.921980-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff8109eb54>] mark_lock+0x1b4/0x410
2012-01-25T16:38:49.922002-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff8109f951>] __lock_acquire+0x431/0x15d0
2012-01-25T16:38:49.922007-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-25T16:38:49.922013-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-01-25T16:38:49.922024-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-01-25T16:38:49.922030-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffffa0213efb>] ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-01-25T16:38:49.922052-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffffa024ad42>] rbd_do_request+0x232/0x3a0 [rbd]
2012-01-25T16:38:49.922058-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffffa024b56f>] T.952+0x16f/0x1b0 [rbd]
2012-01-25T16:38:49.922078-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffffa024b9f8>] rbd_rq_fn+0x448/0x580 [rbd]
2012-01-25T16:38:49.922084-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff812ef49e>] __blk_run_queue+0x1e/0x20
2012-01-25T16:38:49.922090-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff812f288e>] queue_unplugged+0x4e/0xc0
2012-01-25T16:38:49.922162-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff812f2af5>] blk_flush_plug_list+0x1f5/0x280
2012-01-25T16:38:49.922168-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff812f2b98>] blk_finish_plug+0x18/0x50
2012-01-25T16:38:49.922180-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff8112f766>] shrink_zone+0x366/0x5d0
2012-01-25T16:38:49.922185-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff81130d85>] balance_pgdat+0x6c5/0x980
2012-01-25T16:38:49.922191-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff8113122d>] kswapd+0x1ed/0x400
2012-01-25T16:38:49.922196-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-25T16:38:49.922207-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-25T16:38:49.922078-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffffa024b9f8>] rbd_rq_fn+0x448/0x580 [rbd]
2012-01-25T16:38:49.922084-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff812ef49e>] __blk_run_queue+0x1e/0x20
2012-01-25T16:38:49.922090-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff812f288e>] queue_unplugged+0x4e/0xc0
2012-01-25T16:38:49.922162-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff812f2af5>] blk_flush_plug_list+0x1f5/0x280
2012-01-25T16:38:49.922168-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff812f2b98>] blk_finish_plug+0x18/0x50
2012-01-25T16:38:49.922180-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff8112f766>] shrink_zone+0x366/0x5d0
2012-01-25T16:38:49.922185-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff81130d85>] balance_pgdat+0x6c5/0x980
2012-01-25T16:38:49.922191-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff8113122d>] kswapd+0x1ed/0x400
2012-01-25T16:38:49.922196-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-25T16:38:49.922207-08:00 sepia36 kernel: [ 2254.936000]    [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-25T16:38:49.922212-08:00 sepia36 kernel: [ 2254.936000] 
2012-01-25T16:38:49.922216-08:00 sepia36 kernel: [ 2254.936000] 
2012-01-25T16:38:49.922249-08:00 sepia36 kernel: [ 2254.936000] stack backtrace:
2012-01-25T16:38:49.922254-08:00 sepia36 kernel: [ 2254.936000] Pid: 41, comm: kswapd0 Not tainted 3.2.0-ceph-00037-gc68bdb1 #1
2012-01-25T16:38:49.922265-08:00 sepia36 kernel: [ 2254.936000] Call Trace:
2012-01-25T16:38:49.922271-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8109df03>] print_irq_inversion_bug+0x1d3/0x220
2012-01-25T16:38:49.922277-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8109df50>] ? print_irq_inversion_bug+0x220/0x220
2012-01-25T16:38:49.922309-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8109dfe0>] check_usage_forwards+0x90/0x100
2012-01-25T16:38:49.922321-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff810a84f2>] ? __module_text_address+0x12/0x60
2012-01-25T16:38:49.922327-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff81605f70>] ? _raw_spin_unlock_irqrestore+0x40/0x70
2012-01-25T16:38:49.922333-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8109eb54>] mark_lock+0x1b4/0x410
2012-01-25T16:38:49.922339-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8109f951>] __lock_acquire+0x431/0x15d0
2012-01-25T16:38:49.922439-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8109ee2d>] ? mark_held_locks+0x7d/0x120
2012-01-25T16:38:49.922451-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff811665f5>] ? __slab_alloc+0x165/0x500
2012-01-25T16:38:49.922458-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8109f1a5>] ? trace_hardirqs_on_caller+0x105/0x190
2012-01-25T16:38:49.922464-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffffa0213efb>] ? ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-01-25T16:38:49.922499-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-25T16:38:49.922509-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffffa0213efb>] ? ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-01-25T16:38:49.922516-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-01-25T16:38:49.922522-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffffa0213efb>] ? ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-01-25T16:38:49.922552-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8116746c>] ? __kmalloc+0x8c/0x160
2012-01-25T16:38:49.923010-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffffa0213efb>] ? ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-01-25T16:38:49.923067-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffffa020a81f>] ? ceph_msg_new+0x19f/0x2c0 [libceph]
2012-01-25T16:38:49.923073-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-01-25T16:38:49.923103-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffffa0213efb>] ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-01-25T16:38:49.923110-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffffa024ad42>] rbd_do_request+0x232/0x3a0 [rbd]
2012-01-25T16:38:49.923121-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffffa024b56f>] T.952+0x16f/0x1b0 [rbd]
2012-01-25T16:38:49.923127-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffffa024a790>] ? rbd_coll_end_req_index+0x170/0x170 [rbd]
2012-01-25T16:38:49.923133-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff81605f20>] ? _raw_spin_unlock_irq+0x30/0x40
2012-01-25T16:38:49.923138-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffffa024b9f8>] rbd_rq_fn+0x448/0x580 [rbd]
2012-01-25T16:38:49.923176-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8130fcdd>] ? rb_insert_color+0x9d/0x160
2012-01-25T16:38:49.923182-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff812ef49e>] __blk_run_queue+0x1e/0x20
2012-01-25T16:38:49.923188-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff812f288e>] queue_unplugged+0x4e/0xc0
2012-01-25T16:38:49.923194-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff812f2af5>] blk_flush_plug_list+0x1f5/0x280
2012-01-25T16:38:49.923204-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff812f2b98>] blk_finish_plug+0x18/0x50
2012-01-25T16:38:49.923210-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8112f766>] shrink_zone+0x366/0x5d0
2012-01-25T16:38:49.923216-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8112104d>] ? zone_watermark_ok_safe+0xad/0xc0
2012-01-25T16:38:49.923221-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff81130d85>] balance_pgdat+0x6c5/0x980
2012-01-25T16:38:49.923255-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff81087801>] ? abort_exclusive_wait+0x41/0xb0
2012-01-25T16:38:49.923266-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8109f23d>] ? trace_hardirqs_on+0xd/0x10
2012-01-25T16:38:49.923272-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8113122d>] kswapd+0x1ed/0x400
2012-01-25T16:38:49.923301-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff81602c05>] ? __schedule+0x415/0x940
2012-01-25T16:38:49.923307-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff81603017>] ? __schedule+0x827/0x940
2012-01-25T16:38:49.923319-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff81087740>] ? wake_up_bit+0x40/0x40
2012-01-25T16:38:49.923326-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff81131040>] ? balance_pgdat+0x980/0x980
2012-01-25T16:38:49.923332-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-25T16:38:49.923363-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff8109f1a5>] ? trace_hardirqs_on_caller+0x105/0x190
2012-01-25T16:38:49.923369-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-25T16:38:49.923380-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff816062b4>] ? retint_restore_args+0x13/0x13
2012-01-25T16:38:49.923386-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff81087130>] ? __init_kthread_worker+0x70/0x70
2012-01-25T16:38:49.923392-08:00 sepia36 kernel: [ 2254.936000]  [<ffffffff816103c0>] ? gs_change+0x13/0x13
Actions #10

Updated by Alex Elder about 12 years ago

  • Assignee set to Alex Elder

I'll take this. I may have reordered something in my recent commits
to cause this to surface.

Actions #11

Updated by Sage Weil about 12 years ago

Saw this again. It's been a while..


2012-02-01T03:48:41.112987-08:00 sepia36 kernel: [11530.063225] =========================================================
2012-02-01T03:48:41.112997-08:00 sepia36 kernel: [11530.069679] [ INFO: possible irq lock inversion dependency detected ]
2012-02-01T03:48:41.113029-08:00 sepia36 kernel: [11530.070013] 3.2.0-ceph-00132-gdbdc61e #1
2012-02-01T03:48:41.113041-08:00 sepia36 kernel: [11530.070013] ---------------------------------------------------------
2012-02-01T03:48:41.113047-08:00 sepia36 kernel: [11530.070013] kswapd0/41 just changed the state of lock:
2012-02-01T03:48:41.113056-08:00 sepia36 kernel: [11530.070013]  (&osdc->request_mutex){+.+.-.}, at: [<ffffffffa0366efb>] ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-02-01T03:48:41.113102-08:00 sepia36 kernel: [11530.070013] but this lock took another, RECLAIM_FS-unsafe lock in the past:
2012-02-01T03:48:41.113122-08:00 sepia36 kernel: [11530.070013]  (&con->mutex){+.+.+.}
2012-02-01T03:48:41.113127-08:00 sepia36 kernel: [11530.070013] 
2012-02-01T03:48:41.113132-08:00 sepia36 kernel: [11530.070013] and interrupts could create inverse lock ordering between them.
2012-02-01T03:48:41.113137-08:00 sepia36 kernel: [11530.070013] 
2012-02-01T03:48:41.113147-08:00 sepia36 kernel: [11530.070013] 
2012-02-01T03:48:41.113153-08:00 sepia36 kernel: [11530.070013] other info that might help us debug this:
2012-02-01T03:48:41.113159-08:00 sepia36 kernel: [11530.070013]  Possible interrupt unsafe locking scenario:
2012-02-01T03:48:41.113163-08:00 sepia36 kernel: [11530.070013] 
2012-02-01T03:48:41.113168-08:00 sepia36 kernel: [11530.070013]        CPU0                    CPU1
2012-02-01T03:48:41.113178-08:00 sepia36 kernel: [11530.070013]        ----                    ----
2012-02-01T03:48:41.113184-08:00 sepia36 kernel: [11530.070013]   lock(&con->mutex);
2012-02-01T03:48:41.113189-08:00 sepia36 kernel: [11530.070013]                                local_irq_disable();
2012-02-01T03:48:41.113196-08:00 sepia36 kernel: [11530.070013]                                lock(&osdc->request_mutex);
2012-02-01T03:48:41.113201-08:00 sepia36 kernel: [11530.070013]                                lock(&con->mutex);
2012-02-01T03:48:41.113239-08:00 sepia36 kernel: [11530.070013]   <Interrupt>
2012-02-01T03:48:41.113245-08:00 sepia36 kernel: [11530.070013]     lock(&osdc->request_mutex);
2012-02-01T03:48:41.113250-08:00 sepia36 kernel: [11530.070013] 
2012-02-01T03:48:41.113255-08:00 sepia36 kernel: [11530.070013]  *** DEADLOCK ***
2012-02-01T03:48:41.113263-08:00 sepia36 kernel: [11530.070013] 
2012-02-01T03:48:41.113269-08:00 sepia36 kernel: [11530.070013] no locks held by kswapd0/41.
2012-02-01T03:48:41.113274-08:00 sepia36 kernel: [11530.070013] 
2012-02-01T03:48:41.113309-08:00 sepia36 kernel: [11530.070013] the shortest dependencies between 2nd lock and 1st lock:
2012-02-01T03:48:41.113314-08:00 sepia36 kernel: [11530.070013]  -> (&con->mutex){+.+.+.} ops: 674319 {
2012-02-01T03:48:41.113337-08:00 sepia36 kernel: [11530.070013]     HARDIRQ-ON-W at:
2012-02-01T03:48:41.113345-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff8109facb>] __lock_acquire+0x5ab/0x15d0
2012-02-01T03:48:41.113352-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-02-01T03:48:41.113359-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-02-01T03:48:41.113371-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-02-01T03:48:41.113378-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffffa0360546>] con_work+0x46/0x1b00 [libceph]
2012-02-01T03:48:41.113385-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-02-01T03:48:41.113392-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-02-01T03:48:41.113403-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-02-01T03:48:41.113410-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-02-01T03:48:41.113514-08:00 sepia36 kernel: [11530.070013]     SOFTIRQ-ON-W at:
2012-02-01T03:48:41.113523-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff8109fafd>] __lock_acquire+0x5dd/0x15d0
2012-02-01T03:48:41.113530-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-02-01T03:48:41.113557-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-02-01T03:48:41.113565-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-02-01T03:48:41.113575-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffffa0360546>] con_work+0x46/0x1b00 [libceph]
2012-02-01T03:48:41.113584-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-02-01T03:48:41.113593-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-02-01T03:48:41.113607-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-02-01T03:48:41.113614-08:00 sepia36 kernel: [11530.070013]                                          [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-02-01T03:48:41.113619-08:00 sepia36 kernel: [11530.070013]     RECLAIM_FS-ON-W at:
2012-02-01T03:48:41.113626-08:00 sepia36 kernel: [11530.070013]                                             [<ffffffff8109ee2d>] mark_held_locks+0x7d/0x120
2012-02-01T03:48:41.113634-08:00 sepia36 kernel: [11530.070013]                                             [<ffffffff8109ef71>] lockdep_trace_alloc+0xa1/0xe0
2012-02-01T03:48:41.113676-08:00 sepia36 kernel: [11530.070013]                                             [<ffffffff81168e4a>] kmem_cache_alloc+0x3a/0x130
2012-02-01T03:48:41.113683-08:00 sepia36 kernel: [11530.070013]                                             [<ffffffff814e8d6d>] sock_alloc_inode+0x1d/0xc0
2012-02-01T03:48:41.114333-08:00 sepia36 kernel: [11530.070013]                                             [<ffffffff81190c07>] alloc_inode+0x27/0xa0
2012-02-01T03:48:41.114349-08:00 sepia36 kernel: [11530.070013]                                             [<ffffffff81190c9b>] new_inode_pseudo+0x1b/0x70
2012-02-01T03:48:41.114356-08:00 sepia36 kernel: [11530.070013]                                             [<ffffffff814e9bce>] sock_alloc+0x1e/0x70
2012-02-01T03:48:41.114380-08:00 sepia36 kernel: [11530.070013]                                             [<ffffffff814ed376>] __sock_create+0x96/0x360
2012-02-01T03:48:41.114388-08:00 sepia36 kernel: [11530.070013]                                             [<ffffffff814ed664>] sock_create_kern+0x24/0x30
2012-02-01T03:48:41.114396-08:00 sepia36 kernel: [11530.070013]                                             [<ffffffffa036012b>] try_write+0xc4b/0x1020 [libceph]
2012-02-01T03:48:41.114404-08:00 sepia36 kernel: [11530.070013]                                             [<ffffffffa0361110>] con_work+0xc10/0x1b00 [libceph]
2012-02-01T03:48:41.114412-08:00 sepia36 kernel: [11530.070013]                                             [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-02-01T03:48:41.114420-08:00 sepia36 kernel: [11530.485766]                                             [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-02-01T03:48:41.114456-08:00 sepia36 kernel: [11530.485766]                                             [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-02-01T03:48:41.114463-08:00 sepia36 kernel: [11530.485766]                                             [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-02-01T03:48:41.114469-08:00 sepia36 kernel: [11530.485766]     INITIAL USE at:
2012-02-01T03:48:41.114477-08:00 sepia36 kernel: [11530.485766]                                         [<ffffffff8109f730>] __lock_acquire+0x210/0x15d0
2012-02-01T03:48:41.114498-08:00 sepia36 kernel: [11530.485766]                                         [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-02-01T03:48:41.114506-08:00 sepia36 kernel: [11530.485766]                                         [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-02-01T03:48:41.114513-08:00 sepia36 kernel: [11530.485766]                                         [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-02-01T03:48:41.114525-08:00 sepia36 kernel: [11530.485766]                                         [<ffffffffa0360546>] con_work+0x46/0x1b00 [libceph]
2012-02-01T03:48:41.114532-08:00 sepia36 kernel: [11530.485766]                                         [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-02-01T03:48:41.114539-08:00 sepia36 kernel: [11530.485766]                                         [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-02-01T03:48:41.114578-08:00 sepia36 kernel: [11530.485766]                                         [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-02-01T03:48:41.114594-08:00 sepia36 kernel: [11530.485766]                                         [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-02-01T03:48:41.114601-08:00 sepia36 kernel: [11530.485766]   }
2012-02-01T03:48:41.114623-08:00 sepia36 kernel: [11530.485766]   ... key      at: [<ffffffffa0380b00>] __key.62739+0x0/0xffffffffffff21f7 [libceph]
2012-02-01T03:48:41.114629-08:00 sepia36 kernel: [11530.485766]   ... acquired at:
2012-02-01T03:48:41.114635-08:00 sepia36 kernel: [11530.485766]    [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-02-01T03:48:41.114646-08:00 sepia36 kernel: [11530.485766]    [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-02-01T03:48:41.114653-08:00 sepia36 kernel: [11530.485766]    [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-02-01T03:48:41.114659-08:00 sepia36 kernel: [11530.485766]    [<ffffffffa035e137>] ceph_con_send+0xa7/0x1c0 [libceph]
2012-02-01T03:48:41.114666-08:00 sepia36 kernel: [11530.485766]    [<ffffffffa0365b2c>] __send_request+0xbc/0x100 [libceph]
2012-02-01T03:48:41.114687-08:00 sepia36 kernel: [11530.485766]    [<ffffffffa0366f83>] ceph_osdc_start_request+0x113/0x190 [libceph]
2012-02-01T03:48:41.114699-08:00 sepia36 kernel: [11530.485766]    [<ffffffffa039a3c3>] ceph_writepages_start+0x693/0xf40 [ceph]
2012-02-01T03:48:41.114706-08:00 sepia36 kernel: [11530.485766]    [<ffffffff81126634>] do_writepages+0x24/0x40
2012-02-01T03:48:41.114712-08:00 sepia36 kernel: [11530.485766]    [<ffffffff8119f29a>] writeback_single_inode+0x19a/0x420
2012-02-01T03:48:41.114718-08:00 sepia36 kernel: [11530.485766]    [<ffffffff8119f92e>] writeback_sb_inodes+0x1ae/0x280
2012-02-01T03:48:41.114724-08:00 sepia36 kernel: [11530.485766]    [<ffffffff8119fa9e>] __writeback_inodes_wb+0x9e/0xd0
2012-02-01T03:48:41.114770-08:00 sepia36 kernel: [11530.485766]    [<ffffffff8119fdd3>] wb_writeback+0x1f3/0x2d0
2012-02-01T03:48:41.114776-08:00 sepia36 kernel: [11530.485766]    [<ffffffff8119ff4f>] wb_do_writeback+0x9f/0x250
2012-02-01T03:48:41.114782-08:00 sepia36 kernel: [11530.485766]    [<ffffffff811a01b2>] bdi_writeback_thread+0xb2/0x270
2012-02-01T03:48:41.114788-08:00 sepia36 kernel: [11530.485766]    [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-02-01T03:48:41.114813-08:00 sepia36 kernel: [11530.485766]    [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-02-01T03:48:41.114818-08:00 sepia36 kernel: [11530.485766] 
2012-02-01T03:48:41.114824-08:00 sepia36 kernel: [11530.485766] -> (&osdc->request_mutex){+.+.-.} ops: 276526 {
2012-02-01T03:48:41.114829-08:00 sepia36 kernel: [11530.485766]    HARDIRQ-ON-W at:
2012-02-01T03:48:41.114835-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff8109facb>] __lock_acquire+0x5ab/0x15d0
2012-02-01T03:48:41.114846-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-02-01T03:48:41.114887-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-02-01T03:48:41.114894-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-02-01T03:48:41.115492-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffffa036761e>] kick_requests+0x3e/0x2b0 [libceph]
2012-02-01T03:48:41.115506-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffffa0367d98>] ceph_osdc_handle_map+0x508/0x5f0 [libceph]
2012-02-01T03:48:41.115514-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffffa0363cd7>] dispatch+0x117/0x6a0 [libceph]
2012-02-01T03:48:41.115521-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffffa0361ade>] con_work+0x15de/0x1b00 [libceph]
2012-02-01T03:48:41.115563-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-02-01T03:48:41.115580-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-02-01T03:48:41.115587-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-02-01T03:48:41.115611-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-02-01T03:48:41.115617-08:00 sepia36 kernel: [11530.485766]    SOFTIRQ-ON-W at:
2012-02-01T03:48:41.115628-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff8109fafd>] __lock_acquire+0x5dd/0x15d0
2012-02-01T03:48:41.115635-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-02-01T03:48:41.115642-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-02-01T03:48:41.115662-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-02-01T03:48:41.115675-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffffa036761e>] kick_requests+0x3e/0x2b0 [libceph]
2012-02-01T03:48:41.115682-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffffa0367d98>] ceph_osdc_handle_map+0x508/0x5f0 [libceph]
2012-02-01T03:48:41.115688-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffffa0363cd7>] dispatch+0x117/0x6a0 [libceph]
2012-02-01T03:48:41.115695-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffffa0361ade>] con_work+0x15de/0x1b00 [libceph]
2012-02-01T03:48:41.115722-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-02-01T03:48:41.115729-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-02-01T03:48:41.115736-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-02-01T03:48:41.115742-08:00 sepia36 kernel: [11530.485766]                                        [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-02-01T03:48:41.115752-08:00 sepia36 kernel: [11530.485766]    IN-RECLAIM_FS-W at:
2012-02-01T03:48:41.115759-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff8109f951>] __lock_acquire+0x431/0x15d0
2012-02-01T03:48:41.115766-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-02-01T03:48:41.115812-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-02-01T03:48:41.115825-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-02-01T03:48:41.115833-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffffa0366efb>] ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-02-01T03:48:41.115858-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffffa0109332>] rbd_do_request+0x232/0x3a0 [rbd]
2012-02-01T03:48:41.115870-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffffa010960f>] T.952+0x16f/0x1b0 [rbd]
2012-02-01T03:48:41.115877-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffffa0109a98>] rbd_rq_fn+0x448/0x580 [rbd]
2012-02-01T03:48:41.115884-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff812ef49e>] __blk_run_queue+0x1e/0x20
2012-02-01T03:48:41.115904-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff812f288e>] queue_unplugged+0x4e/0xc0
2012-02-01T03:48:41.115920-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff812f2af5>] blk_flush_plug_list+0x1f5/0x280
2012-02-01T03:48:41.115928-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff812f2b98>] blk_finish_plug+0x18/0x50
2012-02-01T03:48:41.115935-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff8112f766>] shrink_zone+0x366/0x5d0
2012-02-01T03:48:41.115942-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff81130d85>] balance_pgdat+0x6c5/0x980
2012-02-01T03:48:41.115952-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff8113122d>] kswapd+0x1ed/0x400
2012-02-01T03:48:41.115978-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-02-01T03:48:41.115986-08:00 sepia36 kernel: [11530.485766]                                           [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-02-01T03:48:41.116600-08:00 sepia36 kernel: [11530.485766]    INITIAL USE at:
2012-02-01T03:48:41.116613-08:00 sepia36 kernel: [11530.485766]                                       [<ffffffff8109f730>] __lock_acquire+0x210/0x15d0
2012-02-01T03:48:41.116621-08:00 sepia36 kernel: [11530.485766]                                       [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-02-01T03:48:41.116628-08:00 sepia36 kernel: [11530.485766]                                       [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-02-01T03:48:41.116650-08:00 sepia36 kernel: [11530.485766]                                       [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-02-01T03:48:41.116661-08:00 sepia36 kernel: [11530.485766]                                       [<ffffffffa036761e>] kick_requests+0x3e/0x2b0 [libceph]
2012-02-01T03:48:41.116669-08:00 sepia36 kernel: [11530.485766]                                       [<ffffffffa0367d98>] ceph_osdc_handle_map+0x508/0x5f0 [libceph]
2012-02-01T03:48:41.116676-08:00 sepia36 kernel: [11530.485766]                                       [<ffffffffa0363cd7>] dispatch+0x117/0x6a0 [libceph]
2012-02-01T03:48:41.116682-08:00 sepia36 kernel: [11530.485766]                                       [<ffffffffa0361ade>] con_work+0x15de/0x1b00 [libceph]
2012-02-01T03:48:41.116693-08:00 sepia36 kernel: [11530.485766]                                       [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-02-01T03:48:41.116715-08:00 sepia36 kernel: [11530.485766]                                       [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-02-01T03:48:41.116722-08:00 sepia36 kernel: [11530.485766]                                       [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-02-01T03:48:41.116729-08:00 sepia36 kernel: [11530.485766]                                       [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-02-01T03:48:41.116738-08:00 sepia36 kernel: [11530.485766]  }
2012-02-01T03:48:41.116746-08:00 sepia36 kernel: [11530.485766]  ... key      at: [<ffffffffa03817a0>] __key.38096+0x0/0xffffffffffff1557 [libceph]
2012-02-01T03:48:41.116751-08:00 sepia36 kernel: [11530.485766]  ... acquired at:
2012-02-01T03:48:41.116770-08:00 sepia36 kernel: [11530.485766]    [<ffffffff8109dfe0>] check_usage_forwards+0x90/0x100
2012-02-01T03:48:41.116776-08:00 sepia36 kernel: [11530.485766]    [<ffffffff8109eb54>] mark_lock+0x1b4/0x410
2012-02-01T03:48:41.116787-08:00 sepia36 kernel: [11530.485766]    [<ffffffff8109f951>] __lock_acquire+0x431/0x15d0
2012-02-01T03:48:41.116793-08:00 sepia36 kernel: [11530.485766]    [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-02-01T03:48:41.116800-08:00 sepia36 kernel: [11530.485766]    [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-02-01T03:48:41.116806-08:00 sepia36 kernel: [11530.485766]    [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-02-01T03:48:41.116850-08:00 sepia36 kernel: [11530.485766]    [<ffffffffa0366efb>] ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-02-01T03:48:41.116858-08:00 sepia36 kernel: [11530.485766]    [<ffffffffa0109332>] rbd_do_request+0x232/0x3a0 [rbd]
2012-02-01T03:48:41.116864-08:00 sepia36 kernel: [11530.485766]    [<ffffffffa010960f>] T.952+0x16f/0x1b0 [rbd]
2012-02-01T03:48:41.116870-08:00 sepia36 kernel: [11530.485766]    [<ffffffffa0109a98>] rbd_rq_fn+0x448/0x580 [rbd]
2012-02-01T03:48:41.116893-08:00 sepia36 kernel: [11530.485766]    [<ffffffff812ef49e>] __blk_run_queue+0x1e/0x20
2012-02-01T03:48:41.116901-08:00 sepia36 kernel: [11530.485766]    [<ffffffff812f288e>] queue_unplugged+0x4e/0xc0
2012-02-01T03:48:41.116907-08:00 sepia36 kernel: [11530.485766]    [<ffffffff812f2af5>] blk_flush_plug_list+0x1f5/0x280
2012-02-01T03:48:41.116913-08:00 sepia36 kernel: [11530.485766]    [<ffffffff812f2b98>] blk_finish_plug+0x18/0x50
2012-02-01T03:48:41.116918-08:00 sepia36 kernel: [11530.485766]    [<ffffffff8112f766>] shrink_zone+0x366/0x5d0
2012-02-01T03:48:41.116949-08:00 sepia36 kernel: [11530.485766]    [<ffffffff81130d85>] balance_pgdat+0x6c5/0x980
2012-02-01T03:48:41.116957-08:00 sepia36 kernel: [11530.485766]    [<ffffffff8113122d>] kswapd+0x1ed/0x400
2012-02-01T03:48:41.116963-08:00 sepia36 kernel: [11530.485766]    [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-02-01T03:48:41.116969-08:00 sepia36 kernel: [11530.485766]    [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-02-01T03:48:41.116973-08:00 sepia36 kernel: [11530.485766] 
2012-02-01T03:48:41.116982-08:00 sepia36 kernel: [11530.485766] 
2012-02-01T03:48:41.116987-08:00 sepia36 kernel: [11530.485766] stack backtrace:
2012-02-01T03:48:41.116993-08:00 sepia36 kernel: [11530.485766] Pid: 41, comm: kswapd0 Not tainted 3.2.0-ceph-00132-gdbdc61e #1
2012-02-01T03:48:41.117011-08:00 sepia36 kernel: [11530.485766] Call Trace:
2012-02-01T03:48:41.117023-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8109df03>] print_irq_inversion_bug+0x1d3/0x220
2012-02-01T03:48:41.117029-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8109df50>] ? print_irq_inversion_bug+0x220/0x220
2012-02-01T03:48:41.117036-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8109dfe0>] check_usage_forwards+0x90/0x100
2012-02-01T03:48:41.117042-08:00 sepia36 kernel: [11530.485766]  [<ffffffff810a84f2>] ? __module_text_address+0x12/0x60
2012-02-01T03:48:41.117047-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8109eb54>] mark_lock+0x1b4/0x410
2012-02-01T03:48:41.117058-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8109f951>] __lock_acquire+0x431/0x15d0
2012-02-01T03:48:41.117075-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8109ee2d>] ? mark_held_locks+0x7d/0x120
2012-02-01T03:48:41.117083-08:00 sepia36 kernel: [11530.485766]  [<ffffffff811665f5>] ? __slab_alloc+0x165/0x500
2012-02-01T03:48:41.117089-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8109f1a5>] ? trace_hardirqs_on_caller+0x105/0x190
2012-02-01T03:48:41.117100-08:00 sepia36 kernel: [11530.485766]  [<ffffffffa0366efb>] ? ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-02-01T03:48:41.117106-08:00 sepia36 kernel: [11530.485766]  [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-02-01T03:48:41.117113-08:00 sepia36 kernel: [11530.485766]  [<ffffffffa0366efb>] ? ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-02-01T03:48:41.117119-08:00 sepia36 kernel: [11530.485766]  [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-02-01T03:48:41.117143-08:00 sepia36 kernel: [11530.485766]  [<ffffffffa0366efb>] ? ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-02-01T03:48:41.117150-08:00 sepia36 kernel: [11530.485766]  [<ffffffff811674d1>] ? __kmalloc+0xf1/0x160
2012-02-01T03:48:41.117397-08:00 sepia36 kernel: [11530.485766]  [<ffffffffa0366efb>] ? ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-02-01T03:48:41.117404-08:00 sepia36 kernel: [11530.485766]  [<ffffffffa035d81f>] ? ceph_msg_new+0x19f/0x2c0 [libceph]
2012-02-01T03:48:41.117410-08:00 sepia36 kernel: [11530.485766]  [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-02-01T03:48:41.117421-08:00 sepia36 kernel: [11530.485766]  [<ffffffffa0366efb>] ceph_osdc_start_request+0x8b/0x190 [libceph]
2012-02-01T03:48:41.117427-08:00 sepia36 kernel: [11530.485766]  [<ffffffffa0109332>] rbd_do_request+0x232/0x3a0 [rbd]
2012-02-01T03:48:41.117433-08:00 sepia36 kernel: [11530.485766]  [<ffffffffa010960f>] T.952+0x16f/0x1b0 [rbd]
2012-02-01T03:48:41.117450-08:00 sepia36 kernel: [11530.485766]  [<ffffffffa0108890>] ? rbd_coll_end_req_index+0x160/0x160 [rbd]
2012-02-01T03:48:41.117461-08:00 sepia36 kernel: [11530.485766]  [<ffffffff81605f20>] ? _raw_spin_unlock_irq+0x30/0x40
2012-02-01T03:48:41.117467-08:00 sepia36 kernel: [11530.485766]  [<ffffffffa0109a98>] rbd_rq_fn+0x448/0x580 [rbd]
2012-02-01T03:48:41.117473-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8130fd65>] ? rb_insert_color+0x125/0x160
2012-02-01T03:48:41.117479-08:00 sepia36 kernel: [11530.485766]  [<ffffffff812ef49e>] __blk_run_queue+0x1e/0x20
2012-02-01T03:48:41.117500-08:00 sepia36 kernel: [11530.485766]  [<ffffffff812f288e>] queue_unplugged+0x4e/0xc0
2012-02-01T03:48:41.117507-08:00 sepia36 kernel: [11530.485766]  [<ffffffff812f2af5>] blk_flush_plug_list+0x1f5/0x280
2012-02-01T03:48:41.117513-08:00 sepia36 kernel: [11530.485766]  [<ffffffff812f2b98>] blk_finish_plug+0x18/0x50
2012-02-01T03:48:41.117518-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8112f766>] shrink_zone+0x366/0x5d0
2012-02-01T03:48:41.117524-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8112104d>] ? zone_watermark_ok_safe+0xad/0xc0
2012-02-01T03:48:41.117535-08:00 sepia36 kernel: [11530.485766]  [<ffffffff81130d85>] balance_pgdat+0x6c5/0x980
2012-02-01T03:48:41.117541-08:00 sepia36 kernel: [11530.485766]  [<ffffffff81087801>] ? abort_exclusive_wait+0x41/0xb0
2012-02-01T03:48:41.117547-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8109f23d>] ? trace_hardirqs_on+0xd/0x10
2012-02-01T03:48:41.117553-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8113122d>] kswapd+0x1ed/0x400
2012-02-01T03:48:41.117563-08:00 sepia36 kernel: [11530.485766]  [<ffffffff81602c05>] ? __schedule+0x415/0x940
2012-02-01T03:48:41.117570-08:00 sepia36 kernel: [11530.485766]  [<ffffffff81603017>] ? __schedule+0x827/0x940
2012-02-01T03:48:41.117575-08:00 sepia36 kernel: [11530.485766]  [<ffffffff81087740>] ? wake_up_bit+0x40/0x40
2012-02-01T03:48:41.117581-08:00 sepia36 kernel: [11530.485766]  [<ffffffff81131040>] ? balance_pgdat+0x980/0x980
2012-02-01T03:48:41.117587-08:00 sepia36 kernel: [11530.485766]  [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-02-01T03:48:41.117598-08:00 sepia36 kernel: [11530.485766]  [<ffffffff8109f1a5>] ? trace_hardirqs_on_caller+0x105/0x190
2012-02-01T03:48:41.117604-08:00 sepia36 kernel: [11530.485766]  [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-02-01T03:48:41.117610-08:00 sepia36 kernel: [11530.485766]  [<ffffffff816062b4>] ? retint_restore_args+0x13/0x13
2012-02-01T03:48:41.117615-08:00 sepia36 kernel: [11530.485766]  [<ffffffff81087130>] ? __init_kthread_worker+0x70/0x70
2012-02-01T03:48:41.117626-08:00 sepia36 kernel: [11530.485766]  [<ffffffff816103c0>] ? gs_change+0x13/0x13

Actions #12

Updated by Sage Weil almost 12 years ago

ubuntu@teuthology:/a/nightly_coverage_2012-06-13-a/7579
ubuntu@teuthology:/a/nightly_coverage_2012-06-13-a/7587

- chef: null
- ceph:
    log-whitelist:
    - wrongly marked me down or wrong addr
    - objects unfound and apparently lost
- thrashosds: null
- kclient: null
- workunit:
    clients:
      all:
      - suites/ffsb.sh
Actions #13

Updated by Alex Elder almost 12 years ago

I suppose this really ought to get fixed at some point.
For now, it looks like Sage has implemented a workaround
that makes this no longer appear to be a problem.

Actions #14

Updated by Ian Colle about 11 years ago

  • Assignee deleted (Alex Elder)
Actions #15

Updated by Alex Elder almost 11 years ago

It's possible this problem is due to rbd possibly starting
requests while in irq context. It may be further complicated
by the attempt to connect the socket while in try_write()
(during which time con->mutex is held).

I.e., perhaps allocating a socket requires taking a lock in
while allocating a file-like resource or something. But meanwhile
the rbd request function might get entered while holding a
related lock (held by whoever is submitting I/O).

I don't know. I had a theory on this but as I tried to
explain it the theory kind of fell apart...

Actions #16

Updated by Sage Weil almost 11 years ago

iirc the fundamental problem is that we create sockets, which allocate an inode via create_inode(), and we can't pass down GFP_NOFS.

Actions #17

Updated by Ilya Dryomov about 5 years ago

  • Category set to libceph
  • Status changed from New to Resolved
  • Assignee set to Ilya Dryomov

Sage Weil wrote:

iirc the fundamental problem is that we create sockets, which allocate an inode via create_inode(), and we can't pass down GFP_NOFS.

Fixed with https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=633ee407b9d15a75ac9740ba9d3338815e1fcb95.

Also, osdc->request_mutex was removed as part of OSD client rewrite in 4.7.

Actions

Also available in: Atom PDF