Project

General

Profile

Actions

Bug #1940

closed

locking cycle in ceph_osdc_start_request

Added by Sage Weil over 12 years ago. Updated about 12 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
fs/ceph
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description


2012-01-13T21:41:32.054658-08:00 sepia18 kernel: [ 7624.935762] ======================================================
2012-01-13T21:41:32.054666-08:00 sepia18 kernel: [ 7624.942024] [ INFO: possible circular locking dependency detected ]
2012-01-13T21:41:32.054669-08:00 sepia18 kernel: [ 7624.943851] 3.2.0-ceph-00014-g28fe722 #1
2012-01-13T21:41:32.054672-08:00 sepia18 kernel: [ 7624.943851] -------------------------------------------------------
2012-01-13T21:41:32.054676-08:00 sepia18 kernel: [ 7624.943851] fixdep/10660 is trying to acquire lock:
2012-01-13T21:41:32.054681-08:00 sepia18 kernel: [ 7624.943851]  (&osdc->map_sem){++++..}, at: [<ffffffffa02bce56>] ceph_osdc_start_request+0xa6/0x190 [libceph]
2012-01-13T21:41:32.054684-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.054687-08:00 sepia18 kernel: [ 7624.943851] but task is already holding lock:
2012-01-13T21:41:32.054691-08:00 sepia18 kernel: [ 7624.943851]  (&mm->mmap_sem){++++++}, at: [<ffffffff81609a3e>] do_page_fault+0xfe/0x4b0
2012-01-13T21:41:32.054695-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.054698-08:00 sepia18 kernel: [ 7624.943851] which lock already depends on the new lock.
2012-01-13T21:41:32.054701-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.054703-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.054706-08:00 sepia18 kernel: [ 7624.943851] the existing dependency chain (in reverse order) is:
2012-01-13T21:41:32.054709-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.054721-08:00 sepia18 kernel: [ 7624.943851] -> #3 (&mm->mmap_sem){++++++}:
2012-01-13T21:41:32.054725-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-13T21:41:32.054728-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff8113e112>] might_fault+0x72/0xa0
2012-01-13T21:41:32.054731-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81188e0e>] filldir+0x7e/0xe0
2012-01-13T21:41:32.054735-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff8119b8cb>] dcache_readdir+0x5b/0x260
2012-01-13T21:41:32.054742-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81189010>] vfs_readdir+0xc0/0xe0
2012-01-13T21:41:32.054745-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81189199>] sys_getdents+0x89/0xf0
2012-01-13T21:41:32.054749-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff8160e1c2>] system_call_fastpath+0x16/0x1b
2012-01-13T21:41:32.054751-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.054757-08:00 sepia18 kernel: [ 7624.943851] -> #2 (&sb->s_type->i_mutex_key#2){+.+.+.}:
2012-01-13T21:41:32.054761-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-13T21:41:32.054764-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-01-13T21:41:32.054768-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-01-13T21:41:32.054771-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff8129150c>] debugfs_create_file+0x9c/0x2a0
2012-01-13T21:41:32.054777-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff812917ab>] debugfs_create_dir+0x1b/0x20
2012-01-13T21:41:32.054781-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffffa02c3a79>] ceph_debugfs_client_init+0x59/0x160 [libceph]
2012-01-13T21:41:32.054785-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffffa02b286b>] ceph_check_fsid+0x8b/0xa0 [libceph]
2012-01-13T21:41:32.054788-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffffa02b9ec2>] dispatch+0x392/0x670 [libceph]
2012-01-13T21:41:32.054794-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffffa02b7a9e>] con_work+0x15de/0x1b00 [libceph]
2012-01-13T21:41:32.054798-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-01-13T21:41:32.054801-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-01-13T21:41:32.054814-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-13T21:41:32.054820-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-13T21:41:32.054824-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.054827-08:00 sepia18 kernel: [ 7624.943851] -> #1 (&monc->mutex){+.+.+.}:
2012-01-13T21:41:32.054830-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-13T21:41:32.054833-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81604090>] __mutex_lock_common+0x50/0x3d0
2012-01-13T21:41:32.054840-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81604537>] mutex_lock_nested+0x37/0x50
2012-01-13T21:41:32.054844-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffffa02b896d>] ceph_monc_got_osdmap+0x2d/0x60 [libceph]
2012-01-13T21:41:32.054847-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffffa02bda32>] ceph_osdc_handle_map+0x282/0x560 [libceph]
2012-01-13T21:41:32.054851-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffffa02b9c47>] dispatch+0x117/0x670 [libceph]
2012-01-13T21:41:32.054911-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffffa02b7a9e>] con_work+0x15de/0x1b00 [libceph]
2012-01-13T21:41:32.054916-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff8107fe76>] process_one_work+0x1a6/0x520
2012-01-13T21:41:32.054919-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff810821a3>] worker_thread+0x173/0x400
2012-01-13T21:41:32.054922-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff810871e6>] kthread+0xb6/0xc0
2012-01-13T21:41:32.054929-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff816103c4>] kernel_thread_helper+0x4/0x10
2012-01-13T21:41:32.054932-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.054935-08:00 sepia18 kernel: [ 7624.943851] -> #0 (&osdc->map_sem){++++..}:
2012-01-13T21:41:32.054939-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff810a09dd>] __lock_acquire+0x14bd/0x15d0
2012-01-13T21:41:32.054942-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-13T21:41:32.054949-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff816048e9>] down_read+0x39/0x50
2012-01-13T21:41:32.054952-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffffa02bce56>] ceph_osdc_start_request+0xa6/0x190 [libceph]
2012-01-13T21:41:32.054956-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffffa02ee466>] ceph_readpages+0x396/0x520 [ceph]
2012-01-13T21:41:32.054959-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81127960>] __do_page_cache_readahead+0x210/0x2d0
2012-01-13T21:41:32.054969-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81127a41>] ra_submit+0x21/0x30
2012-01-13T21:41:32.054973-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff8111d424>] filemap_fault+0x364/0x440
2012-01-13T21:41:32.054977-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81142462>] __do_fault+0x72/0x500
2012-01-13T21:41:32.055515-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81142985>] handle_pte_fault+0x95/0x9d0
2012-01-13T21:41:32.055579-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81143401>] handle_mm_fault+0x141/0x1e0
2012-01-13T21:41:32.055589-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81609a96>] do_page_fault+0x156/0x4b0
2012-01-13T21:41:32.055592-08:00 sepia18 kernel: [ 7624.943851]        [<ffffffff81606535>] page_fault+0x25/0x30
2012-01-13T21:41:32.055595-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.055598-08:00 sepia18 kernel: [ 7624.943851] other info that might help us debug this:
2012-01-13T21:41:32.055612-08:00 sepia18 kernel: [ 7624.943851] 

2012-01-13T21:41:32.055616-08:00 sepia18 kernel: [ 7624.943851] Chain exists of:
2012-01-13T21:41:32.055619-08:00 sepia18 kernel: [ 7624.943851]   &osdc->map_sem --> &sb->s_type->i_mutex_key --> &mm->mmap_sem
2012-01-13T21:41:32.055622-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.055625-08:00 sepia18 kernel: [ 7624.943851]  Possible unsafe locking scenario:
2012-01-13T21:41:32.055631-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.055634-08:00 sepia18 kernel: [ 7624.943851]        CPU0                    CPU1
2012-01-13T21:41:32.055637-08:00 sepia18 kernel: [ 7624.943851]        ----                    ----
2012-01-13T21:41:32.055640-08:00 sepia18 kernel: [ 7624.943851]   lock(&mm->mmap_sem);
2012-01-13T21:41:32.055643-08:00 sepia18 kernel: [ 7624.943851]                                lock(&sb->s_type->i_mutex_key);
2012-01-13T21:41:32.055649-08:00 sepia18 kernel: [ 7624.943851]                                lock(&mm->mmap_sem);
2012-01-13T21:41:32.055652-08:00 sepia18 kernel: [ 7624.943851]   lock(&osdc->map_sem);
2012-01-13T21:41:32.055655-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.055658-08:00 sepia18 kernel: [ 7624.943851]  *** DEADLOCK ***
2012-01-13T21:41:32.055663-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.055667-08:00 sepia18 kernel: [ 7624.943851] 1 lock held by fixdep/10660:
2012-01-13T21:41:32.055670-08:00 sepia18 kernel: [ 7624.943851]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff81609a3e>] do_page_fault+0xfe/0x4b0
2012-01-13T21:41:32.055673-08:00 sepia18 kernel: [ 7624.943851] 
2012-01-13T21:41:32.055676-08:00 sepia18 kernel: [ 7624.943851] stack backtrace:
2012-01-13T21:41:32.055682-08:00 sepia18 kernel: [ 7624.943851] Pid: 10660, comm: fixdep Not tainted 3.2.0-ceph-00014-g28fe722 #1
2012-01-13T21:41:32.055686-08:00 sepia18 kernel: [ 7624.943851] Call Trace:
2012-01-13T21:41:32.055689-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff8109dc4a>] print_circular_bug+0x21a/0x300
2012-01-13T21:41:32.055693-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff810a09dd>] __lock_acquire+0x14bd/0x15d0
2012-01-13T21:41:32.055696-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff8109f1a5>] ? trace_hardirqs_on_caller+0x105/0x190
2012-01-13T21:41:32.055703-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffffa02bce56>] ? ceph_osdc_start_request+0xa6/0x190 [libceph]
2012-01-13T21:41:32.055706-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff810a10e2>] lock_acquire+0xa2/0x120
2012-01-13T21:41:32.055710-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffffa02bce56>] ? ceph_osdc_start_request+0xa6/0x190 [libceph]
2012-01-13T21:41:32.055713-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff816048e9>] down_read+0x39/0x50
2012-01-13T21:41:32.055720-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffffa02bce56>] ? ceph_osdc_start_request+0xa6/0x190 [libceph]
2012-01-13T21:41:32.055724-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffffa02bce56>] ceph_osdc_start_request+0xa6/0x190 [libceph]
2012-01-13T21:41:32.055728-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffffa02ee466>] ceph_readpages+0x396/0x520 [ceph]
2012-01-13T21:41:32.055731-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff81127960>] __do_page_cache_readahead+0x210/0x2d0
2012-01-13T21:41:32.055738-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff8112780e>] ? __do_page_cache_readahead+0xbe/0x2d0
2012-01-13T21:41:32.055741-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff8111bf10>] ? sys_readahead+0xc0/0xc0
2012-01-13T21:41:32.055745-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff81127a41>] ra_submit+0x21/0x30
2012-01-13T21:41:32.055748-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff8111d424>] filemap_fault+0x364/0x440
2012-01-13T21:41:32.055751-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff81142462>] __do_fault+0x72/0x500
2012-01-13T21:41:32.055758-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff81146107>] ? vma_link+0x57/0xe0
2012-01-13T21:41:32.055762-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff81142985>] handle_pte_fault+0x95/0x9d0
2012-01-13T21:41:32.055765-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff8116ddca>] ? mem_cgroup_count_vm_event+0x1a/0xb0
2012-01-13T21:41:32.055768-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff81143401>] handle_mm_fault+0x141/0x1e0
2012-01-13T21:41:32.055774-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff81609a96>] do_page_fault+0x156/0x4b0
2012-01-13T21:41:32.055778-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff81148804>] ? do_mmap_pgoff+0x354/0x3a0
2012-01-13T21:41:32.055782-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff81315d5d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
2012-01-13T21:41:32.055785-08:00 sepia18 kernel: [ 7624.943851]  [<ffffffff81606535>] page_fault+0x25/0x30

workload is kernel_untar_build.sh workunit.
Actions #1

Updated by Sage Weil over 12 years ago

  • Status changed from New to 7

this was causing teuthology runs to fail.

patch in master, testing!

Actions #2

Updated by Sage Weil about 12 years ago

  • Status changed from 7 to Resolved

commit:ab434b60ab07f8c44246b6fb0cddee436687a09a

Actions

Also available in: Atom PDF