Bug #4706
Status: closed
kclient: Oops when two clients concurrently write a file
% Done: 0%
Source: Development
Severity: 3 - minor
Component(FS): kceph
Description
[ 229.868015] Modules linked in: netconsole ceph libceph libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle bridge lockd sunrpc bnep bluetooth stp llc rfkill be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi virtio_net pcspkr microcode virtio_balloon uinput virtio_blk
[ 229.868015] CPU 1
[ 229.868015] Pid: 50, comm: kworker/1:2 Tainted: G D 3.8.0+ #1 Bochs Bochs
[ 229.868015] RIP: 0010:[<ffffffff81084540>] [<ffffffff81084540>] kthread_data+0x10/0x20
[ 229.868015] RSP: 0018:ffff88003711b528 EFLAGS: 00010092
[ 229.868015] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 000000000000000e
[ 229.868015] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff880037120000
[ 229.868015] RBP: ffff88003711b528 R08: ffff880037120070 R09: 0000000000000000
[ 229.868015] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88003fd14800
[ 229.868015] R13: 0000000000000001 R14: ffff88003711fff0 R15: ffff880037120000
[ 229.868015] FS: 0000000000000000(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
[ 229.868015] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 229.868015] CR2: ffffffffffffffa8 CR3: 000000003ce33000 CR4: 00000000000006e0
[ 229.868015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 229.868015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 229.868015] Process kworker/1:2 (pid: 50, threadinfo ffff88003711a000, task ffff880037120000)
[ 229.868015] Stack:
[ 229.868015] ffff88003711b548 ffffffff8107f375 ffff88003711b548 ffff8800371203d0
[ 229.868015] ffff88003711b5b8 ffffffff81677802 ffff880037120000 ffff88003711bfd8
[ 229.868015] ffff88003711bfd8 ffff88003711bfd8 ffff88003711b340 ffff880037120000
[ 229.868015] Call Trace:
[ 229.868015] [<ffffffff8107f375>] wq_worker_sleeping+0x15/0xc0
[ 229.868015] [<ffffffff81677802>] __schedule+0x5e2/0x800
[ 229.868015] [<ffffffff81677d49>] schedule+0x29/0x70
[ 229.868015] [<ffffffff810649f2>] do_exit+0x6a2/0x9f0
[ 229.868015] [<ffffffff8167a8ed>] oops_end+0x9d/0xe0
[ 229.868015] [<ffffffff8166d4e6>] no_context+0x253/0x27e
[ 229.868015] [<ffffffff81312962>] ? put_dec+0x72/0x90
[ 229.868015] [<ffffffff8166d6dc>] __bad_area_nosemaphore+0x1cb/0x1ea
[ 229.868015] [<ffffffff8166d70e>] bad_area_nosemaphore+0x13/0x15
[ 229.868015] [<ffffffff8167d70e>] __do_page_fault+0x36e/0x500
[ 229.868015] [<ffffffff81314c94>] ? vsnprintf+0x354/0x640
[ 229.868015] [<ffffffff81314fc0>] ? sprintf+0x40/0x50
[ 229.868015] [<ffffffff8167d8ae>] do_page_fault+0xe/0x10
[ 229.868015] [<ffffffff8167d025>] do_async_page_fault+0x35/0x90
[ 229.868015] [<ffffffff81679c78>] async_page_fault+0x28/0x30
[ 229.868015] [<ffffffff810c1bd1>] ? __lock_acquire+0x61/0x1dc0
[ 229.868015] [<ffffffff8166dc0b>] ? printk+0x61/0x63
[ 229.868015] [<ffffffff810c3ef1>] lock_acquire+0xa1/0x120
[ 229.868015] [<ffffffffa03260ff>] ? sync_write_commit+0x4f/0xb0 [ceph]
[ 229.868015] [<ffffffff81678c81>] _raw_spin_lock+0x31/0x40
[ 229.868015] [<ffffffffa03260ff>] ? sync_write_commit+0x4f/0xb0 [ceph]
[ 229.868015] [<ffffffffa03260ff>] sync_write_commit+0x4f/0xb0 [ceph]
[ 229.868015] [<ffffffffa02e0a81>] complete_request+0x21/0x40 [libceph]
[ 229.868015] [<ffffffffa02e5364>] dispatch+0x6b4/0x920 [libceph]
[ 229.868015] [<ffffffff81676b1b>] ? __mutex_unlock_slowpath+0xdb/0x170
[ 229.868015] [<ffffffffa02dbbb8>] con_work+0x1428/0x2e00 [libceph]
[ 229.868015] [<ffffffff810c1f8a>] ? __lock_acquire+0x41a/0x1dc0
[ 229.868015] [<ffffffff8107c25b>] ? process_one_work+0x13b/0x550
[ 229.868015] [<ffffffff8107c2c1>] process_one_work+0x1a1/0x550
[ 229.868015] [<ffffffff8107c25b>] ? process_one_work+0x13b/0x550
[ 229.868015] [<ffffffffa02da790>] ? ceph_con_close+0xd0/0xd0 [libceph]
[ 229.868015] [<ffffffff8107ea9e>] worker_thread+0x15e/0x440
[ 229.868015] [<ffffffff8107e940>] ? busy_worker_rebind_fn+0x100/0x100
[ 229.868015] [<ffffffff810843fa>] kthread+0xea/0xf0
[ 229.868015] [<ffffffff81084310>] ? flush_kthread_work+0x1b0/0x1b0
[ 229.868015] [<ffffffff81681f6c>] ret_from_fork+0x7c/0xb0
[ 229.868015] [<ffffffff81084310>] ? flush_kthread_work+0x1b0/0x1b0
I got the above Oops while doing concurrent writes with the current "testing" branch.
When two clients write data to a file at the same time, they do sync writes
even if the file is not opened in sync mode. I think the issue is new. It can
be reproduced by running the following command on two kclients:
dd if=/dev/zero bs=4k conv=notrunc of=test1
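The reproduction can be sketched as a small script, assuming both writers start at roughly the same time against the same path. The function name concurrent_write and the count argument are hypothetical additions that keep the run bounded; the command above has no count and keeps writing until it is interrupted or the filesystem fills up. Run on a single host, this only illustrates the shape of the test. The Oops was reported with the writers on two separate kclients.

```shell
#!/bin/sh
# Hypothetical helper: start two writers against the same file at once.
# In the report the two writers run on two different kclients; here both
# run locally, for illustration only.
concurrent_write() {
    target=$1
    count=${2:-256}   # number of 4k blocks per writer (assumed bound)

    # conv=notrunc keeps existing data, so neither writer truncates the
    # file underneath the other.
    dd if=/dev/zero bs=4k count="$count" conv=notrunc of="$target" 2>/dev/null &
    pid1=$!
    dd if=/dev/zero bs=4k count="$count" conv=notrunc of="$target" 2>/dev/null &
    pid2=$!

    # Collect both exit statuses; succeed only if both writers succeeded.
    wait "$pid1"; rc1=$?
    wait "$pid2"; rc2=$?
    [ "$rc1" -eq 0 ] && [ "$rc2" -eq 0 ]
}
```

For example, concurrent_write /mnt/ceph/test1 1024 would write 4 MiB from each writer, where /mnt/ceph is an assumed mount point and not one named in this report.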