Bug #13643

task kworker/u128:1:11460 blocked for more than 120 seconds, with rbd device and dd on it

Added by Oliver Dzombc over 8 years ago. Updated over 8 years ago.

Status:
New
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

Red Hat 7, 64-bit
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)

Scenario:

Migration of existing LVM logical volumes on a remote server to a Ceph RBD device.

Used commands:

ceph-cluster-node# rbd map virtualdisk

-> /dev/rbd0 created

remote-server# dd if=/dev/lvmgroup/logical_volume bs=256 | ssh ceph-cluster-node dd of=/dev/rbd0 bs=256

The transfer starts. During the transfer the kernel can hang with blocked tasks (roughly a 60-70% chance).

The process list shows some stale processes:

11460 ? D 0:04 [kworker/u128:1]
18904 ? Ds 0:25 dd of=/dev/rbd0 bs=256M
19768 ? D 0:00 /sbin/blkid -o udev -p /dev/rbd0

They do not go away: not if you stop the transfer, and not if you try to kill the processes.
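
The D in the state column means uninterruptible sleep, so signals have no effect on these tasks. One way to see where they are blocked is to read their kernel stacks, for example (using the pids from the list above):

# Tasks in D state ignore kill; inspect their kernel stacks instead (needs root).
for pid in 11460 18904 19768; do
    echo "=== $pid ==="
    cat /proc/$pid/stack
done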

The log shows:

[10874.157225] rbd0: unknown partition table
[10874.157282] rbd: rbd0: added with size 0x2580000000
[11394.608550] INFO: task kworker/u128:1:11460 blocked for more than 120 seconds.
[11394.608590] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11394.608625] kworker/u128:1 D ffff88040b413680 0 11460 2 0x00000080
[11394.608633] Workqueue: writeback bdi_writeback_workfn (flush-253:0)
[11394.608635] ffff8801713af788 0000000000000046 ffff8800e9de5b00 ffff8801713affd8
[11394.608638] ffff8801713affd8 ffff8801713affd8 ffff8800e9de5b00 ffff88040b413f48
[11394.608640] ffff880036e5a760 ffff8803dec0fc30 ffff880036e5a790 ffff880036e5a760
[11394.608642] Call Trace:
[11394.608648] [<ffffffff81609a2d>] io_schedule+0x9d/0x130
[11394.608652] [<ffffffff812ac665>] get_request+0x1b5/0x780
[11394.608656] [<ffffffff81098230>] ? wake_up_bit+0x30/0x30
[11394.608660] [<ffffffff812ae946>] blk_queue_bio+0xc6/0x390
[11394.608662] [<ffffffff812aa7a2>] generic_make_request+0xe2/0x130
[11394.608665] [<ffffffff812aa861>] submit_bio+0x71/0x150
[11394.608668] [<ffffffff811febae>] ? bio_alloc_bioset+0x1be/0x2e0
[11394.608671] [<ffffffff810a2384>] ? __wake_up+0x44/0x50
[11394.608674] [<ffffffff811f9bd3>] _submit_bh+0x143/0x210
[11394.608676] [<ffffffff811fc822>] __block_write_full_page+0x162/0x380
[11394.608679] [<ffffffff811ffe70>] ? I_BDEV+0x10/0x10
[11394.608682] [<ffffffff811ffe70>] ? I_BDEV+0x10/0x10
[11394.608684] [<ffffffff811fcc0b>] block_write_full_page_endio+0xeb/0x100
[11394.608686] [<ffffffff811fcc35>] block_write_full_page+0x15/0x20
[11394.608689] [<ffffffff812005c8>] blkdev_writepage+0x18/0x20
[11394.608693] [<ffffffff81160f83>] __writepage+0x13/0x50
[11394.608696] [<ffffffff81161aa1>] write_cache_pages+0x251/0x4d0
[11394.608699] [<ffffffff81160f70>] ? global_dirtyable_memory+0x70/0x70
[11394.608703] [<ffffffff81161d6d>] generic_writepages+0x4d/0x80
[11394.608706] [<ffffffff81162e1e>] do_writepages+0x1e/0x40
[11394.608709] [<ffffffff811f0220>] __writeback_single_inode+0x40/0x220
[11394.608711] [<ffffffff811f0f1e>] writeback_sb_inodes+0x25e/0x420
[11394.608714] [<ffffffff811f117f>] __writeback_inodes_wb+0x9f/0xd0
[11394.608716] [<ffffffff811f19c3>] wb_writeback+0x263/0x2f0
[11394.608719] [<ffffffff811f2ffb>] bdi_writeback_workfn+0x2cb/0x460
[11394.608723] [<ffffffff8108f0bb>] process_one_work+0x17b/0x470
[11394.608725] [<ffffffff8108fe8b>] worker_thread+0x11b/0x400
[11394.608728] [<ffffffff8108fd70>] ? rescuer_thread+0x400/0x400
[11394.608730] [<ffffffff8109726f>] kthread+0xcf/0xe0
[11394.608732] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
[11394.608736] [<ffffffff816141d8>] ret_from_fork+0x58/0x90
[11394.608738] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140
(The same hung-task warning, with an identical stack trace, repeats at 11514.544, 11634.483 and 11754.422 seconds.)

#1

Updated by Loïc Dachary over 8 years ago

  • Project changed from Ceph to rbd
#2

Updated by Josh Durgin over 8 years ago

  • Project changed from rbd to Linux kernel client

What does 'ceph -s' show for your cluster?

#3

Updated by Oliver Dzombc over 8 years ago

Josh Durgin wrote:

What does 'ceph -s' show for your cluster?

All is fine:

cluster <id>
health HEALTH_WARN
nodeep-scrub flag(s) set
monmap e1: 4 mons at {ceph1=10.0.0.1:6789/0,ceph2=10.0.0.2:6789/0,ceph3=10.0.0.3:6789/0,ceph4=10.0.0.4:6789/0}
election epoch 862, quorum 0,1,2,3 ceph1,ceph2,ceph3,ceph4
osdmap e1791: 12 osds: 12 up, 12 in
flags nodeep-scrub
pgmap v4724631: 250 pgs, 1 pools, 6249 GB data, 1573 kobjects
12507 GB used, 53599 GB / 66106 GB avail
250 active+clean
client io 3204 kB/s rd, 10055 kB/s wr, 631 op/s

We set the nodeep-scrub flag because we had some issues with high IO usage during deep scrubbing.
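
For reference, that flag is toggled cluster-wide with:

ceph osd set nodeep-scrub      # suspend deep scrubbing
ceph osd unset nodeep-scrub    # allow it again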

The other logs show nothing interesting.

#4

Updated by Josh Durgin over 8 years ago

Is this running on the same host as osds? If so, there's a known deadlock potential under memory pressure (due to a kernel client depending on a userspace server on the same box).
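
One quick way to check whether the client host also carries OSDs, for example:

hostname                 # host where /dev/rbd0 is mapped
ceph osd tree            # shows which hosts the OSDs live on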

When the hang occurs, can you see what requests are outstanding:

mount -t debugfs none /sys/kernel/debug
cat /sys/kernel/debug/ceph/*/osdc

This will show which osds the requests are going to. If any seem to be stuck, you can investigate on the osd side with commands like:

ceph daemon osd.1 dump_ops_in_flight
ceph daemon osd.1 dump_historic_ops
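
A rough sketch of how these checks could be scripted when the hang reproduces (the debugfs path is the default one; the OSD ids below are only placeholders for whatever OSDs run on the host):

mount -t debugfs none /sys/kernel/debug 2>/dev/null   # ignore the error if debugfs is already mounted
echo "--- outstanding requests (kernel client) ---"
cat /sys/kernel/debug/ceph/*/osdc

for id in 0 1 2; do                                    # placeholder OSD ids
    echo "--- osd.$id ---"
    ceph daemon osd.$id dump_ops_in_flight
    ceph daemon osd.$id dump_historic_ops
done
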
#5

Updated by Oliver Dzombc over 8 years ago

Hi,

OK, so it's a known bug.

ceph daemon osd.0 dump_ops_in_flight
{
    "ops": [],
    "num_ops": 0
}

or

admin_socket: exception getting command descriptions: [Errno 2] No such file or directory

The second command returns a very long list (the system is in production).
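
The 'No such file or directory' error usually means the admin socket for that OSD is not present on the host where the command was run. Assuming the default socket path, a local OSD can also be addressed through its socket directly:

ls /var/run/ceph/*.asok                                        # admin sockets of daemons on this host
ceph daemon /var/run/ceph/ceph-osd.0.asok dump_ops_in_flight   # osd.0 is just an example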

So I will not transfer directly to the Ceph cluster; instead I will have to mount it externally so that I can transfer it.

Too bad, that will make it much slower.

Thank you !

Greetings
Oliver
