Bug #9071

closed

mkfs.ext4 stuck in D state on RBD with kernel client

Added by Ivan Mironov over 9 years ago. Updated over 9 years ago.

Status:
Duplicate
Priority:
High
Assignee:
-
Target version:
-
% Done:
0%

Source:
other
Tags:
Backport:
Regression:
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I tried to create an ext4 filesystem on a newly created and mapped RBD image, but mkfs.ext4 got stuck:

    # mkfs.ext4 /dev/rbd/docker.rbd/docker-1.ext4
    mke2fs 1.41.12 (17-May-2010)
    Filesystem label=
    OS type: Linux
    Block size=4096 (log=2)
    Fragment size=4096 (log=2)
    Stride=1024 blocks, Stripe width=1024 blocks
    16777216 inodes, 67108864 blocks
    3355443 blocks (5.00%) reserved for the super user
    First data block=0
    Maximum filesystem blocks=4294967296
    2048 block groups
    32768 blocks per group, 32768 fragments per group
    8192 inodes per group
    Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
    4096000, 7962624, 11239424, 20480000, 23887872

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information:
^C^C^C^C
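
As a quick sanity check on the geometry printed above, the following sketch recomputes the figures from the mke2fs output and compares the total size with the `0x4000000000` that the kernel reports for rbd0 in the dmesg excerpt. The constants are copied from the report; the function name is only illustrative.

```python
# Values copied from the mke2fs output in this report.
BLOCK_SIZE = 4096
BLOCKS = 67108864
BLOCK_GROUPS = 2048
BLOCKS_PER_GROUP = 32768
INODES_PER_GROUP = 8192


def geometry():
    """Recompute the derived figures that mke2fs printed."""
    size_bytes = BLOCKS * BLOCK_SIZE
    return {
        "size_bytes": size_bytes,
        "size_gib": size_bytes // 2**30,
        "blocks_from_groups": BLOCK_GROUPS * BLOCKS_PER_GROUP,
        "inodes": BLOCK_GROUPS * INODES_PER_GROUP,
        "reserved_blocks": BLOCKS * 5 // 100,  # 5.00% reserved, rounded down
    }


if __name__ == "__main__":
    for key, value in geometry().items():
        print(key, value)
```

The numbers are self-consistent, so the image geometry itself does not look unusual; the hang occurs later, while mkfs.ext4 is flushing its writes.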

    # ps aux | grep D
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    root 2967 0.0 0.0 0 0 ? D Aug11 0:30 [kworker/2:2]
    root 13778 0.0 0.0 0 0 ? D 01:41 0:06 [kworker/6:2]
    root 14040 0.0 0.0 0 0 ? D 02:06 0:00 [kworker/6:1]
    root 14048 0.0 0.0 0 0 ? D 02:06 0:05 [kworker/10:2]
    root 14060 0.0 0.0 0 0 ? D 02:14 0:04 [kworker/1:0]
    root 14121 0.0 0.0 0 0 ? D 02:16 0:04 [kworker/5:0]
    root 14309 0.0 0.0 0 0 ? D 02:19 0:03 [kworker/7:1]
    root 14312 0.0 0.0 0 0 ? D 02:19 0:00 [kworker/7:3]
    root 14316 0.0 0.0 0 0 ? D 02:19 0:05 [kworker/9:3]
    root 14793 0.0 0.0 0 0 ? D 02:55 0:01 [kworker/4:2]
    root 14799 0.0 0.0 0 0 ? D 02:55 0:02 [kworker/8:2]
    root 29159 0.0 0.0 0 0 ? D 04:11 0:00 [kworker/u24:0]
    root 29688 0.0 0.0 0 0 ? D 04:26 0:00 [kworker/0:0]
    root 29718 0.2 0.0 124160 12760 pts/5 D+ 04:26 0:02 mkfs.ext4 /dev/rbd/docker.rbd/docker-1.ext4
    root 29757 0.0 0.0 0 0 ? D 04:26 0:00 [kworker/7:0]
    root 29758 0.0 0.0 0 0 ? D 04:26 0:00 [kworker/6:0]
    root 29759 0.0 0.0 0 0 ? D 04:26 0:00 [kworker/7:2]
    root 29760 0.0 0.0 0 0 ? D 04:26 0:00 [kworker/2:1]
    root 29761 0.0 0.0 0 0 ? D 04:26 0:00 [kworker/7:4]
    root 29762 0.0 0.0 0 0 ? D 04:26 0:00 [kworker/9:0]
    root 29763 0.0 0.0 0 0 ? D 04:26 0:00 [kworker/7:5]
    root 29765 0.0 0.0 0 0 ? D 04:26 0:00 [kworker/10:1]
    root 29767 0.0 0.0 0 0 ? D 04:26 0:00 [kworker/6:4]
    root 29829 0.0 0.0 0 0 ? D 04:26 0:00 [kworker/2:3]
    root 29964 0.0 0.0 0 0 ? D 04:26 0:00 [kworker/6:5]
    root 30254 0.0 0.0 0 0 ? D 04:36 0:00 [kworker/0:2]
    root 30265 0.0 0.0 0 0 ? D 04:41 0:00 [kworker/7:6]
    root 30266 0.0 0.0 0 0 ? D 04:41 0:00 [kworker/6:6]
    root 30268 0.0 0.0 0 0 ? D 04:41 0:00 [kworker/7:7]
    root 30269 0.0 0.0 0 0 ? D 04:41 0:00 [kworker/7:8]
    root 30270 0.0 0.0 0 0 ? D 04:41 0:00 [kworker/7:9]
    root 30280 0.0 0.0 103300 2036 pts/4 S+ 04:41 0:00 grep D
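
Note that `grep D` matches any line containing a capital D, which is why the header row and the `grep` process itself appear in the listing above. A minimal sketch of a stricter filter, assuming the standard 11-column `ps aux` layout shown here (the helper name `d_state_rows` is hypothetical), checks only the STAT column:

```python
from __future__ import annotations


def d_state_rows(ps_output: str) -> list[tuple[int, str]]:
    """Return (pid, command) for `ps aux` rows whose STAT starts with 'D'."""
    rows = []
    for line in ps_output.splitlines():
        # 11 columns; COMMAND is last and may itself contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip blank/short lines and the header row
        if fields[7].startswith("D"):  # STAT column: D = uninterruptible sleep
            rows.append((int(fields[1]), fields[10]))
    return rows


# Three rows and the header, copied from the listing in this report.
SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2967 0.0 0.0 0 0 ? D Aug11 0:30 [kworker/2:2]
root 29718 0.2 0.0 124160 12760 pts/5 D+ 04:26 0:02 mkfs.ext4 /dev/rbd/docker.rbd/docker-1.ext4
root 30280 0.0 0.0 103300 2036 pts/4 S+ 04:41 0:00 grep D
"""

if __name__ == "__main__":
    for pid, command in d_state_rows(SAMPLE):
        print(pid, command)
```

On the sample rows this keeps the kworker and mkfs.ext4 entries and drops the `grep D` process, which is in state S+.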

Here is the relevant part of dmesg:
Aug 12 04:25:50 salt kernel: libceph: client4835 fsid d4fba9f9-3fa8-4f5f-a81a-acc312cb0152
Aug 12 04:25:51 salt kernel: libceph: mon2 10.208.31.5:6789 session established
Aug 12 04:25:51 salt kernel: rbd0: unknown partition table
Aug 12 04:25:51 salt kernel: rbd: rbd0: added with size 0x4000000000
Aug 12 04:30:05 salt kernel: INFO: task kworker/2:2:2967 blocked for more than 120 seconds.
Aug 12 04:30:05 salt kernel: Not tainted 3.15.8-1.el6.elrepo.x86_64 #1
Aug 12 04:30:05 salt kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 04:30:05 salt kernel: kworker/2:2 D 0000000000000002 0 2967 2 0x00000080
Aug 12 04:30:05 salt kernel: Workqueue: ceph-msgr con_work [libceph]
Aug 12 04:30:05 salt kernel: ffff880008c97b78 0000000000000046 ffff880008c97b28 ffff880008c94010
Aug 12 04:30:05 salt kernel: 00000000000144c0 00000000000144c0 ffff880852d8ef30 ffff880857afa150
Aug 12 04:30:05 salt kernel: 0000003500004040 ffff8808100437c0 ffff8808100437c4 ffff880852d8ef30
Aug 12 04:30:05 salt kernel: Call Trace:
Aug 12 04:30:05 salt kernel: [<ffffffff81633da9>] schedule+0x29/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff81633f0e>] schedule_preempt_disabled+0xe/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff8163588b>] __mutex_lock_slowpath+0xdb/0x1d0
Aug 12 04:30:05 salt kernel: [<ffffffff816359a3>] mutex_lock+0x23/0x40
Aug 12 04:30:05 salt kernel: [<ffffffffa04b78df>] get_reply+0x3f/0x200 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b7b28>] alloc_msg+0x88/0x90 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af8e1>] ceph_con_in_msg_alloc+0x71/0x240 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b0b98>] read_partial_message+0x1e8/0x3d0 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af268>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b21c6>] try_read+0x2b6/0x430 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b2678>] con_work+0x78/0x220 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffff810881ec>] process_one_work+0x17c/0x420
Aug 12 04:30:05 salt kernel: [<ffffffff81089653>] worker_thread+0x123/0x400
Aug 12 04:30:05 salt kernel: [<ffffffff81089530>] ? manage_workers+0x170/0x170
Aug 12 04:30:05 salt kernel: [<ffffffff8108f12e>] kthread+0xce/0xf0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff8164027c>] ret_from_fork+0x7c/0xb0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: INFO: task kworker/6:2:13778 blocked for more than 120 seconds.
Aug 12 04:30:05 salt kernel: Not tainted 3.15.8-1.el6.elrepo.x86_64 #1
Aug 12 04:30:05 salt kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 04:30:05 salt kernel: kworker/6:2 D 0000000000000006 0 13778 2 0x00000080
Aug 12 04:30:05 salt kernel: Workqueue: ceph-msgr con_work [libceph]
Aug 12 04:30:05 salt kernel: ffff8807e82e7b78 0000000000000046 ffff8807e82e7b28 ffff8807e82e4010
Aug 12 04:30:05 salt kernel: 00000000000144c0 00000000000144c0 ffff880738dfcef0 ffff8808574121d0
Aug 12 04:30:05 salt kernel: 0000003500004040 ffff8808100437c0 ffff8808100437c4 ffff880738dfcef0
Aug 12 04:30:05 salt kernel: Call Trace:
Aug 12 04:30:05 salt kernel: [<ffffffff81633da9>] schedule+0x29/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff81633f0e>] schedule_preempt_disabled+0xe/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff8163588b>] __mutex_lock_slowpath+0xdb/0x1d0
Aug 12 04:30:05 salt kernel: [<ffffffff816359a3>] mutex_lock+0x23/0x40
Aug 12 04:30:05 salt kernel: [<ffffffffa04b78df>] get_reply+0x3f/0x200 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b7b28>] alloc_msg+0x88/0x90 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af8e1>] ceph_con_in_msg_alloc+0x71/0x240 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b0b98>] read_partial_message+0x1e8/0x3d0 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af268>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b21c6>] try_read+0x2b6/0x430 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b2678>] con_work+0x78/0x220 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffff810881ec>] process_one_work+0x17c/0x420
Aug 12 04:30:05 salt kernel: [<ffffffff81089653>] worker_thread+0x123/0x400
Aug 12 04:30:05 salt kernel: [<ffffffff81089530>] ? manage_workers+0x170/0x170
Aug 12 04:30:05 salt kernel: [<ffffffff8108f12e>] kthread+0xce/0xf0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff8164027c>] ret_from_fork+0x7c/0xb0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: INFO: task kworker/6:1:14040 blocked for more than 120 seconds.
Aug 12 04:30:05 salt kernel: Not tainted 3.15.8-1.el6.elrepo.x86_64 #1
Aug 12 04:30:05 salt kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 04:30:05 salt kernel: kworker/6:1 D 0000000000000006 0 14040 2 0x00000080
Aug 12 04:30:05 salt kernel: Workqueue: ceph-msgr con_work [libceph]
Aug 12 04:30:05 salt kernel: ffff88007a393b78 0000000000000046 ffff88007a393b28 ffff88007a390010
Aug 12 04:30:05 salt kernel: 00000000000144c0 00000000000144c0 ffff880853468f30 ffff88029d466310
Aug 12 04:30:05 salt kernel: 0000003500004040 ffff8808100437c0 ffff8808100437c4 ffff880853468f30
Aug 12 04:30:05 salt kernel: Call Trace:
Aug 12 04:30:05 salt kernel: [<ffffffff81633da9>] schedule+0x29/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff81633f0e>] schedule_preempt_disabled+0xe/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff8163588b>] __mutex_lock_slowpath+0xdb/0x1d0
Aug 12 04:30:05 salt kernel: [<ffffffff816359a3>] mutex_lock+0x23/0x40
Aug 12 04:30:05 salt kernel: [<ffffffffa04b78df>] get_reply+0x3f/0x200 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b7b28>] alloc_msg+0x88/0x90 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af8e1>] ceph_con_in_msg_alloc+0x71/0x240 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b0b98>] read_partial_message+0x1e8/0x3d0 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af268>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b21c6>] try_read+0x2b6/0x430 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffff8108542b>] ? start_worker+0x2b/0x40
Aug 12 04:30:05 salt kernel: [<ffffffffa04b2678>] con_work+0x78/0x220 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffff810881ec>] process_one_work+0x17c/0x420
Aug 12 04:30:05 salt kernel: [<ffffffff81089653>] worker_thread+0x123/0x400
Aug 12 04:30:05 salt kernel: [<ffffffff81089530>] ? manage_workers+0x170/0x170
Aug 12 04:30:05 salt kernel: [<ffffffff8108f12e>] kthread+0xce/0xf0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff8164027c>] ret_from_fork+0x7c/0xb0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: INFO: task kworker/10:2:14048 blocked for more than 120 seconds.
Aug 12 04:30:05 salt kernel: Not tainted 3.15.8-1.el6.elrepo.x86_64 #1
Aug 12 04:30:05 salt kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 04:30:05 salt kernel: kworker/10:2 D 000000000000000a 0 14048 2 0x00000080
Aug 12 04:30:05 salt kernel: Workqueue: ceph-msgr con_work [libceph]
Aug 12 04:30:05 salt kernel: ffff880108993b78 0000000000000046 ffff880108993b28 ffff880108990010
Aug 12 04:30:05 salt kernel: 00000000000144c0 00000000000144c0 ffff88085411e0d0 ffff880857426250
Aug 12 04:30:05 salt kernel: 0000003500004040 ffff8808100437c0 ffff8808100437c4 ffff88085411e0d0
Aug 12 04:30:05 salt kernel: Call Trace:
Aug 12 04:30:05 salt kernel: [<ffffffff81633da9>] schedule+0x29/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff81633f0e>] schedule_preempt_disabled+0xe/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff8163588b>] __mutex_lock_slowpath+0xdb/0x1d0
Aug 12 04:30:05 salt kernel: [<ffffffff816359a3>] mutex_lock+0x23/0x40
Aug 12 04:30:05 salt kernel: [<ffffffffa04b78df>] get_reply+0x3f/0x200 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b7b28>] alloc_msg+0x88/0x90 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af8e1>] ceph_con_in_msg_alloc+0x71/0x240 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b0b98>] read_partial_message+0x1e8/0x3d0 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af268>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b21c6>] try_read+0x2b6/0x430 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b2678>] con_work+0x78/0x220 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffff810881ec>] process_one_work+0x17c/0x420
Aug 12 04:30:05 salt kernel: [<ffffffff81089653>] worker_thread+0x123/0x400
Aug 12 04:30:05 salt kernel: [<ffffffff81089530>] ? manage_workers+0x170/0x170
Aug 12 04:30:05 salt kernel: [<ffffffff8108f12e>] kthread+0xce/0xf0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff8164027c>] ret_from_fork+0x7c/0xb0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: INFO: task kworker/7:1:14309 blocked for more than 120 seconds.
Aug 12 04:30:05 salt kernel: Not tainted 3.15.8-1.el6.elrepo.x86_64 #1
Aug 12 04:30:05 salt kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 04:30:05 salt kernel: kworker/7:1 D 0000000000000007 0 14309 2 0x00000080
Aug 12 04:30:05 salt kernel: Workqueue: ceph-msgr con_work [libceph]
Aug 12 04:30:05 salt kernel: ffff880819413b78 0000000000000046 ffff880819413b28 ffff880819410010
Aug 12 04:30:05 salt kernel: 00000000000144c0 00000000000144c0 ffff88037e1a50b0 ffff88085741aff0
Aug 12 04:30:05 salt kernel: 0000003500004040 ffff8808100437c0 ffff8808100437c4 ffff88037e1a50b0
Aug 12 04:30:05 salt kernel: Call Trace:
Aug 12 04:30:05 salt kernel: [<ffffffff81633da9>] schedule+0x29/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff81633f0e>] schedule_preempt_disabled+0xe/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff8163588b>] __mutex_lock_slowpath+0xdb/0x1d0
Aug 12 04:30:05 salt kernel: [<ffffffff816359a3>] mutex_lock+0x23/0x40
Aug 12 04:30:05 salt kernel: [<ffffffffa04b78df>] get_reply+0x3f/0x200 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b7b28>] alloc_msg+0x88/0x90 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af8e1>] ceph_con_in_msg_alloc+0x71/0x240 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b0b98>] read_partial_message+0x1e8/0x3d0 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af268>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b21c6>] try_read+0x2b6/0x430 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b2678>] con_work+0x78/0x220 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffff810881ec>] process_one_work+0x17c/0x420
Aug 12 04:30:05 salt kernel: [<ffffffff81089653>] worker_thread+0x123/0x400
Aug 12 04:30:05 salt kernel: [<ffffffff81089530>] ? manage_workers+0x170/0x170
Aug 12 04:30:05 salt kernel: [<ffffffff8108f12e>] kthread+0xce/0xf0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff8164027c>] ret_from_fork+0x7c/0xb0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: INFO: task kworker/7:3:14312 blocked for more than 120 seconds.
Aug 12 04:30:05 salt kernel: Not tainted 3.15.8-1.el6.elrepo.x86_64 #1
Aug 12 04:30:05 salt kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 04:30:05 salt kernel: kworker/7:3 D 0000000000000007 0 14312 2 0x00000080
Aug 12 04:30:05 salt kernel: Workqueue: ceph-msgr con_work [libceph]
Aug 12 04:30:05 salt kernel: ffff88010836bb78 0000000000000046 ffff88010836bb28 ffff880108368010
Aug 12 04:30:05 salt kernel: 00000000000144c0 00000000000144c0 ffff880852d8e150 ffff88035950c2d0
Aug 12 04:30:05 salt kernel: 0000003500004040 ffff8808100437c0 ffff8808100437c4 ffff880852d8e150
Aug 12 04:30:05 salt kernel: Call Trace:
Aug 12 04:30:05 salt kernel: [<ffffffff81633da9>] schedule+0x29/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff81633f0e>] schedule_preempt_disabled+0xe/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff8163588b>] __mutex_lock_slowpath+0xdb/0x1d0
Aug 12 04:30:05 salt kernel: [<ffffffff816359a3>] mutex_lock+0x23/0x40
Aug 12 04:30:05 salt kernel: [<ffffffffa04b78df>] get_reply+0x3f/0x200 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b7b28>] alloc_msg+0x88/0x90 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af8e1>] ceph_con_in_msg_alloc+0x71/0x240 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b0b98>] read_partial_message+0x1e8/0x3d0 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af268>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b21c6>] try_read+0x2b6/0x430 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b2678>] con_work+0x78/0x220 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffff810881ec>] process_one_work+0x17c/0x420
Aug 12 04:30:05 salt kernel: [<ffffffff81089653>] worker_thread+0x123/0x400
Aug 12 04:30:05 salt kernel: [<ffffffff81089530>] ? manage_workers+0x170/0x170
Aug 12 04:30:05 salt kernel: [<ffffffff8108f12e>] kthread+0xce/0xf0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff8164027c>] ret_from_fork+0x7c/0xb0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: INFO: task kworker/9:3:14316 blocked for more than 120 seconds.
Aug 12 04:30:05 salt kernel: Not tainted 3.15.8-1.el6.elrepo.x86_64 #1
Aug 12 04:30:05 salt kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 04:30:05 salt kernel: kworker/9:3 D 0000000000000009 0 14316 2 0x00000080
Aug 12 04:30:05 salt kernel: Workqueue: ceph-msgr con_work [libceph]
Aug 12 04:30:05 salt kernel: ffff88011252bb78 0000000000000046 ffff88011252bb28 ffff880112528010
Aug 12 04:30:05 salt kernel: 00000000000144c0 00000000000144c0 ffff880452ff8f30 ffff880857427030
Aug 12 04:30:05 salt kernel: 0000003500004040 ffff8808100437c0 ffff8808100437c4 ffff880452ff8f30
Aug 12 04:30:05 salt kernel: Call Trace:
Aug 12 04:30:05 salt kernel: [<ffffffff81633da9>] schedule+0x29/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff81633f0e>] schedule_preempt_disabled+0xe/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff8163588b>] __mutex_lock_slowpath+0xdb/0x1d0
Aug 12 04:30:05 salt kernel: [<ffffffff816359a3>] mutex_lock+0x23/0x40
Aug 12 04:30:05 salt kernel: [<ffffffffa04b78df>] get_reply+0x3f/0x200 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b7b28>] alloc_msg+0x88/0x90 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af8e1>] ceph_con_in_msg_alloc+0x71/0x240 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b0b98>] read_partial_message+0x1e8/0x3d0 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af268>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b21c6>] try_read+0x2b6/0x430 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b2678>] con_work+0x78/0x220 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffff810881ec>] process_one_work+0x17c/0x420
Aug 12 04:30:05 salt kernel: [<ffffffff81089653>] worker_thread+0x123/0x400
Aug 12 04:30:05 salt kernel: [<ffffffff81089530>] ? manage_workers+0x170/0x170
Aug 12 04:30:05 salt kernel: [<ffffffff8108f12e>] kthread+0xce/0xf0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff8164027c>] ret_from_fork+0x7c/0xb0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: INFO: task kworker/u24:0:29159 blocked for more than 120 seconds.
Aug 12 04:30:05 salt kernel: Not tainted 3.15.8-1.el6.elrepo.x86_64 #1
Aug 12 04:30:05 salt kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 04:30:05 salt kernel: kworker/u24:0 D 0000000000000004 0 29159 2 0x00000080
Aug 12 04:30:05 salt kernel: Workqueue: writeback bdi_writeback_workfn (flush-252:0)
Aug 12 04:30:05 salt kernel: ffff88011c62f638 0000000000000046 ffff88011c62f648 ffff88011c62c010
Aug 12 04:30:05 salt kernel: 00000000000144c0 00000000000144c0 ffff8808534bf030 ffff880857402190
Aug 12 04:30:05 salt kernel: ffff88011c62f608 ffff88087fc944c0 ffff8808534bf030 0000000000000001
Aug 12 04:30:05 salt kernel: Call Trace:
Aug 12 04:30:05 salt kernel: [<ffffffff81633da9>] schedule+0x29/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff81633e7c>] io_schedule+0x8c/0xd0
Aug 12 04:30:05 salt kernel: [<ffffffff812a6589>] get_request+0x169/0x350
Aug 12 04:30:05 salt kernel: [<ffffffff812ace74>] ? ll_back_merge_fn+0xb4/0x190
Aug 12 04:30:05 salt kernel: [<ffffffff810b0950>] ? bit_waitqueue+0xe0/0xe0
Aug 12 04:30:05 salt kernel: [<ffffffff812a0dcb>] ? elv_merge+0xeb/0xf0
Aug 12 04:30:05 salt kernel: [<ffffffff812a67f6>] blk_queue_bio+0x86/0x350
Aug 12 04:30:05 salt kernel: [<ffffffff812a54a0>] generic_make_request+0xc0/0x100
Aug 12 04:30:05 salt kernel: [<ffffffff812a5562>] submit_bio+0x82/0x170
Aug 12 04:30:05 salt kernel: [<ffffffff811f6a91>] ? bio_alloc_bioset+0xa1/0x1e0
Aug 12 04:30:05 salt kernel: [<ffffffff811f1136>] _submit_bh+0x146/0x220
Aug 12 04:30:05 salt kernel: [<ffffffff811f1220>] submit_bh+0x10/0x20
Aug 12 04:30:05 salt kernel: [<ffffffff811f4b58>] __block_write_full_page+0x1a8/0x340
Aug 12 04:30:05 salt kernel: [<ffffffff811f2110>] ? touch_buffer+0x60/0x60
Aug 12 04:30:05 salt kernel: [<ffffffff811f7ec0>] ? I_BDEV+0x10/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff811f7ec0>] ? I_BDEV+0x10/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff811f2110>] ? touch_buffer+0x60/0x60
Aug 12 04:30:05 salt kernel: [<ffffffff811f4dbd>] block_write_full_page_endio+0xcd/0x110
Aug 12 04:30:05 salt kernel: [<ffffffff811f4e15>] block_write_full_page+0x15/0x20
Aug 12 04:30:05 salt kernel: [<ffffffff811f8fa8>] blkdev_writepage+0x18/0x20
Aug 12 04:30:05 salt kernel: [<ffffffff81156667>] __writepage+0x17/0x50
Aug 12 04:30:05 salt kernel: [<ffffffff81157a64>] write_cache_pages+0x244/0x510
Aug 12 04:30:05 salt kernel: [<ffffffff81156650>] ? set_page_dirty+0x60/0x60
Aug 12 04:30:05 salt kernel: [<ffffffff81157d81>] generic_writepages+0x51/0x80
Aug 12 04:30:05 salt kernel: [<ffffffff81157dd0>] do_writepages+0x20/0x40
Aug 12 04:30:05 salt kernel: [<ffffffff811e80c9>] __writeback_single_inode+0x49/0x230
Aug 12 04:30:05 salt kernel: [<ffffffff810b0d8f>] ? wake_up_bit+0x2f/0x40
Aug 12 04:30:05 salt kernel: [<ffffffff811e8f09>] writeback_sb_inodes+0x279/0x390
Aug 12 04:30:05 salt kernel: [<ffffffff811c1ec5>] ? put_super+0x25/0x40
Aug 12 04:30:05 salt kernel: [<ffffffff811e90be>] __writeback_inodes_wb+0x9e/0xd0
Aug 12 04:30:05 salt kernel: [<ffffffff811e92eb>] wb_writeback+0x1fb/0x2c0
Aug 12 04:30:05 salt kernel: [<ffffffff811e94b0>] wb_do_writeback+0x100/0x1f0
Aug 12 04:30:05 salt kernel: [<ffffffff811e9820>] bdi_writeback_workfn+0x70/0x210
Aug 12 04:30:05 salt kernel: [<ffffffff810881ec>] process_one_work+0x17c/0x420
Aug 12 04:30:05 salt kernel: [<ffffffff81089653>] worker_thread+0x123/0x400
Aug 12 04:30:05 salt kernel: [<ffffffff81089530>] ? manage_workers+0x170/0x170
Aug 12 04:30:05 salt kernel: [<ffffffff8108f12e>] kthread+0xce/0xf0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff8164027c>] ret_from_fork+0x7c/0xb0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: INFO: task mkfs.ext4:29718 blocked for more than 120 seconds.
Aug 12 04:30:05 salt kernel: Not tainted 3.15.8-1.el6.elrepo.x86_64 #1
Aug 12 04:30:05 salt kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 04:30:05 salt kernel: mkfs.ext4 D ffffffff8181d920 0 29718 13448 0x00000080
Aug 12 04:30:05 salt kernel: ffff8801004a75a8 0000000000000086 000000003f958faa ffff8801004a4010
Aug 12 04:30:05 salt kernel: 00000000000144c0 00000000000144c0 ffff8806672e10f0 ffff8808577f91b0
Aug 12 04:30:05 salt kernel: ffff8801004a4010 ffff88010321d1b8 ffff8803dd92a390 ffff8806672e10f0
Aug 12 04:30:05 salt kernel: Call Trace:
Aug 12 04:30:05 salt kernel: [<ffffffff81633da9>] schedule+0x29/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff81633f0e>] schedule_preempt_disabled+0xe/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff81635972>] __mutex_lock_slowpath+0x1c2/0x1d0
Aug 12 04:30:05 salt kernel: [<ffffffff816359a3>] mutex_lock+0x23/0x40
Aug 12 04:30:05 salt kernel: [<ffffffffa04aec1d>] ceph_con_send+0x4d/0x150 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b7694>] __send_queued+0x134/0x180 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b894b>] __ceph_osdc_start_request+0x5b/0xb0 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b89f1>] ceph_osdc_start_request+0x51/0x80 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04db0c0>] rbd_img_obj_request_submit+0xb0/0x110 [rbd]
Aug 12 04:30:05 salt kernel: [<ffffffffa04db169>] rbd_img_request_submit+0x49/0x60 [rbd]
Aug 12 04:30:05 salt kernel: [<ffffffffa04dbb08>] rbd_request_fn+0x258/0x2c0 [rbd]
Aug 12 04:30:05 salt kernel: [<ffffffff812a3c07>] __blk_run_queue+0x37/0x50
Aug 12 04:30:05 salt kernel: [<ffffffff812c725c>] cfq_rq_enqueued+0x18c/0x330
Aug 12 04:30:05 salt kernel: [<ffffffff812c752d>] cfq_insert_request+0x12d/0x260
Aug 12 04:30:05 salt kernel: [<ffffffff812a0f97>] __elv_add_request+0x1c7/0x290
Aug 12 04:30:05 salt kernel: [<ffffffff812a4e56>] blk_flush_plug_list+0x1b6/0x220
Aug 12 04:30:05 salt kernel: [<ffffffff81633e65>] io_schedule+0x75/0xd0
Aug 12 04:30:05 salt kernel: [<ffffffff812a6589>] get_request+0x169/0x350
Aug 12 04:30:05 salt kernel: [<ffffffff812ace74>] ? ll_back_merge_fn+0xb4/0x190
Aug 12 04:30:05 salt kernel: [<ffffffff810b0950>] ? bit_waitqueue+0xe0/0xe0
Aug 12 04:30:05 salt kernel: [<ffffffff812a0dcb>] ? elv_merge+0xeb/0xf0
Aug 12 04:30:05 salt kernel: [<ffffffff812a67f6>] blk_queue_bio+0x86/0x350
Aug 12 04:30:05 salt kernel: [<ffffffff812a54a0>] generic_make_request+0xc0/0x100
Aug 12 04:30:05 salt kernel: [<ffffffff812a5562>] submit_bio+0x82/0x170
Aug 12 04:30:05 salt kernel: [<ffffffff811f6a91>] ? bio_alloc_bioset+0xa1/0x1e0
Aug 12 04:30:05 salt kernel: [<ffffffff811f1136>] _submit_bh+0x146/0x220
Aug 12 04:30:05 salt kernel: [<ffffffff811f1220>] submit_bh+0x10/0x20
Aug 12 04:30:05 salt kernel: [<ffffffff811f4b58>] __block_write_full_page+0x1a8/0x340
Aug 12 04:30:05 salt kernel: [<ffffffff811f2110>] ? touch_buffer+0x60/0x60
Aug 12 04:30:05 salt kernel: [<ffffffff811f7ec0>] ? I_BDEV+0x10/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff811f7ec0>] ? I_BDEV+0x10/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff811f2110>] ? touch_buffer+0x60/0x60
Aug 12 04:30:05 salt kernel: [<ffffffff811f4dbd>] block_write_full_page_endio+0xcd/0x110
Aug 12 04:30:05 salt kernel: [<ffffffff811f4e15>] block_write_full_page+0x15/0x20
Aug 12 04:30:05 salt kernel: [<ffffffff811f8fa8>] blkdev_writepage+0x18/0x20
Aug 12 04:30:05 salt kernel: [<ffffffff81156667>] __writepage+0x17/0x50
Aug 12 04:30:05 salt kernel: [<ffffffff81157a64>] write_cache_pages+0x244/0x510
Aug 12 04:30:05 salt kernel: [<ffffffff81156650>] ? set_page_dirty+0x60/0x60
Aug 12 04:30:05 salt kernel: [<ffffffff81157d81>] generic_writepages+0x51/0x80
Aug 12 04:30:05 salt kernel: [<ffffffff81157dd0>] do_writepages+0x20/0x40
Aug 12 04:30:05 salt kernel: [<ffffffff8114b8f9>] __filemap_fdatawrite_range+0x59/0x60
Aug 12 04:30:05 salt kernel: [<ffffffff8114b9aa>] filemap_write_and_wait_range+0xaa/0x100
Aug 12 04:30:05 salt kernel: [<ffffffff811f91d5>] blkdev_fsync+0x25/0x60
Aug 12 04:30:05 salt kernel: [<ffffffff811ef13e>] vfs_fsync_range+0x1e/0x20
Aug 12 04:30:05 salt kernel: [<ffffffff811ef15c>] vfs_fsync+0x1c/0x20
Aug 12 04:30:05 salt kernel: [<ffffffff811ef34d>] do_fsync+0x3d/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff811ef3b0>] SyS_fsync+0x10/0x20
Aug 12 04:30:05 salt kernel: [<ffffffff81640329>] system_call_fastpath+0x16/0x1b
Aug 12 04:30:05 salt kernel: INFO: task kworker/6:0:29758 blocked for more than 120 seconds.
Aug 12 04:30:05 salt kernel: Not tainted 3.15.8-1.el6.elrepo.x86_64 #1
Aug 12 04:30:05 salt kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 04:30:05 salt kernel: kworker/6:0 D 0000000000000006 0 29758 2 0x00000080
Aug 12 04:30:05 salt kernel: Workqueue: ceph-msgr con_work [libceph]
Aug 12 04:30:05 salt kernel: ffff8802f0b03b78 0000000000000046 ffff8802f0b03b28 ffff8802f0b00010
Aug 12 04:30:05 salt kernel: 00000000000144c0 00000000000144c0 ffff8808265dc250 ffff8808574121d0
Aug 12 04:30:05 salt kernel: 0000003500004040 ffff8808100437c0 ffff8808100437c4 ffff8808265dc250
Aug 12 04:30:05 salt kernel: Call Trace:
Aug 12 04:30:05 salt kernel: [<ffffffff81633da9>] schedule+0x29/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff81633f0e>] schedule_preempt_disabled+0xe/0x10
Aug 12 04:30:05 salt kernel: [<ffffffff8163588b>] __mutex_lock_slowpath+0xdb/0x1d0
Aug 12 04:30:05 salt kernel: [<ffffffff816359a3>] mutex_lock+0x23/0x40
Aug 12 04:30:05 salt kernel: [<ffffffffa04b78df>] get_reply+0x3f/0x200 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b7b28>] alloc_msg+0x88/0x90 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af8e1>] ceph_con_in_msg_alloc+0x71/0x240 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b0b98>] read_partial_message+0x1e8/0x3d0 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04af268>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b21c6>] try_read+0x2b6/0x430 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b2678>] con_work+0x78/0x220 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffff810881ec>] process_one_work+0x17c/0x420
Aug 12 04:30:05 salt kernel: [<ffffffff81089653>] worker_thread+0x123/0x400
Aug 12 04:30:05 salt kernel: [<ffffffff81089530>] ? manage_workers+0x170/0x170
Aug 12 04:30:05 salt kernel: [<ffffffff8108f12e>] kthread+0xce/0xf0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff8164027c>] ret_from_fork+0x7c/0xb0
Aug 12 04:30:05 salt kernel: [<ffffffff8108f060>] ? kthread_freezable_should_stop+0x70/0x70
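
The hung-task reports above are long but repetitive. As a rough way to summarize them, the sketch below maps each blocked task to the first frame in its call trace that is neither a scheduler/mutex helper nor an unreliable `?` frame. The regular expressions and the list of frames to skip are assumptions fitted to this particular log format, and the sample is an abbreviated excerpt of the traces in this report.

```python
from __future__ import annotations

import re

TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than")
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (\? )?(\w+)\+0x")

# Generic frames that only say "the task is sleeping", not where or why.
SKIP_FRAMES = {
    "schedule", "schedule_preempt_disabled", "io_schedule",
    "__mutex_lock_slowpath", "mutex_lock",
}


def blocked_in(log: str) -> dict[str, str]:
    """Map 'task:pid' to the first informative frame of its call trace."""
    result: dict[str, str] = {}
    current = None
    for line in log.splitlines():
        task = TASK_RE.search(line)
        if task:
            current = f"{task.group(1)}:{task.group(2)}"
            continue
        frame = FRAME_RE.search(line)
        if frame and current and current not in result:
            if frame.group(1) or frame.group(2) in SKIP_FRAMES:
                continue  # '?' frame or generic sleep helper
            result[current] = frame.group(2)
    return result


# Abbreviated excerpt of two of the reports above.
SAMPLE = """\
Aug 12 04:30:05 salt kernel: INFO: task kworker/2:2:2967 blocked for more than 120 seconds.
Aug 12 04:30:05 salt kernel: Call Trace:
Aug 12 04:30:05 salt kernel: [<ffffffff81633da9>] schedule+0x29/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff816359a3>] mutex_lock+0x23/0x40
Aug 12 04:30:05 salt kernel: [<ffffffffa04b78df>] get_reply+0x3f/0x200 [libceph]
Aug 12 04:30:05 salt kernel: [<ffffffffa04b7b28>] alloc_msg+0x88/0x90 [libceph]
Aug 12 04:30:05 salt kernel: INFO: task mkfs.ext4:29718 blocked for more than 120 seconds.
Aug 12 04:30:05 salt kernel: Call Trace:
Aug 12 04:30:05 salt kernel: [<ffffffff81633da9>] schedule+0x29/0x70
Aug 12 04:30:05 salt kernel: [<ffffffff816359a3>] mutex_lock+0x23/0x40
Aug 12 04:30:05 salt kernel: [<ffffffffa04aec1d>] ceph_con_send+0x4d/0x150 [libceph]
"""

if __name__ == "__main__":
    for task, function in blocked_in(SAMPLE).items():
        print(task, "->", function)
```

Applied to the full log, this kind of summary shows the pattern visible in the traces: the ceph-msgr workers wait on a mutex in `get_reply`, mkfs.ext4 waits on a mutex in `ceph_con_send`, and the writeback worker waits in `get_request`.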

I'm running Ceph 0.80.5 on CentOS 6 with kernel 3.15.8 (kernel-ml).

While RBD is hung, CephFS mounted with the kernel client on the same host is still accessible.

Actions #1

Updated by Ivan Mironov over 9 years ago

Reproducible on all my Ceph hosts (all with the same kernel), with either image format (1 or 2). It happens only with mkfs.ext4, though; I also tried mkfs.xfs.

Actions #2

Updated by Ivan Mironov over 9 years ago

Please mark this issue as a duplicate of http://tracker.ceph.com/issues/8818

Actions #3

Updated by Sage Weil over 9 years ago

  • Status changed from New to Duplicate

This is a bug in 3.15; it is not present in 3.14. The fix will make it into the next stable 3.15 release soon.
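
To check whether a host runs a kernel from the affected series, a minimal sketch could parse the release string as printed by `uname -r`. The comment above does not say which 3.15.x release carries the fix, so this hypothetical helper conservatively flags the entire 3.15 series; 3.15.8 is the version confirmed as affected in this report.

```python
from __future__ import annotations

import platform
import re


def kernel_series(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '3.15.8-1.el6'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def possibly_affected(release: str) -> bool:
    """True for any 3.15.x kernel (assumption: fixed point release unknown)."""
    return kernel_series(release) == (3, 15)


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r` on Linux
    print(running, "possibly affected:", possibly_affected(running))
```

A positive result only means the kernel belongs to the 3.15 series; whether a given point release already contains the fix has to be checked against the stable changelog.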
