Bug #17153
closed
kernel hung task warnings on teuthology.front kernel
% Done:
0%
Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):
Description
teuthology.front.sepia.ceph.com was recently upgraded to the Ubuntu distro kernel (4.4.0-34-generic). Since then, I've been seeing hung task warnings in the kernel ring buffer:
[998531.323457] INFO: task nginx:2786 blocked for more than 120 seconds.
[998531.323489] Not tainted 4.4.0-34-generic #53-Ubuntu
[998531.323506] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[998531.323529] nginx D ffff880819a4b968 0 2786 2774 0x00000000
[998531.323533] ffff880819a4b968 ffff88081ec00d80 ffff88081bb6e040 ffff880813a26040
[998531.323536] ffff880819a4c000 ffff88081f3d6d00 7fffffffffffffff ffff880819a4bae0
[998531.323538] ffffffff8182a610 ffff880819a4b980 ffffffff81829e15 0000000000000000
[998531.323540] Call Trace:
[998531.323564] [<ffffffff8182a610>] ? bit_wait+0x60/0x60
[998531.323567] [<ffffffff81829e15>] schedule+0x35/0x80
[998531.323569] [<ffffffff8182cf35>] schedule_timeout+0x1b5/0x270
[998531.323573] [<ffffffff811eb0a9>] ? get_partial_node.isra.61+0x1c9/0x200
[998531.323577] [<ffffffff8106426e>] ? kvm_clock_get_cycles+0x1e/0x20
[998531.323579] [<ffffffff8182a610>] ? bit_wait+0x60/0x60
[998531.323581] [<ffffffff81829344>] io_schedule_timeout+0xa4/0x110
[998531.323583] [<ffffffff8182a62b>] bit_wait_io+0x1b/0x70
[998531.323585] [<ffffffff8182a3ae>] __wait_on_bit_lock+0x4e/0xb0
[998531.323590] [<ffffffff8118d43b>] __lock_page+0xbb/0xe0
[998531.323594] [<ffffffff810c3ce0>] ? autoremove_wake_function+0x40/0x40
[998531.323599] [<ffffffff8123f981>] __generic_file_splice_read+0x4b1/0x5c0
[998531.323601] [<ffffffff8123e2f0>] ? page_cache_pipe_buf_release+0x20/0x20
[998531.323605] [<ffffffff8179dd93>] ? inet_sendpage+0x73/0xd0
[998531.323607] [<ffffffff8123fe72>] generic_file_splice_read+0x42/0x80
[998531.323609] [<ffffffff8123e779>] do_splice_to+0x69/0x80
[998531.323611] [<ffffffff8123e84a>] splice_direct_to_actor+0xba/0x210
[998531.323613] [<ffffffff8123e200>] ? do_splice_from+0x30/0x30
[998531.323615] [<ffffffff8123ea38>] do_splice_direct+0x98/0xd0
[998531.323618] [<ffffffff8120dd2f>] do_sendfile+0x1bf/0x3a0
[998531.323620] [<ffffffff8120e96e>] SyS_sendfile64+0x5e/0xb0
[998531.323622] [<ffffffff8182def2>] entry_SYSCALL_64_fastpath+0x16/0x71
[998531.323672] INFO: task teuthology:26992 blocked for more than 120 seconds.
[998531.323693] Not tainted 4.4.0-34-generic #53-Ubuntu
[998531.323710] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[998531.323732] teuthology D ffff8802df4379a8 0 26992 25730 0x00000000
[998531.323734] ffff8802df4379a8 ffff88081549e068 ffff880812fc0000 ffff8805e743d280
[998531.323736] ffff8802df438000 ffff88081f596d00 7fffffffffffffff ffff8802df437b20
[998531.323738] ffffffff8182a610 ffff8802df4379c0 ffffffff81829e15 0000000000000000
[998531.323740] Call Trace:
[998531.323743] [<ffffffff8182a610>] ? bit_wait+0x60/0x60
[998531.323745] [<ffffffff81829e15>] schedule+0x35/0x80
[998531.323746] [<ffffffff8182cf35>] schedule_timeout+0x1b5/0x270
[998531.323750] [<ffffffff81405dd5>] ? find_next_bit+0x15/0x20
[998531.323754] [<ffffffff813f0fbf>] ? cpumask_next_and+0x2f/0x40
[998531.323756] [<ffffffff810bcb65>] ? update_sd_lb_stats+0x115/0x520
[998531.323758] [<ffffffff8106426e>] ? kvm_clock_get_cycles+0x1e/0x20
[998531.323760] [<ffffffff8182a610>] ? bit_wait+0x60/0x60
[998531.323762] [<ffffffff81829344>] io_schedule_timeout+0xa4/0x110
[998531.323764] [<ffffffff8182a62b>] bit_wait_io+0x1b/0x70
[998531.323766] [<ffffffff8182a3ae>] __wait_on_bit_lock+0x4e/0xb0
[998531.323768] [<ffffffff8118d43b>] __lock_page+0xbb/0xe0
[998531.323771] [<ffffffff810c3ce0>] ? autoremove_wake_function+0x40/0x40
[998531.323773] [<ffffffff8118e79d>] pagecache_get_page+0x17d/0x1c0
[998531.323788] [<ffffffffc039233a>] ? ceph_pool_perm_check+0x5a/0x700 [ceph]
[998531.323791] [<ffffffff8118e806>] grab_cache_page_write_begin+0x26/0x40
[998531.323797] [<ffffffffc0391638>] ceph_write_begin+0x48/0xe0 [ceph]
[998531.323799] [<ffffffff8118db4e>] generic_perform_write+0xce/0x1c0
[998531.323803] [<ffffffff812282a9>] ? file_update_time+0xc9/0x110
[998531.323809] [<ffffffffc038c16a>] ceph_write_iter+0xf8a/0x1050 [ceph]
[998531.323812] [<ffffffff8122dbc4>] ? mntput+0x24/0x40
[998531.323814] [<ffffffff8121777d>] ? terminate_walk+0xbd/0xd0
[998531.323817] [<ffffffff8121ce11>] ? filename_lookup+0xf1/0x180
[998531.323819] [<ffffffff811ebac7>] ? kmem_cache_alloc+0x187/0x1f0
[998531.323821] [<ffffffff8121c9d6>] ? getname_flags+0x56/0x1f0
[998531.323823] [<ffffffff8120c97b>] new_sync_write+0x9b/0xe0
[998531.323825] [<ffffffff8120c9e6>] __vfs_write+0x26/0x40
[998531.323827] [<ffffffff8120d369>] vfs_write+0xa9/0x1a0
[998531.323828] [<ffffffff8120e025>] SyS_write+0x55/0xc0
[998531.323830] [<ffffffff8182def2>] entry_SYSCALL_64_fastpath+0x16/0x71
[998531.323833] INFO: task teuthology:2191 blocked for more than 120 seconds.
[998531.323853] Not tainted 4.4.0-34-generic #53-Ubuntu
[998531.323869] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[998531.323892] teuthology D ffff880005c7b9a8 0 2191 6398 0x00000000
[998531.323894] ffff880005c7b9a8 ffff88081549e068 ffff880620cce040 ffff880035c9d280
[998531.323896] ffff880005c7c000 ffff88081f5d6d00 7fffffffffffffff ffff880005c7bb20
[998531.323898] ffffffff8182a610 ffff880005c7b9c0 ffffffff81829e15 0000000000000000
[998531.323900] Call Trace:
[998531.323902] [<ffffffff8182a610>] ? bit_wait+0x60/0x60
[998531.323904] [<ffffffff81829e15>] schedule+0x35/0x80
[998531.323906] [<ffffffff8182cf35>] schedule_timeout+0x1b5/0x270
[998531.323908] [<ffffffff81749bf4>] ? sch_direct_xmit+0x74/0x220
[998531.323910] [<ffffffff810ca961>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[998531.323912] [<ffffffff8106426e>] ? kvm_clock_get_cycles+0x1e/0x20
[998531.323914] [<ffffffff8182a610>] ? bit_wait+0x60/0x60
[998531.323916] [<ffffffff81829344>] io_schedule_timeout+0xa4/0x110
[998531.323918] [<ffffffff8182a62b>] bit_wait_io+0x1b/0x70
[998531.323920] [<ffffffff8182a3ae>] __wait_on_bit_lock+0x4e/0xb0
[998531.323922] [<ffffffff8118d43b>] __lock_page+0xbb/0xe0
[998531.323924] [<ffffffff810c3ce0>] ? autoremove_wake_function+0x40/0x40
[998531.323926] [<ffffffff8118e79d>] pagecache_get_page+0x17d/0x1c0
[998531.323933] [<ffffffffc039233a>] ? ceph_pool_perm_check+0x5a/0x700 [ceph]
[998531.323935] [<ffffffff8118e806>] grab_cache_page_write_begin+0x26/0x40
[998531.323940] [<ffffffffc0391638>] ceph_write_begin+0x48/0xe0 [ceph]
[998531.323943] [<ffffffff8118db4e>] generic_perform_write+0xce/0x1c0
[998531.323945] [<ffffffff812282a9>] ? file_update_time+0xc9/0x110
[998531.323950] [<ffffffffc038c16a>] ceph_write_iter+0xf8a/0x1050 [ceph]
[998531.323952] [<ffffffff8122dbc4>] ? mntput+0x24/0x40
[998531.323955] [<ffffffff8121777d>] ? terminate_walk+0xbd/0xd0
[998531.323957] [<ffffffff8121ce11>] ? filename_lookup+0xf1/0x180
[998531.323959] [<ffffffff811ebac7>] ? kmem_cache_alloc+0x187/0x1f0
[998531.323962] [<ffffffff8121c9d6>] ? getname_flags+0x56/0x1f0
[998531.323964] [<ffffffff8120c97b>] new_sync_write+0x9b/0xe0
[998531.323966] [<ffffffff8120c9e6>] __vfs_write+0x26/0x40
[998531.323968] [<ffffffff8120d369>] vfs_write+0xa9/0x1a0
[998531.323970] [<ffffffff8120e025>] SyS_write+0x55/0xc0
[998531.323972] [<ffffffff8182def2>] entry_SYSCALL_64_fastpath+0x16/0x71
I don't have sudo permissions on this box, so I can't verify whether the tasks are still hung, but I see several processes with matching PIDs that have been running for a very long time.
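One way to check the state of those PIDs without sudo might look like the sketch below. It pulls the task name and PID out of each "blocked for more than" line and reads the `State:` field from `/proc/<pid>/status`. This assumes `/proc/<pid>/status` is readable by unprivileged users on this host (it usually is, unless `/proc` is mounted with `hidepid`). The helper names `parse_hung_tasks` and `task_state` are made up for illustration. A state of `D` would mean the task is still in uninterruptible sleep.

```python
import re
from pathlib import Path

# Matches the hung-task header line as it appears in the log above.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def parse_hung_tasks(log_text):
    """Return (comm, pid, seconds) for each hung-task warning in log_text."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


def task_state(pid, proc_root="/proc"):
    """Return the one-letter state from <proc_root>/<pid>/status, or None if unreadable."""
    try:
        text = Path(proc_root, str(pid), "status").read_text()
    except OSError:
        # Process is gone, or we are not allowed to read its status file.
        return None
    for line in text.splitlines():
        if line.startswith("State:"):
            # Line looks like "State:\tD (disk sleep)"; keep only the letter.
            return line.split()[1]
    return None


if __name__ == "__main__":
    sample = (
        "[998531.323457] INFO: task nginx:2786 blocked for more than 120 seconds.\n"
        "[998531.323672] INFO: task teuthology:26992 blocked for more than 120 seconds.\n"
    )
    for comm, pid, secs in parse_hung_tasks(sample):
        print(comm, pid, secs, task_state(pid))
```

Note that a PID may have been reused by an unrelated process since the warning was logged, so the task name should be compared as well before drawing conclusions.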