
Bug #19309

IO Hang on raw rbd device using libceph kernel module

Added by Sergey Jerusalimov 5 months ago. Updated 4 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: libceph
Target version: -
Start date: 03/19/2017
Due date:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rbd
Release:
Needs Doc: No

Description

Ilya, hi!

We have a recurring problem with the ceph rbd kernel module:
IO gets stuck, and our worker processes end up in D state.

Running "cat /sys/kernel/debug/ceph/*/osdc" hangs completely as well.

Kernel: 3.18.43-40.
All OSDs are working normally.
The rbd map/mount host and the ceph-osd/monitor hosts are different nodes.

I've attached a full kernel trace; please help us :)

full_kern_trace.txt (898 KB) Sergey Jerusalimov, 03/19/2017 09:42 PM

History

#1 Updated by Sergey Jerusalimov 5 months ago

The interesting part is:

Mar 19 22:18:00 kikimr0056 kernel: [288420.742789] Workqueue: events handle_osds_timeout [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.742790] ffff882025e9bce8 0000000000000046 00000be375dc3afe ffff88202616d3c0
Mar 19 22:18:00 kikimr0056 kernel: [288420.742792] 0000000000012b00 ffff88207fc92b70 ffff882025e9bfd8 0000000000012b00
Mar 19 22:18:00 kikimr0056 kernel: [288420.742793] ffff8810291629e0 ffff88202616d3c0 ffff882025e9bd18 ffff882028a84798
Mar 19 22:18:00 kikimr0056 kernel: [288420.742795] Call Trace:
Mar 19 22:18:00 kikimr0056 kernel: [288420.742797] [<ffffffff816dd629>] schedule+0x29/0x70
Mar 19 22:18:00 kikimr0056 kernel: [288420.742798] [<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
Mar 19 22:18:00 kikimr0056 kernel: [288420.742800] [<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
Mar 19 22:18:00 kikimr0056 kernel: [288420.742801] [<ffffffff816df7f6>] mutex_lock+0x36/0x4a
Mar 19 22:18:00 kikimr0056 kernel: [288420.742804] [<ffffffffa047e7b5>] handle_osds_timeout+0x65/0x120 [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.742806] [<ffffffff81084c19>] process_one_work+0x159/0x4f0
Mar 19 22:18:00 kikimr0056 kernel: [288420.742807] [<ffffffff8108561b>] worker_thread+0x11b/0x530
Mar 19 22:18:00 kikimr0056 kernel: [288420.742809] [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
Mar 19 22:18:00 kikimr0056 kernel: [288420.742811] [<ffffffff8108b6f9>] kthread+0xc9/0xe0
Mar 19 22:18:00 kikimr0056 kernel: [288420.742812] [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
Mar 19 22:18:00 kikimr0056 kernel: [288420.742814] [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
Mar 19 22:18:00 kikimr0056 kernel: [288420.742816] [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90

Mar 19 22:18:00 kikimr0056 kernel: [288420.751022] Workqueue: rbd rbd_request_workfn [rbd]
Mar 19 22:18:00 kikimr0056 kernel: [288420.751023] ffff880047a83b38 0000000000000046 ffff881025350c00 ffff8800383fa9e0
Mar 19 22:18:00 kikimr0056 kernel: [288420.751025] 0000000000012b00 0000000000000000 ffff880047a83fd8 0000000000012b00
Mar 19 22:18:00 kikimr0056 kernel: [288420.751026] ffff88014b638860 ffff8800383fa9e0 ffff880047a83b38 ffff8810878dc1b8
Mar 19 22:18:00 kikimr0056 kernel: [288420.751028] Call Trace:
Mar 19 22:18:00 kikimr0056 kernel: [288420.751030] [<ffffffff816dd629>] schedule+0x29/0x70
Mar 19 22:18:00 kikimr0056 kernel: [288420.751031] [<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
Mar 19 22:18:00 kikimr0056 kernel: [288420.751033] [<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
Mar 19 22:18:00 kikimr0056 kernel: [288420.751036] [<ffffffffa048ad66>] ? ceph_str_hash+0x26/0x80 [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.751037] [<ffffffff816df7f6>] mutex_lock+0x36/0x4a
Mar 19 22:18:00 kikimr0056 kernel: [288420.751040] [<ffffffffa04784fd>] ceph_con_send+0x4d/0x130 [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.751043] [<ffffffffa047d3f0>] __send_queued+0x120/0x150 [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.751046] [<ffffffffa047fe7b>] __ceph_osdc_start_request+0x5b/0xd0 [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.751048] [<ffffffffa047ff41>] ceph_osdc_start_request+0x51/0x80 [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.751050] [<ffffffffa04a8050>] rbd_obj_request_submit.isra.27+0x10/0x20 [rbd]
Mar 19 22:18:00 kikimr0056 kernel: [288420.751052] [<ffffffffa04aa6de>] rbd_img_obj_request_submit+0x23e/0x500 [rbd]
Mar 19 22:18:00 kikimr0056 kernel: [288420.751054] [<ffffffffa04aa9ec>] rbd_img_request_submit+0x4c/0x60 [rbd]
Mar 19 22:18:00 kikimr0056 kernel: [288420.751056] [<ffffffffa04ab3d5>] rbd_request_workfn+0x305/0x410 [rbd]
Mar 19 22:18:00 kikimr0056 kernel: [288420.751058] [<ffffffff81084c19>] process_one_work+0x159/0x4f0
Mar 19 22:18:00 kikimr0056 kernel: [288420.751059] [<ffffffff8108561b>] worker_thread+0x11b/0x530
Mar 19 22:18:00 kikimr0056 kernel: [288420.751061] [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
Mar 19 22:18:00 kikimr0056 kernel: [288420.751063] [<ffffffff8108b6f9>] kthread+0xc9/0xe0
Mar 19 22:18:00 kikimr0056 kernel: [288420.751065] [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
Mar 19 22:18:00 kikimr0056 kernel: [288420.751066] [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
Mar 19 22:18:00 kikimr0056 kernel: [288420.751068] [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90

Mar 19 22:18:00 kikimr0056 kernel: [288420.758296] Workqueue: ceph-msgr con_work [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758297] ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
Mar 19 22:18:00 kikimr0056 kernel: [288420.758298] 0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
Mar 19 22:18:00 kikimr0056 kernel: [288420.758300] ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
Mar 19 22:18:00 kikimr0056 kernel: [288420.758301] Call Trace:
Mar 19 22:18:00 kikimr0056 kernel: [288420.758303] [<ffffffff816dd629>] schedule+0x29/0x70
Mar 19 22:18:00 kikimr0056 kernel: [288420.758305] [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
Mar 19 22:18:00 kikimr0056 kernel: [288420.758307] [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
Mar 19 22:18:00 kikimr0056 kernel: [288420.758309] [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
Mar 19 22:18:00 kikimr0056 kernel: [288420.758311] [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
Mar 19 22:18:00 kikimr0056 kernel: [288420.758312] [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
Mar 19 22:18:00 kikimr0056 kernel: [288420.758314] [<ffffffff81086335>] flush_work+0x165/0x250
Mar 19 22:18:00 kikimr0056 kernel: [288420.758315] [<ffffffff81082940>] ? worker_detach_from_pool+0xd0/0xd0
Mar 19 22:18:00 kikimr0056 kernel: [288420.758324] [<ffffffffa03b65b1>] xlog_cil_force_lsn+0x81/0x200 [xfs]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758326] [<ffffffff816d6b42>] ? __slab_free+0xee/0x234
Mar 19 22:18:00 kikimr0056 kernel: [288420.758334] [<ffffffffa03b4b1d>] _xfs_log_force_lsn+0x4d/0x2c0 [xfs]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758336] [<ffffffff811adc1e>] ? lookup_page_cgroup_used+0xe/0x30
Mar 19 22:18:00 kikimr0056 kernel: [288420.758345] [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758352] [<ffffffffa03b4dcf>] xfs_log_force_lsn+0x3f/0xf0 [xfs]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758360] [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758368] [<ffffffffa03a62c6>] xfs_iunpin_wait+0xc6/0x1a0 [xfs]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758370] [<ffffffff810aa250>] ? wake_atomic_t_function+0x40/0x40
Mar 19 22:18:00 kikimr0056 kernel: [288420.758377] [<ffffffffa039a723>] xfs_reclaim_inode+0xa3/0x330 [xfs]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758384] [<ffffffffa039ac07>] xfs_reclaim_inodes_ag+0x257/0x3d0 [xfs]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758392] [<ffffffffa039bb13>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758400] [<ffffffffa03ab745>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758402] [<ffffffff811c0c18>] super_cache_scan+0x178/0x180
Mar 19 22:18:00 kikimr0056 kernel: [288420.758404] [<ffffffff8115912e>] shrink_slab_node+0x14e/0x340
Mar 19 22:18:00 kikimr0056 kernel: [288420.758406] [<ffffffff811afc3b>] ? mem_cgroup_iter+0x16b/0x450
Mar 19 22:18:00 kikimr0056 kernel: [288420.758408] [<ffffffff8115af70>] shrink_slab+0x100/0x140
Mar 19 22:18:00 kikimr0056 kernel: [288420.758410] [<ffffffff8115e425>] do_try_to_free_pages+0x335/0x490
Mar 19 22:18:00 kikimr0056 kernel: [288420.758412] [<ffffffff8115e7f9>] try_to_free_pages+0xb9/0x1f0
Mar 19 22:18:00 kikimr0056 kernel: [288420.758414] [<ffffffff816d56e4>] ? __alloc_pages_direct_compact+0x69/0x1be
Mar 19 22:18:00 kikimr0056 kernel: [288420.758416] [<ffffffff81150cba>] __alloc_pages_nodemask+0x69a/0xb40
Mar 19 22:18:00 kikimr0056 kernel: [288420.758420] [<ffffffff8119743e>] alloc_pages_current+0x9e/0x110
Mar 19 22:18:00 kikimr0056 kernel: [288420.758422] [<ffffffff811a0ac5>] new_slab+0x2c5/0x390
Mar 19 22:18:00 kikimr0056 kernel: [288420.758424] [<ffffffff816d71c4>] __slab_alloc+0x33b/0x459
Mar 19 22:18:00 kikimr0056 kernel: [288420.758427] [<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
Mar 19 22:18:00 kikimr0056 kernel: [288420.758430] [<ffffffff8164bda1>] ? inet_sendmsg+0x71/0xc0
Mar 19 22:18:00 kikimr0056 kernel: [288420.758431] [<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
Mar 19 22:18:00 kikimr0056 kernel: [288420.758433] [<ffffffff811a21f2>] kmem_cache_alloc+0x1a2/0x1b0
Mar 19 22:18:00 kikimr0056 kernel: [288420.758435] [<ffffffff815b906d>] sock_alloc_inode+0x2d/0xd0
Mar 19 22:18:00 kikimr0056 kernel: [288420.758438] [<ffffffff811d8566>] alloc_inode+0x26/0xa0
Mar 19 22:18:00 kikimr0056 kernel: [288420.758440] [<ffffffff811da04a>] new_inode_pseudo+0x1a/0x70
Mar 19 22:18:00 kikimr0056 kernel: [288420.758441] [<ffffffff815b933e>] sock_alloc+0x1e/0x80
Mar 19 22:18:00 kikimr0056 kernel: [288420.758444] [<ffffffff815ba855>] __sock_create+0x95/0x220
Mar 19 22:18:00 kikimr0056 kernel: [288420.758446] [<ffffffff815baa04>] sock_create_kern+0x24/0x30
Mar 19 22:18:00 kikimr0056 kernel: [288420.758448] [<ffffffffa04794d9>] con_work+0xef9/0x2050 [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758451] [<ffffffffa04aa9ec>] ? rbd_img_request_submit+0x4c/0x60 [rbd]
Mar 19 22:18:00 kikimr0056 kernel: [288420.758452] [<ffffffff81084c19>] process_one_work+0x159/0x4f0
Mar 19 22:18:00 kikimr0056 kernel: [288420.758454] [<ffffffff8108561b>] worker_thread+0x11b/0x530
Mar 19 22:18:00 kikimr0056 kernel: [288420.758455] [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
Mar 19 22:18:00 kikimr0056 kernel: [288420.758457] [<ffffffff8108b6f9>] kthread+0xc9/0xe0
Mar 19 22:18:00 kikimr0056 kernel: [288420.758459] [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
Mar 19 22:18:00 kikimr0056 kernel: [288420.758461] [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
Mar 19 22:18:00 kikimr0056 kernel: [288420.758462] [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90

Mar 19 22:18:00 kikimr0056 kernel: [288420.762055] Workqueue: ceph-msgr con_work [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.762056] ffff88014a89fc08 0000000000000046 ffff88014a89fc18 ffff88013a2d90c0
Mar 19 22:18:00 kikimr0056 kernel: [288420.762057] 0000000000012b00 0000000000000000 ffff88014a89ffd8 0000000000012b00
Mar 19 22:18:00 kikimr0056 kernel: [288420.762059] ffff880015a210c0 ffff88013a2d90c0 0000000000000000 ffff882028a84798
Mar 19 22:18:00 kikimr0056 kernel: [288420.762060] Call Trace:
Mar 19 22:18:00 kikimr0056 kernel: [288420.762062] [<ffffffff816dd629>] schedule+0x29/0x70
Mar 19 22:18:00 kikimr0056 kernel: [288420.762064] [<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
Mar 19 22:18:00 kikimr0056 kernel: [288420.762065] [<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
Mar 19 22:18:00 kikimr0056 kernel: [288420.762067] [<ffffffff816df7f6>] mutex_lock+0x36/0x4a
Mar 19 22:18:00 kikimr0056 kernel: [288420.762070] [<ffffffffa047ec1f>] alloc_msg+0xcf/0x210 [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.762072] [<ffffffffa0479c55>] con_work+0x1675/0x2050 [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.762074] [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
Mar 19 22:18:00 kikimr0056 kernel: [288420.762076] [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
Mar 19 22:18:00 kikimr0056 kernel: [288420.762078] [<ffffffff81084c19>] process_one_work+0x159/0x4f0
Mar 19 22:18:00 kikimr0056 kernel: [288420.762079] [<ffffffff8108561b>] worker_thread+0x11b/0x530
Mar 19 22:18:00 kikimr0056 kernel: [288420.762081] [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
Mar 19 22:18:00 kikimr0056 kernel: [288420.762082] [<ffffffff8108b6f9>] kthread+0xc9/0xe0
Mar 19 22:18:00 kikimr0056 kernel: [288420.762084] [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
Mar 19 22:18:00 kikimr0056 kernel: [288420.762086] [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
Mar 19 22:18:00 kikimr0056 kernel: [288420.762088] [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90

Mar 19 22:18:00 kikimr0056 kernel: [288420.765648] Workqueue: ceph-msgr con_work [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.765648] ffff88014c10fc08 0000000000000046 ffff88013a2d9988 ffff88013a2d9920
Mar 19 22:18:00 kikimr0056 kernel: [288420.765650] 0000000000012b00 0000000000000000 ffff88014c10ffd8 0000000000012b00
Mar 19 22:18:00 kikimr0056 kernel: [288420.765651] ffffffff81c1b4a0 ffff88013a2d9920 0000000000000000 ffff882028a84798
Mar 19 22:18:00 kikimr0056 kernel: [288420.765653] Call Trace:
Mar 19 22:18:00 kikimr0056 kernel: [288420.765655] [<ffffffff816dd629>] schedule+0x29/0x70
Mar 19 22:18:00 kikimr0056 kernel: [288420.765656] [<ffffffff816dd906>] schedule_preempt_disabled+0x16/0x20
Mar 19 22:18:00 kikimr0056 kernel: [288420.765658] [<ffffffff816df755>] __mutex_lock_slowpath+0xa5/0x110
Mar 19 22:18:00 kikimr0056 kernel: [288420.765659] [<ffffffff816df7f6>] mutex_lock+0x36/0x4a
Mar 19 22:18:00 kikimr0056 kernel: [288420.765662] [<ffffffffa047ec1f>] alloc_msg+0xcf/0x210 [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.765664] [<ffffffffa0479c55>] con_work+0x1675/0x2050 [libceph]
Mar 19 22:18:00 kikimr0056 kernel: [288420.765667] [<ffffffff810a076c>] ? put_prev_entity+0x3c/0x2e0
Mar 19 22:18:00 kikimr0056 kernel: [288420.765668] [<ffffffff8109b315>] ? sched_clock_cpu+0x95/0xd0
Mar 19 22:18:00 kikimr0056 kernel: [288420.765670] [<ffffffff81084c19>] process_one_work+0x159/0x4f0
Mar 19 22:18:00 kikimr0056 kernel: [288420.765671] [<ffffffff8108561b>] worker_thread+0x11b/0x530
Mar 19 22:18:00 kikimr0056 kernel: [288420.765673] [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
Mar 19 22:18:00 kikimr0056 kernel: [288420.765675] [<ffffffff8108b6f9>] kthread+0xc9/0xe0
Mar 19 22:18:00 kikimr0056 kernel: [288420.765676] [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
Mar 19 22:18:00 kikimr0056 kernel: [288420.765678] [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
Mar 19 22:18:00 kikimr0056 kernel: [288420.765680] [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90

#2 Updated by Ilya Dryomov 5 months ago

  • Status changed from New to Verified

Hi Sergey,

The title says "raw rbd device", but the traces suggest that there is an xfs filesystem on top of the device. Is that so or do you have multiple devices mapped, with one or more of them formatted as xfs?

I think I see the problem -- I'll work on a fix in the next couple of days. The workaround is to lighten the load on the kernel client node to relieve some of the memory pressure which is causing the bug to occur.

#3 Updated by Sergey Jerusalimov 5 months ago

Ilya, hi.

I have three rbd devices mapped on this host; they are formatted as xfs and mounted.
It got stuck on an "rm -rf" operation this time.

#4 Updated by Sergey Jerusalimov 5 months ago

Sorry for the "raw" mistake in the subject.

#5 Updated by Ilya Dryomov 5 months ago

  • Status changed from Verified to In Progress
  • Priority changed from Normal to Urgent

#6 Updated by Ilya Dryomov 5 months ago

  • Status changed from In Progress to Need Review

[PATCH] libceph: force GFP_NOIO for socket allocations

#9 Updated by Ilya Dryomov 4 months ago

  • Status changed from Pending Backport to Resolved

The fix is in 4.4.60, 4.9.21, and 4.10.9.
