Bug #16579

closed

OSDs crash with task_numa_find_cpu with kernel-4.4.13

Added by Vikhyat Umrao almost 8 years ago. Updated almost 8 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Support
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

- ceph-osd processes crash with a kernel divide error in task_numa_find_cpu on kernel 4.4.13.
- The issue does not occur with kernel 4.2.x.

Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547285] divide error: 0000 [#1] SMP
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547305] Modules linked in: xfs libcrc32c mptctl mptbase ipmi_devintf ipmi_ssif intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac edac_core input_leds joydev lpc_ich mei_me mei ioatdma shpchp ipmi_si ipmi_msghandler 8250_fintek 8021q garp mrp stp wmi llc mac_hid bonding lp parport mlx4_en vxlan ip6_udp_tunnel udp_tunnel ses enclosure hid_generic usbhid hid igb mlx4_core mpt3sas isci i2c_algo_bit dca ahci libsas ptp raid_class libahci megaraid_sas pps_core scsi_transport_sas fjes
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547565] CPU: 12 PID: 31801 Comm: ceph-osd Not tainted 4.4.13-040413-generic #201606072354
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547582] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2 03/04/2015
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547600] task: ffff881057536e00 ti: ffff880fac400000 task.ti: ffff880fac400000
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547614] RIP: 0010:[<ffffffff810b416e>]  [<ffffffff810b416e>] task_numa_find_cpu+0x22e/0x6f0
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547641] RSP: 0000:ffff880fac403bd8  EFLAGS: 00010206
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547667] RAX: 0000000000000000 RBX: ffff880fac403c78 RCX: 0000000000000000
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547696] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880e8fb71800
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547726] RBP: ffff880fac403c40 R08: 0000000101293c9f R09: 0000000000ff00ff
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547755] R10: 0000000000000015 R11: 0000000000000007 R12: 0000000000000000
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547785] R13: 0000000000000001 R14: 00000000000002e8 R15: 00000000000002e9
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547815] FS:  00007fae4bcb6700(0000) GS:ffff88105f300000(0000) knlGS:0000000000000000
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547860] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547887] CR2: 0000000037ba1c00 CR3: 00000004912e7000 CR4: 00000000000406e0
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547916] Stack:
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547937]  ffff880fac403c48 ffff880fac403c40 0000000000000000 ffff881057536e00
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.547988]  00000000000002e9 fffffffffffffe79 0000000000016b00 00000000000002e9
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548039]  ffff881057536e00 ffff880fac403c78 00000000000002b7 000000000000007f
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548090] Call Trace:
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548116]  [<ffffffff810b4a6e>] task_numa_migrate+0x43e/0x9b0
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548145]  [<ffffffff810b5059>] numa_migrate_preferred+0x79/0x80
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548174]  [<ffffffff810b9b94>] task_numa_fault+0x7f4/0xd40
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548203]  [<ffffffff810b9205>] ? should_numa_migrate_memory+0x55/0x130
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548235]  [<ffffffff811bd590>] handle_mm_fault+0xbc0/0x1820
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548264]  [<ffffffff816e0ff4>] ? SYSC_recvfrom+0x144/0x160
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548294]  [<ffffffff8106a537>] __do_page_fault+0x197/0x400
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548322]  [<ffffffff8106a7c2>] do_page_fault+0x22/0x30
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548350]  [<ffffffff8180a878>] page_fault+0x28/0x30
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548376] Code: 55 b0 4c 89 f7 e8 53 cd ff ff 48 8b 55 b0 49 8b 4e 78 48 8b 82 d8 01 00 00 48 83 c1 01 31 d2 49 0f af 86 b0 00 00 00 4c 8b 73 78 <48> f7 f1 48 8b 4b 20 49 89 c0 48 29 c1 48 8b 45 d0 4c 03 43 48
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548601] RIP  [<ffffffff810b416e>] task_numa_find_cpu+0x22e/0x6f0
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548632]  RSP <ffff880fac403bd8>
Jun 26 13:27:27 roc05r-sca110 kernel: [78218.548978] ---[ end trace 70d1647d8dd2339a ]---

- Ceph Version: 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)
- Kernel Version: 4.4.13-040413-generic
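
For triage, the oops in the description can be read mechanically. The following is a minimal Python sketch, not part of the original report, that pulls the faulting symbol, the register dump, and the instruction bytes marked with `<..>` on the `Code:` line, then checks whether those bytes are a 64-bit `div` and which register holds the divisor. The helper names (`faulting_symbol`, `parse_registers`, `faulting_bytes`, `div_divisor`) are hypothetical, the sample lines are copied from the log above with the syslog prefix trimmed, and the decoder only handles the one encoding seen here (`48 f7 /6` with a register operand).

```python
import re

# Lines copied from the oops in this report (syslog host/time prefix trimmed).
OOPS = """\
[78218.547285] divide error: 0000 [#1] SMP
[78218.547614] RIP: 0010:[<ffffffff810b416e>]  [<ffffffff810b416e>] task_numa_find_cpu+0x22e/0x6f0
[78218.547667] RAX: 0000000000000000 RBX: ffff880fac403c78 RCX: 0000000000000000
[78218.548376] Code: 55 b0 4c 89 f7 e8 53 cd ff ff 48 8b 55 b0 49 8b 4e 78 48 8b 82 d8 01 00 00 48 83 c1 01 31 d2 49 0f af 86 b0 00 00 00 4c 8b 73 78 <48> f7 f1 48 8b 4b 20 49 89 c0 48 29 c1 48 8b 45 d0 4c 03 43 48
"""

# Register order used by the ModRM r/m field when no REX.B bit is set.
GPR_NAMES = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"]


def faulting_symbol(text):
    """Return (symbol, offset, size) from the RIP line, or None."""
    m = re.search(r"RIP:.*\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text)
    if not m:
        return None
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


def parse_registers(text):
    """Return {register name: value} for every 'NAME: <16 hex digits>' pair."""
    pairs = re.findall(r"\b([A-Z0-9]{2,3}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def faulting_bytes(text, count=3):
    """Return the first `count` bytes starting at the <..> marker on the Code line."""
    m = re.search(r"Code:.*<([0-9a-f]{2})>((?: [0-9a-f]{2})*)", text)
    if not m:
        return []
    raw = [m.group(1)] + m.group(2).split()
    return [int(b, 16) for b in raw[:count]]


def div_divisor(insn):
    """If insn is 'REX.W F7 /6' with a register operand (64-bit div), name the divisor."""
    if len(insn) < 3 or insn[0] != 0x48 or insn[1] != 0xF7:
        return None
    modrm = insn[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 3 or reg != 6:  # mod 3 = register operand, /6 = DIV
        return None
    return GPR_NAMES[rm]


if __name__ == "__main__":
    regs = parse_registers(OOPS)
    divisor = div_divisor(faulting_bytes(OOPS))
    print("faulting symbol:", faulting_symbol(OOPS))
    print("divisor register:", divisor)
    if divisor is not None:
        print("divisor value:", hex(regs[divisor]))
```

Run against the lines above, this reports `RCX` as the divisor, and the register dump shows `RCX` as zero, which matches the `divide error` header. It does not by itself say why the scheduler's NUMA balancing code reached that division with a zero value.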
