Bug #14548

closed

mira052 MCE

Added by David Galloway about 8 years ago. Updated about 8 years ago.

Status:
Can't reproduce
Priority:
Low
Category:
Test Node
Target version:
-
% Done:
0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

<3>[151160.634220] EDAC i7core: New Corrected error(s): dimm0: +1, dimm1: +0, dimm2 +0
<3>[151169.648881] EDAC i7core: New Corrected error(s): dimm0: +2, dimm1: +0, dimm2 +0
<3>[151233.705374] EDAC i7core: New Corrected error(s): dimm0: +1, dimm1: +0, dimm2 +0
<3>[151290.752639] EDAC i7core: New Corrected error(s): dimm0: +1, dimm1: +0, dimm2 +0
<3>[151335.790633] EDAC i7core: New Corrected error(s): dimm0: +3, dimm1: +0, dimm2 +0
<3>[151373.823463] EDAC i7core: New Corrected error(s): dimm0: +4, dimm1: +0, dimm2 +0
<3>[151420.863103] EDAC i7core: New Corrected error(s): dimm0: +3, dimm1: +0, dimm2 +0
<3>[151568.979022] EDAC i7core: New Corrected error(s): dimm0: +1, dimm1: +0, dimm2 +0
<3>[151601.011253] EDAC i7core: New Corrected error(s): dimm0: +1, dimm1: +0, dimm2 +0
<3>[151629.040387] EDAC i7core: New Corrected error(s): dimm0: +2, dimm1: +0, dimm2 +0
<3>[151634.048130] EDAC i7core: New Corrected error(s): dimm0: +3, dimm1: +0, dimm2 +0
<3>[151724.120300] EDAC i7core: New Corrected error(s): dimm0: +3, dimm1: +0, dimm2 +0
<3>[151742.137963] EDAC i7core: New Corrected error(s): dimm0: +2, dimm1: +0, dimm2 +0
<3>[151789.177528] EDAC i7core: New Corrected error(s): dimm0: +1, dimm1: +0, dimm2 +0
<4>[155346.707212] Disabling lock debugging due to kernel taint
<0>[155346.707224] mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 8: fe002d000001009f
<0>[155346.715874] mce: [Hardware Error]: TSC 152fd22cd9ca3 ADDR 122ddcb00 MISC e3b4783000001283 
<0>[155346.724328] mce: [Hardware Error]: PROCESSOR 0:106e5 TIME 1453137066 SOCKET 0 APIC 0 microcode 3
<0>[155346.733236] mce: [Hardware Error]: Machine check: Processor context corrupt
<0>[155346.740307] Kernel panic - not syncing: Fatal Machine check
[dumpcommon]kdb>   -bt

Stack traceback for pid 25331
0xffff8804282a4800    25331    25308  1    1   R  0xffff8804282a4ce8 *ceph-osd
 ffff88043fc48d50 0000000000000018
Call Trace:
 <#DB>  <<EOE>>  <#MC>  [<ffffffff81102c79>] ? kgdb_panic_event+0x29/0x30
 [<ffffffff8173125c>] ? notifier_call_chain+0x4c/0x70
 [<ffffffff817312ba>] ? atomic_notifier_call_chain+0x1a/0x20
 [<ffffffff8171da17>] ? panic+0xec/0x1d7
 [<ffffffff8171e51e>] ? printk+0x67/0x69
 [<ffffffff81036e5a>] ? mce_panic+0x1fa/0x210
 [<ffffffff81038ca4>] ? do_machine_check+0xaa4/0xab0
 [<ffffffff8172d43f>] ? machine_check+0x1f/0x30
 [<ffffffff81372020>] ? copy_user_generic_string+0x30/0x40
 <<EOE>>  [<ffffffff8171f0ca>] ? __iovec_copy_from_user_inatomic+0x44/0x72
 [<ffffffff8114f296>] ? iov_iter_copy_from_user_atomic+0x86/0x90
 [<ffffffff8114fa43>] ? generic_file_buffered_write+0x133/0x250
 [<ffffffffa0497027>] ? xfs_file_buffered_aio_write+0x107/0x1a0 [xfs]
 [<ffffffffa0497180>] ? xfs_file_aio_write+0xc0/0x120 [xfs]
 [<ffffffff811bde0c>] ? do_sync_readv_writev+0x4c/0x80
 [<ffffffff811bf2d0>] ? do_readv_writev+0xb0/0x220
 [<ffffffffa04970c0>] ? xfs_file_buffered_aio_write+0x1a0/0x1a0 [xfs]
 [<ffffffff811bdd30>] ? do_sync_read+0x90/0x90
 [<ffffffff811c011e>] ? __fput+0x17e/0x260
 [<ffffffff811c024e>] ? ____fput+0xe/0x10
 [<ffffffff811bf4c0>] ? vfs_writev+0x30/0x60
 [<ffffffff811bf5f9>] ? SyS_writev+0x49/0xc0
 [<ffffffff8173575d>] ? system_call_fastpath+0x1a/0x1f
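
For triage, the Bank 8 status word in the MCE line above can be unpacked by hand. The following is a minimal sketch, assuming the architectural IA32_MCi_STATUS layout from the Intel SDM (flag bits 63..57, MCA error code in bits 15:0). The helper name `decode_mci_status` and the dictionary keys are hypothetical, and model-specific fields are deliberately left undecoded.

```python
# Architectural flag bits in the upper part of IA32_MCi_STATUS
# (assumed layout, per the Intel SDM machine-check chapter).
STATUS_FLAGS = {
    63: "VAL",    # register holds valid error information
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # error reporting was enabled for this bank
    59: "MISCV",  # MISC register contents are valid
    58: "ADDRV",  # ADDR register contents are valid
    57: "PCC",    # processor context corrupt
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit MCi_STATUS value into flags and error codes."""
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    mca_code = status & 0xFFFF
    return {
        "flags": flags,
        "mca_error_code": mca_code,
        "model_specific_code": (status >> 16) & 0xFFFF,
        # Memory controller errors use the form 000F 0000 1MMM CCCC;
        # bit 12 (F) is ignored by the mask.
        "memory_controller_error": (mca_code & 0xEF80) == 0x0080,
    }


decoded = decode_mci_status(0xFE002D000001009F)
print(decoded["flags"])
print(hex(decoded["mca_error_code"]), decoded["memory_controller_error"])
```

With the value from this report, UC and PCC are both set, which matches the "Processor context corrupt" message, and the error code has the memory controller form, which is consistent with the corrected errors EDAC logged on dimm0 beforehand.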

Actions #1

Updated by David Galloway about 8 years ago

  • Subject changed from mira054 MCE to mira052 MCE

Actions #2

Updated by David Galloway about 8 years ago

  • Status changed from New to Can't reproduce

memtest passed. Will monitor host for future failures and troubleshoot further if needed.

Reinstalled and released.
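
For the follow-up monitoring, one option is to tally the EDAC corrected-error lines per DIMM from a saved dmesg or console log. This is a rough sketch that assumes the i7core message format shown in the description; the function name `tally_corrected_errors` is hypothetical and other EDAC drivers format their messages differently.

```python
import re
from collections import Counter

# Matches lines such as:
# <3>[151160.634220] EDAC i7core: New Corrected error(s): dimm0: +1, dimm1: +0, dimm2 +0
EDAC_LINE = re.compile(r"EDAC i7core: New Corrected error\(s\): (.*)")
# The colon after the DIMM name is optional because the log prints "dimm2 +0".
DIMM_COUNT = re.compile(r"(dimm\d+):?\s*\+(\d+)")


def tally_corrected_errors(log_text: str) -> Counter:
    """Sum the per-DIMM corrected error increments found in log_text."""
    totals = Counter()
    for line in log_text.splitlines():
        match = EDAC_LINE.search(line)
        if not match:
            continue  # not an EDAC corrected-error line
        for dimm, count in DIMM_COUNT.findall(match.group(1)):
            totals[dimm] += int(count)
    return totals


sample = """\
<3>[151160.634220] EDAC i7core: New Corrected error(s): dimm0: +1, dimm1: +0, dimm2 +0
<3>[151169.648881] EDAC i7core: New Corrected error(s): dimm0: +2, dimm1: +0, dimm2 +0
<4>[155346.707212] Disabling lock debugging due to kernel taint
"""
totals = tally_corrected_errors(sample)
print(totals["dimm0"], totals["dimm1"], totals["dimm2"])
```

Run over the fourteen EDAC lines in the description, the same function gives 28 corrected errors on dimm0 and none on dimm1 or dimm2.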
