Bug #18092

closed

unittest_erasure_code_shec illegal instruction

Added by John Coyle over 7 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

build@247e7c628ef2:~/ceph-ubuntu-14.04-build/build$ ./bin/unittest_erasure_code_shec
[==========] Running 71 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 71 tests from ErasureCodeShec
[ RUN ] ErasureCodeShec.init_1
[ OK ] ErasureCodeShec.init_1 (2 ms)
[ RUN ] ErasureCodeShec.init_2
[ OK ] ErasureCodeShec.init_2 (0 ms)
[ RUN ] ErasureCodeShec.init_3
[ OK ] ErasureCodeShec.init_3 (1 ms)
[ RUN ] ErasureCodeShec.init_4
*** Caught signal (Illegal instruction) **
    in thread 7f2bb6eabd00 thread_name:unittest_erasur
    ceph version 11.0.2-2111-gb3e2719 (b3e2719abddc349e2df6327256c461ba9b779fcc)
    1: (()+0x191282) [0x558167322282]
    2: (()+0x10330) [0x7f2bb6a90330]
    3: (()+0x15cb1) [0x7f2bb6841cb1]
    4: (reed_sol_extended_vandermonde_matrix()+0x12b) [0x7f2bb685cadb]
    5: (reed_sol_big_vandermonde_distribution_matrix()+0x32) [0x7f2bb685cb52]
    6: (reed_sol_vandermonde_coding_matrix()+0x1a) [0x7f2bb685cfaa]
    7: (ErasureCodeShec::shec_reedsolomon_coding_matrix(int)+0xee) [0x7f2bb686213e]
    8: (ErasureCodeShecReedSolomonVandermonde::prepare()+0x233) [0x7f2bb6863473]
    9: (ErasureCodeShec::init(std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > >&, std::ostream*)+0x150) [0x7f2bb6866d60]
    10: (ErasureCodeShec_init_4_Test::TestBody()+0x473) [0x5581672d59f3]
    11: (void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x33) [0x558167319c73]
    12: (testing::Test::Run()+0xb7) [0x55816730cd27]
    13: (testing::TestInfo::Run()+0x9e) [0x55816730cdce]
    14: (testing::TestCase::Run()+0xa5) [0x55816730ced5]
    15: (testing::internal::UnitTestImpl::RunAllTests()+0x248) [0x55816730d188]
    16: (testing::UnitTest::Run()+0x54) [0x55816730d444]
    17: (main()+0x11c) [0x5581672b4e7c]
    18: (__libc_start_main()+0xf5) [0x7f2bb50caf45]
    19: (()+0x126526) [0x5581672b7526]
    2016-11-30 15:41:29.113094 7f2bb6eabd00 -1 *** Caught signal (Illegal instruction) **
    in thread 7f2bb6eabd00 thread_name:unittest_erasur

I think the illegal instruction is triggered here:

https://github.com/ceph/jerasure/blob/02731df4c1eae1819c4453c9d3ab6d408cadd085/src/galois.c#L262

Actions #1

Updated by John Coyle over 7 years ago

Caused by a PCLMUL misdetection: gf_cpu_identify() reports PCLMUL support, but the CPU flags below do not include pclmulqdq.

gf_cpu_identify() output:

#gf_cpu_supports_intel_pclmul
#gf_cpu_supports_intel_sse4
#gf_cpu_supports_intel_ssse3
#gf_cpu_supports_intel_sse3
#gf_cpu_supports_intel_sse2

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi flexpriority ept vpid

With https://github.com/ceph/gf-complete/pull/9 applied, gf_cpu_identify() output:

#gf_cpu_supports_intel_sse4
#gf_cpu_supports_intel_ssse3
#gf_cpu_supports_intel_sse3
#gf_cpu_supports_intel_sse2
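
For anyone who wants to cross-check a detection result like the ones in this comment, here is a minimal Python sketch that compares a list of reported features against a /proc/cpuinfo flags line. The function name `misdetected`, the `FEATURE_FLAGS` table, and the short feature labels are hypothetical and are not part of gf-complete; the flag names are the ones Linux prints (SSE3 appears as `pni`, PCLMUL as `pclmulqdq`).

```python
# Hypothetical mapping from short feature labels to Linux /proc/cpuinfo flag names.
FEATURE_FLAGS = {
    "pclmul": ("pclmulqdq",),
    "sse4": ("sse4_1", "sse4_2"),
    "ssse3": ("ssse3",),
    "sse3": ("pni",),  # the kernel reports SSE3 as "pni"
    "sse2": ("sse2",),
}

# Flags line copied from this comment.
FLAGS = (
    "flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat "
    "pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm "
    "constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc "
    "aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca "
    "sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi flexpriority ept vpid"
)


def misdetected(reported, flags_line):
    """Return the reported features that have no matching flag in flags_line."""
    # Drop the "flags :" prefix if present, then split on whitespace.
    _, _, value = flags_line.rpartition(":")
    flags = set(value.split())
    return [
        feature
        for feature in reported
        if not any(name in flags for name in FEATURE_FLAGS[feature])
    ]


before_fix = ["pclmul", "sse4", "ssse3", "sse3", "sse2"]
after_fix = ["sse4", "ssse3", "sse3", "sse2"]

print(misdetected(before_fix, FLAGS))
print(misdetected(after_fix, FLAGS))
```

Run against the flags above, the first call flags only `pclmul` and the second returns an empty list, which matches the two gf_cpu_identify() outputs quoted in this comment.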

Actions #2

Updated by Loïc Dachary over 7 years ago

  • Status changed from New to In Progress
  • Assignee set to Loïc Dachary
  • Priority changed from Normal to Urgent
Actions #3

Updated by Loïc Dachary over 7 years ago

Actions #4

Updated by Loïc Dachary over 7 years ago

  • Status changed from In Progress to Fix Under Review

https://github.com/ceph/ceph/pull/12382

For the record, there is no need for a backport because the root cause was introduced after Jewel.

Actions #5

Updated by Sage Weil over 7 years ago

  • Status changed from Fix Under Review to Resolved