Bug #7826: osd: illegal instruction in jerasure
Status: Closed (90% done)
Description
 -2> 2014-03-23 18:34:04.364656 7fc983691700 10 osd.3 pg_epoch: 16 pg[3.2s0( empty local-les=8 n=1 ec=7 les/c 8/8 7/7/7) [3,5,0] r=0 lpr=7 crt=0'0 mlcod 0'0 active+clean] submit_transaction: op Op(1d10095a/plana1826792-40/head//3 v=16'1 tt=0'0 tid=4 reqid=osd.1.3:53 client_op=osd_op(osd.1.3:53 plana1826792-40 [copy-from ver 70] 3.1d10095a ondisk+write+ignore_overlay e16) pending_commit= pending_apply=) starting
 -1> 2014-03-23 18:34:04.364770 7fc983691700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
  0> 2014-03-23 18:34:04.369778 7fc983691700 -1 *** Caught signal (Illegal instruction) **
 in thread 7fc983691700

 ceph version 0.78-314-g43ddf14 (43ddf14b64b386fa68bbbab84afeb0c17bd86d33)
 1: ceph-osd() [0x9809da]
 2: (()+0x364a0) [0x7fc999c414a0]
 3: (jerasure_matrix_dotprod()+0xb4) [0x7fc97d6dd5d4]
 4: (jerasure_matrix_encode()+0x72) [0x7fc97d6dd812]
 5: (ErasureCodeJerasureReedSolomonVandermonde::jerasure_encode(char**, char**, int)+0x24) [0x7fc97d6ffac4]
 6: (ErasureCodeJerasure::encode(std::set<int, std::less<int>, std::allocator<int> > const&, ceph::buffer::list const&, std::map<int, ceph::buffer::list, std::less<int>, std::allocator<std::pair<int const, ceph::buffer::list> > >*)+0x576) [0x7fc97d7049c6]
 7: (ECUtil::encode(ECUtil::stripe_info_t const&, std::tr1::shared_ptr<ceph::ErasureCodeInterface>&, ceph::buffer::list&, std::set<int, std::less<int>, std::allocator<int> > const&, std::map<int, ceph::buffer::list, std::less<int>, std::allocator<std::pair<int const, ceph::buffer::list> > >*)+0x193) [0x93c0d3]
 8: (TransGenerator::operator()(ECTransaction::AppendOp const&)+0x261) [0x93a0a1]
 9: (ECTransaction::generate_transactions(std::map<hobject_t, std::tr1::shared_ptr<ECUtil::HashInfo>, std::less<hobject_t>, std::allocator<std::pair<hobject_t const, std::tr1::shared_ptr<ECUtil::HashInfo> > > >&, std::tr1::shared_ptr<ceph::ErasureCodeInterface>&, pg_t, ECUtil::stripe_info_t const&, std::map<unsigned char, ObjectStore::Transaction, std::less<unsigned char>, std::allocator<std::pair<unsigned char const, ObjectStore::Transaction> > >*, std::set<hobject_t, std::less<hobject_t>, std::allocator<hobject_t> >*, std::set<hobject_t, std::less<hobject_t>, std::allocator<hobject_t> >*, std::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >*) const+0x17f) [0x938f4f]
 10: (ECBackend::start_write(ECBackend::Op*)+0x53f) [0x91486f]
 11: (ECBackend::submit_transaction(hobject_t const&, eversion_t const&, PGBackend::PGTransaction*, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&, Context*, Context*, Context*, unsigned long, osd_reqid_t, std::tr1::shared_ptr<OpRequest>)+0xe9b) [0x9228ab]
 12: (ReplicatedPG::issue_repop(ReplicatedPG::RepGather*, utime_t)+0x7f7) [0x7d46d7]
 13: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x1a35) [0x811315]
 14: (CopyFromCallback::finish(boost::tuples::tuple<int, ReplicatedPG::CopyResults*, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>)+0x4a) [0x8550ca]
 15: (GenContext<boost::tuples::tuple<int, ReplicatedPG::CopyResults*, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type> >::complete(boost::tuples::tuple<int, ReplicatedPG::CopyResults*, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>)+0x15) [0x82c425]
 16: (ReplicatedPG::process_copy_chunk(hobject_t, unsigned long, int)+0xb5d) [0x7f5aad]
 17: (C_Copyfrom::finish(int)+0xa7) [0x853d07]
 18: (Context::complete(int)+0x9) [0x66e799]
 19: (Finisher::finisher_thread_entry()+0x1c0) [0x9a2ee0]
 20: (()+0x7e9a) [0x7fc99b73ee9a]
 21: (clone()+0x6d) [0x7fc999cff3fd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Sage Weil about 10 years ago
- Status changed from 12 to Fix Under Review
Updated by Sage Weil about 10 years ago
- Status changed from Fix Under Review to 12
Updated by Loïc Dachary about 10 years ago
It would be helpful to have the config yaml used to trigger this.
ubuntu@teuthology:/a$ find teuthology-2014-03-{19,21}*rados-master* -name teuthology.log | xargs grep ErasureCodeJerasureReedSolomonVandermonde
found nothing.
Updated by Loïc Dachary about 10 years ago
Work in progress to run teuthology workloads.
Updated by Loïc Dachary about 10 years ago
Fixed a few things in gf_complete, but none of that looks like it could cause an illegal instruction.
Updated by Sage Weil about 10 years ago
This is what I triggered it with. Note that you need to adjust the branch to be something that doesn't include the workaround, and that it will crash later due to another bug (#7823) if jerasure is working again.
interactive-on-error: true
overrides:
  admin_socket:
    branch: master
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
        osd op thread timeout: 60
        osd sloppy crc: true
    fs: btrfs
    log-whitelist:
    - slow request
  install:
    ceph:
      branch: wip-ec-copy-get
  s3tests:
    branch: master
owner: scheduled_sage@flab
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
  - client.0
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.1
tasks:
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 2
    chance_pgpnum_fix: 1
    timeout: 1200
- exec:
    client.0:
    - ceph osd erasure-code-profile set teuthologyprofile ruleset-failure-domain=osd m=1 k=2
    - ceph osd pool create base 4 erasure teuthologyprofile
    - ceph osd pool create cache 4
    - ceph osd tier add base cache
    - ceph osd tier cache-mode cache writeback
    - ceph osd tier set-overlay base cache
    - ceph osd pool set cache hit_set_type bloom
    - ceph osd pool set cache hit_set_count 8
    - ceph osd pool set cache hit_set_period 60
    - ceph osd pool set cache target_max_objects 250
- rados:
    clients:
    - client.0
    max_in_flight: 64
    max_seconds: 600
    objects: 1024
    op_weights:
      append: 100
      copy_from: 50
      delete: 50
      read: 100
      rmattr: 25
      rollback: 50
      setattr: 25
      snap_create: 50
      snap_remove: 50
      write: 0
    ops: 400000
    pools:
    - base
    size: 16384
Updated by Loïc Dachary about 10 years ago
running this teuthology workload on http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-deb-precise-amd64-basic/log.cgi?log=236a505d81382aae2a29d616ffd471b380f580af
Updated by Loïc Dachary about 10 years ago
ceph osd pool create base 4 erasure teuthologyprofile
should be
ceph osd pool create base 4 *4* erasure teuthologyprofile
and
size: 16384
should be
object_size: 16384
Updated by Loïc Dachary about 10 years ago
rm -fr dev out ; mkdir -p dev
LC_ALL=C MON=1 OSD=6 bash -x ./vstart.sh -d -n -X -l mon osd
ceph osd erasure-code-profile set teuthologyprofile ruleset-failure-domain=osd m=1 k=2
ceph osd pool create base 4 4 erasure teuthologyprofile
ceph osd pool create cache 4
ceph osd tier add base cache
ceph osd tier cache-mode cache writeback
ceph osd tier set-overlay base cache
ceph osd pool set cache hit_set_type bloom
ceph osd pool set cache hit_set_count 8
ceph osd pool set cache hit_set_period 60
ceph osd pool set cache target_max_objects 250
./ceph_test_rados --max-in-flight 1 --max-seconds 600 --objects 1024 --op append 100 --op copy_from 50 --op delete 50 --op read 100 --op rmattr 25 --op rollback 50 --op setattr 25 --op snap_create 50 --op snap_remove 50 --op write 0 --max-ops 10000 --size 16384 --pool base
crashes with
osd/ReplicatedPG.cc: 5551: FAILED assert(cop->data.length() + cop->temp_cursor.data_offset == cop->cursor.data_offset)
as in http://tracker.ceph.com/issues/7823
using https://github.com/ceph/ceph/tree/wip-sse-fix
Now trying to trigger the "illegal instruction".
Updated by Loïc Dachary about 10 years ago
On master ( afcf016a8a814167ced178d8a4275fa2a6f94a4d ) it fails in the same way using the set of commands above, with or without AX_SSE() in configure.ac. Maybe the binary should be run on a CPU that does not have SSE extensions to trigger the problem.
Updated by Loïc Dachary about 10 years ago
I thought that maybe
$ ceph osd pool create base 4 erasure teuthologyprofile
erasure not valid: erasure doesn't represent an int
pool 'base' created
could cause the profile not to be used, leading to an incorrect failure domain and therefore an erasure coded pool without enough OSDs. But it looks like the error is automatically repaired.
Updated by Loïc Dachary about 10 years ago
Reproduced the bug on master in a kvm without SSE instructions :-)
ceph version 0.78-321-gafcf016 (afcf016a8a814167ced178d8a4275fa2a6f94a4d)
 1: (ceph::BackTrace::BackTrace(int)+0x2d) [0x1626461]
 2: ./ceph-osd() [0x173fed6]
 3: (()+0xfbd0) [0x7f57bd7bebd0]
 4: (jerasure_matrix_dotprod()+0x197) [0x7f579efbaef2]
 5: (jerasure_matrix_encode()+0xb3) [0x7f579efb9db6]
 6: (ErasureCodeJerasureReedSolomonVandermonde::jerasure_encode(char**, char**, int)+0x49) [0x7f579f021c17]
 7: (ErasureCodeJerasure::encode(std::set<int, std::less<int>, std::allocator<int> > const&, ceph::buffer::list const&, std::map<int, ceph::buffer::list, std::less<int>, std::allocator<std::pair<int const, ceph::buffer::list> > >*)+0x45f) [0x7f579f020d41]
Updated by Loïc Dachary about 10 years ago
reproduced on https://github.com/dachary/ceph/tree/wip-sse-fix
Updated by Loïc Dachary about 10 years ago
ceph/src/unittest_erasure_code_jerasure fails consistently
#0  jerasure_matrix_dotprod (k=2, w=8, matrix_row=0x1cd4170, src_ids=0x0, dest_id=2, data_ptrs=0x7fffffffdcb0, coding_ptrs=0x7fffffffdcc0, size=160) at erasure-code/jerasure/jerasure/src/jerasure.c:604
#1  0x000000000086de94 in jerasure_matrix_encode (k=2, m=2, w=8, matrix=0x1cd4170, data_ptrs=0x7fffffffdcb0, coding_ptrs=0x7fffffffdcc0, size=160) at erasure-code/jerasure/jerasure/src/jerasure.c:310
#2  0x00000000008d619d in ErasureCodeJerasureReedSolomonVandermonde::jerasure_encode (this=0x7fffffffe200, data=0x7fffffffdcb0, coding=0x7fffffffdcc0, blocksize=160) at erasure-code/jerasure/ErasureCodeJerasure.cc:230
#3  0x00000000008d530f in ErasureCodeJerasure::encode (this=0x7fffffffe200, want_to_encode=..., in=..., encoded=0x7fffffffe140) at erasure-code/jerasure/ErasureCodeJerasure.cc:129
#4  0x000000000086996e in ErasureCodeTest_encode_decode_Test<ErasureCodeJerasureReedSolomonVandermonde>::TestBody (this=0x1cd4180) at test/erasure-code/TestErasureCodeJerasure.cc:68
#5  0x00000000008de490 in testing::Test::Run (this=0x1cd4180) at ./src/gtest.cc:2095
#6  0x00000000008de9fc in testing::internal::TestInfoImpl::Run (this=0x1ce8000) at ./src/gtest.cc:2314
#7  0x00000000008def0f in testing::TestCase::Run (this=0x1cf0000) at ./src/gtest.cc:2420
#8  0x00000000008e34b0 in testing::internal::UnitTestImpl::RunAllTests (this=0x1cec000) at ./src/gtest.cc:4024
#9  0x00000000008e23fc in testing::UnitTest::Run (this=0xd27bc0 <testing::UnitTest::GetInstance()::instance>) at ./src/gtest.cc:3687
#10 0x0000000000849040 in main (argc=1, argv=0x7fffffffe528) at test/erasure-code/TestErasureCodeJerasure.cc:350
Updated by Loïc Dachary about 10 years ago
0x86efa6 <jerasure_matrix_dotprod+375>  jne    0x86efe3 <jerasure_matrix_dotprod+436>
0x86efa8 <jerasure_matrix_dotprod+377>  mov    0x18(%rbp),%eax
0x86efab <jerasure_matrix_dotprod+380>  movslq %eax,%rdx
0x86efae <jerasure_matrix_dotprod+383>  mov    -0x10(%rbp),%rcx
0x86efb2 <jerasure_matrix_dotprod+387>  mov    -0x8(%rbp),%rax
0x86efb6 <jerasure_matrix_dotprod+391>  mov    %rcx,%rsi
0x86efb9 <jerasure_matrix_dotprod+394>  mov    %rax,%rdi
0x86efbc <jerasure_matrix_dotprod+397>  callq  0x8448f0 <memcpy@plt>
0x86efc1 <jerasure_matrix_dotprod+402>  vcvtsi2sdl 0x18(%rbp),%xmm0,%xmm0
0x86efc6 <jerasure_matrix_dotprod+407>  vmovsd 0x4b8962(%rip),%xmm1   # 0xd27930 <jerasure_total_memcpy_bytes>
0x86efce <jerasure_matrix_dotprod+415>  vaddsd %xmm1,%xmm0,%xmm0
0x86efd2 <jerasure_matrix_dotprod+419>  vmovsd %xmm0,0x4b8956(%rip)   # 0xd27930 <jerasure_total_memcpy_bytes>
0x86efda <jerasure_matrix_dotprod+427>  movl   $0x1,-0x18(%rbp)
0x86efe1 <jerasure_matrix_dotprod+434>  jmp    0x86f012 <jerasure_matrix_dotprod+483>
0x86efe3 <jerasure_matrix_dotprod+436>  mov    0x18(%rbp),%edx
0x86efe6 <jerasure_matrix_dotprod+439>  mov    -0x8(%rbp),%rcx
0x86efea <jerasure_matrix_dotprod+443>  mov    -0x10(%rbp),%rax
0x86efee <jerasure_matrix_dotprod+447>  mov    %rcx,%rsi
0x86eff1 <jerasure_matrix_dotprod+450>  mov    %rax,%rdi
0x86eff4 <jerasure_matrix_dotprod+453>  callq  0x86d267 <galois_region_xor>
0x86eff9 <jerasure_matrix_dotprod+458>  *vcvtsi2sdl* 0x18(%rbp),%xmm0,%xmm0
0x86effe <jerasure_matrix_dotprod+463>  vmovsd 0x4b891a(%rip),%xmm1   # 0xd27920 <jerasure_total_xor_bytes>
0x86f006 <jerasure_matrix_dotprod+471>  vaddsd %xmm1,%xmm0,%xmm0
0x86f00a <jerasure_matrix_dotprod+475>  vmovsd %xmm0,0x4b890e(%rip)   # 0xd27920 <jerasure_total_xor_bytes>
0x86f012 <jerasure_matrix_dotprod+483>  addl   $0x1,-0x14(%rbp)
0x86f016 <jerasure_matrix_dotprod+487>  mov    -0x14(%rbp),%eax
Updated by Loïc Dachary about 10 years ago
- Status changed from 12 to In Progress
- % Done changed from 0 to 70
Updated by Loïc Dachary about 10 years ago
- Status changed from In Progress to 7
- % Done changed from 70 to 80
Updated by Loïc Dachary about 10 years ago
For the record, the illegal instruction was an AVX instruction. The AVX flag was set at compile time (at the very end of the .m4 file) but AVX support was not tested at run time, so the instructions were used regardless of the CPU, leading to the problem. AVX detection has been removed.
Updated by Loïc Dachary about 10 years ago
- Status changed from 7 to Fix Under Review
- % Done changed from 80 to 90
Updated by Sage Weil about 10 years ago
- Status changed from Fix Under Review to Resolved