Bug #7914
osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
Status: Closed
Added by Sage Weil about 10 years ago. Updated about 10 years ago.
% Done: 100%
Description
ubuntu@teuthology:/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154379
-1565> 2014-03-30 07:50:43.679454 7f4741ecb700 10 osd.0 pg_epoch: 9 pg[3.fs0( empty local-les=9 n=1 ec=8 les/c 9/9 8/8/8) [0,5,4] r=0 lpr=8 crt=0'0 mlcod 0'0 active+clean] submit_transaction: op Op(1ef9bbdf/benchmark_data_plana15_2897_object3/head//3 v=9'1 tt=0'0 tid=2 reqid=client.4123.0:4 client_op=osd_op(client.4123.0:4 benchmark_data_plana15_2897_object3 [write 0~4194304] 3.1ef9bbdf ack+ondisk+write e8) pending_commit= pending_apply=) starting
-1564> 2014-03-30 07:50:43.679492 7f4741ecb700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1563> 2014-03-30 07:50:43.679492 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1562> 2014-03-30 07:50:43.679530 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1561> 2014-03-30 07:50:43.679538 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1560> 2014-03-30 07:50:43.679544 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1559> 2014-03-30 07:50:43.679550 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1558> 2014-03-30 07:50:43.679556 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1557> 2014-03-30 07:50:43.679561 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1556> 2014-03-30 07:50:43.679566 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1555> 2014-03-30 07:50:43.679572 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1554> 2014-03-30 07:50:43.679577 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1553> 2014-03-30 07:50:43.679582 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1552> 2014-03-30 07:50:43.679588 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1551> 2014-03-30 07:50:43.679593 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
-1550> 2014-03-30 07:50:43.679598 7f47416ca700 10 ErasureCodeJerasure: encode adjusted buffer length from 4096 to 4096
...
-239> 2014-03-30 07:50:43.697849 7f4741ecb700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f4741ecb700
 ceph version 0.78-459-ge672c52 (e672c52b4f8b945a516f2eec006e33665a08f045)
 1: ceph-osd() [0x9819fa]
 2: (()+0xfcb0) [0x7f4756779cb0]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Files
local.txt (1.21 MB) local.txt | teuthology log file (successful) | Loïc Dachary, 04/01/2014 06:26 AM | |
run2.txt (3.32 MB) run2.txt | successful run | Loïc Dachary, 04/01/2014 08:34 AM | |
ceph-osd.0.log.gz (321 KB) ceph-osd.0.log.gz | Illegal Instruction | Loïc Dachary, 04/06/2014 02:39 AM |
Updated by Sage Weil about 10 years ago
ubuntu@teuthology:/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154491
Updated by Loïc Dachary about 10 years ago
- Status changed from New to In Progress
- Assignee set to Loïc Dachary
Updated by Loïc Dachary about 10 years ago
archive_path: /var/lib/teuthworker/archive/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154491
description: rados/thrash/{clusters/fixed-2.yaml fs/ext4.yaml msgr-failures/few.yaml thrashers/pggrow.yaml workloads/ec-radosbench.yaml}
email: null
job_id: '154491'
kernel: &id001
  kdb: true
  sha1: adc8b2d4c270d7d1c04695ae856448dd11362618
last_in_suite: false
machine_type: plana
name: teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana
nuke-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: master
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    fs: ext4
    log-whitelist:
    - slow request
    sha1: e672c52b4f8b945a516f2eec006e33665a08f045
  ceph-deploy:
    branch:
      dev: master
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: e672c52b4f8b945a516f2eec006e33665a08f045
  s3tests:
    branch: master
  workunit:
    sha1: e672c52b4f8b945a516f2eec006e33665a08f045
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
  - client.0
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.1
tasks:
- internal.lock_machines:
  - 2
  - plana
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 2
    chance_pgpnum_fix: 1
    timeout: 1200
- radosbench:
    clients:
    - client.0
    ec_pool: true
    time: 1800
    unique_pool: true
teuthology_branch: master
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.plana.16939
Updated by Loïc Dachary about 10 years ago
The latest jerasure / gf-complete submodule changes include: bestrow = 0 for jerasure, plus two gf-complete changes that are in gf_w128.c, which is not used by Ceph (gf_w32.c is).
Updated by Loïc Dachary about 10 years ago
-msse3 was not used although it should have been; this was fixed March 30th, i.e. after the failed radosbench. SSE3 intrinsics were selected with a compile-time #if while -msse3 was not passed to the compiler.
Updated by Loïc Dachary about 10 years ago
cpuinfo on plana17 (the machine on which the core dump occurred) shows:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt aes lahf_lm ida arat dtherm tpr_shadow vnmi flexpriority ept vpid
Updated by Loïc Dachary about 10 years ago
Trying to reproduce the problem from the source tree on the current master with
rm -fr /tmp/dev /tmp/out ; mkdir -p /tmp/dev
CEPH_DIR=/tmp LC_ALL=C MON=1 OSD=6 bash -x ./vstart.sh -d -n -X -l mon osd
./ceph osd erasure-code-profile set myprofile \
    ruleset-failure-domain=osd || return 1
./ceph osd erasure-code-profile get myprofile
./ceph osd pool create ecpool 12 12 erasure myprofile \
    || return 1
rados --name client.0 -p ecpool bench 360 write
Updated by Loïc Dachary about 10 years ago
the /a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154491 radosbench completed successfully
2014-03-30T09:59:12.768 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: 2014-03-30 09:59:12.767398 min lat: 0.066051 max lat: 24.1214 avg lat: 1.35237
2014-03-30T09:59:12.768 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
2014-03-30T09:59:12.768 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: 1800 16 21280 21264 47.2462 0 - 1.35237
2014-03-30T09:59:13.768 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: 1801 16 21280 21264 47.22 0 - 1.35237
2014-03-30T09:59:14.768 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: 1802 16 21280 21264 47.1938 0 - 1.35237
2014-03-30T09:59:15.769 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: Total time run: 1803.084996
2014-03-30T09:59:15.769 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: Total writes made: 21281
2014-03-30T09:59:15.769 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: Write size: 4194304
2014-03-30T09:59:15.769 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: Bandwidth (MB/sec): 47.210
2014-03-30T09:59:15.769 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]:
2014-03-30T09:59:15.770 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: Stddev Bandwidth: 45.9924
2014-03-30T09:59:15.770 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: Max bandwidth (MB/sec): 148
2014-03-30T09:59:15.770 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: Min bandwidth (MB/sec): 0
2014-03-30T09:59:15.770 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: Average Latency: 1.3556
2014-03-30T09:59:15.771 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: Stddev Latency: 1.82972
2014-03-30T09:59:15.771 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: Max latency: 24.1214
2014-03-30T09:59:15.771 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.131.23]: Min latency: 0.066051
Updated by Loïc Dachary about 10 years ago
Both failures show up with "thrashers/pggrow.yaml" (one with btrfs, the other with ext4).
Updated by Loïc Dachary about 10 years ago
/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154491
2014-03-30T09:29:44.587 INFO:teuthology.task.thrashosds.thrasher:Growing pool unique_pool_0
Updated by Loïc Dachary about 10 years ago
Trying to trigger the problem again with:
machine_type: plana
os_type: ubuntu
overrides:
  admin_socket:
    branch: master
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    fs: ext4
    log-whitelist:
    - slow request
    sha1: e672c52b4f8b945a516f2eec006e33665a08f045
  ceph-deploy:
    branch:
      dev: master
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: e672c52b4f8b945a516f2eec006e33665a08f045
  s3tests:
    branch: master
  workunit:
    sha1: e672c52b4f8b945a516f2eec006e33665a08f045
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
  - client.0
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.1
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 2
    chance_pgpnum_fix: 1
    timeout: 1200
- radosbench:
    clients:
    - client.0
    ec_pool: true
    time: 1800
    unique_pool: true
teuthology_branch: master
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.plana.16939
Updated by Loïc Dachary about 10 years ago
ceph version 0.78-459-ge672c52 (e672c52b4f8b945a516f2eec006e33665a08f045) 1: ceph-osd() [0x9819fa] 2: (()+0xfcb0) [0x7f4756779cb0]
Maybe this is a problem in the static destructors? Since radosbench completed successfully, it may have happened during the teardown phase. On March 27th the ability to load multiple plugins was added, and there are now two erasure code plugins loaded, because the primary plugin selects another at run time and loads it.
Updated by Loïc Dachary about 10 years ago
- File local.txt local.txt added
- Status changed from In Progress to Can't reproduce
- % Done changed from 0 to 10
The above teuthology workload ran on two plana machines and did not create a core dump.
Updated by Loïc Dachary about 10 years ago
I don't see how the lack of locking on the erasure-code plugin registry could be a problem in practice, but it is a problem in theory; better to fix it.
Updated by Loïc Dachary about 10 years ago
Another run with success, complete logs in run2.txt
Updated by Loïc Dachary about 10 years ago
- Status changed from Can't reproduce to In Progress
I will run the workload many times, hoping to reproduce the crash.
Updated by Sage Weil about 10 years ago
ran through 20 iterations of the above yaml and did not reproduce.
Updated by Loïc Dachary about 10 years ago
Contrary to what I thought (I had read the wrong number), the plana machines do have SSE3, and the jerasure plugin with SSE4 optimizations is being run, as shown in the logs:
/var/log/ceph/ceph-osd.3.log:2014-04-01 14:45:27.582145 7fbe5b687700 10 ErasureCodePluginSelectJerasure: SSE4 plugin
Updated by Loïc Dachary about 10 years ago
The only difference is that ext4 is used in teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154491 and btrfs is used in teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154379
Updated by Sage Weil about 10 years ago
also happened on ubuntu@teuthology:/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154379 on commit e672c52b4f8b945a516f2eec006e33665a08f045
no osd crashes from teuthology-2014-03-28_02:30:01-rados-master-testing-basic-plana (which was complete, afaics), commit 7f4be9e9d040af443ca4cbe2f3fd2a44d01f71bf
2 osd crashes on the 27th, commit d9a2dea755a62e4f9fe0795410f37b68a15ae054, but neither are this.
Updated by Loïc Dachary about 10 years ago
/a/teuthology-2014-03-31_02:30:03-rados-master-testing-basic-plana/155875/teuthology.log is the only ec_pool test that failed and has no core
2014-03-31T17:51:23.798 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.133.33]: Total time run: 1803.051784
2014-03-31T17:51:23.798 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.133.33]: Total writes made: 19399
2014-03-31T17:51:23.798 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.133.33]: Write size: 4194304
2014-03-31T17:51:23.799 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.133.33]: Bandwidth (MB/sec): 43.036
2014-03-31T17:51:23.799 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.133.33]:
2014-03-31T17:51:23.799 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.133.33]: Stddev Bandwidth: 31.3604
2014-03-31T17:51:23.799 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.133.33]: Max bandwidth (MB/sec): 156
2014-03-31T17:51:23.799 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.133.33]: Min bandwidth (MB/sec): 0
2014-03-31T17:51:23.799 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.133.33]: Average Latency: 1.48695
2014-03-31T17:51:23.799 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.133.33]: Stddev Latency: 1.78114
2014-03-31T17:51:23.799 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.133.33]: Max latency: 58.8693
2014-03-31T17:51:23.799 INFO:teuthology.task.radosbench.radosbench.0.out:[10.214.133.33]: Min latency: 0.062374
Updated by Loïc Dachary about 10 years ago
Ran the tests for both gf-complete and jerasure and they are valgrind clean (https://bitbucket.org/jimplank/jerasure/pull-request/16/run-tests-with-make-check and https://bitbucket.org/jimplank/gf-complete/pull-request/11/add-make-check-target-and-basic-tests).
Updated by Loïc Dachary about 10 years ago
#0 0x00007f4756779b7b in raise (sig=&lt;optimized out&gt;) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1 0x0000000000981b4e in reraise_fatal (signum=11) at global/signal_handler.cc:59
#2 handle_fatal_signal (signum=11) at global/signal_handler.cc:105
#3 &lt;signal handler called&gt;
#4 0x0000000000000000 in ?? ()
#5 0x00007f47385ae6b1 in jerasure_matrix_dotprod (k=2, w=8, matrix_row=0x31513a8, src_ids=0x0, dest_id=&lt;optimized out&gt;, data_ptrs=0x7f4741ec7a00, coding_ptrs=0x7f4741ec7a10, size=2048) at erasure-code/jerasure/jerasure/src/jerasure.c:607
#6 0x00007f47385ae7d6 in jerasure_matrix_encode (k=2, m=1, w=8, matrix=&lt;optimized out&gt;, data_ptrs=0x7f4741ec7a00, coding_ptrs=0x7f4741ec7a10, size=2048) at erasure-code/jerasure/jerasure/src/jerasure.c:310
#7 0x00007f47385ce884 in ErasureCodeJerasureReedSolomonVandermonde::jerasure_encode (this=&lt;optimized out&gt;, data=&lt;optimized out&gt;, coding=&lt;optimized out&gt;, blocksize=&lt;optimized out&gt;) at erasure-code/jerasure/ErasureCodeJerasure.cc:230
#8 0x00007f47385d3806 in ErasureCodeJerasure::encode (this=0x34f4140, want_to_encode=..., in=..., encoded=0x7f4741ec7e50) at erasure-code/jerasure/ErasureCodeJerasure.cc:129
#9 0x000000000093d0f3 in ECUtil::encode (sinfo=..., ec_impl=..., in=..., want=..., out=0x7f4741ec80f0) at osd/ECUtil.cc:120
#10 0x000000000093b0c1 in TransGenerator::operator() (this=0x7f4741ec81c0, op=...) at osd/ECTransaction.cc:138
#11 0x0000000000939f6f in internal_visit&lt;ECTransaction::AppendOp const&gt; (operand=..., this=&lt;synthetic pointer&gt;) at /usr/include/boost/variant/variant.hpp:832
#12 visitation_impl_invoke_impl&lt;boost::detail::variant::invoke_visitor&lt;TransGenerator&gt;, void const*, ECTransaction::AppendOp&gt; (storage=&lt;optimized out&gt;, visitor=&lt;synthetic pointer&gt;) at /usr/include/boost/variant/detail/visitation_impl.hpp:130
#13 visitation_impl_invoke&lt;boost::detail::variant::invoke_visitor&lt;TransGenerator&gt;, void const*, ECTransaction::AppendOp, boost::variant&lt;ECTransaction::AppendOp, ECTransaction::CloneOp, ECTransaction::RenameOp, ECTransaction::StashOp, ECTransaction::TouchOp, ECTransaction::RemoveOp, ECTransaction::SetAttrsOp, ECTransaction::RmAttrOp, ECTransaction::AllocHintOp, ECTransaction::NoOp&gt;::has_fallback_type_&gt; (storage=&lt;optimized out&gt;, visitor=&lt;synthetic pointer&gt;, internal_which=&lt;optimized out&gt;, t=&lt;optimized out&gt;) at /usr/include/boost/variant/detail/visitation_impl.hpp:173
#14 visitation_impl&lt;mpl_::int_&lt;0&gt;, boost::detail::variant::visitation_impl_step&lt;boost::mpl::l_iter&lt;boost::mpl::l_item&lt;mpl_::long_&lt;10l&gt;, ECTransaction::AppendOp, boost::mpl::l_item&lt;mpl_::long_&lt;9l&gt;, ECTransaction::CloneOp, boost::mpl::l_item&lt;mpl_::long_&lt;8l&gt;, ECTransaction::RenameOp, boost::mpl::l_item&lt;mpl_::long_&lt;7l&gt;, ECTransaction::StashOp, boost::mpl::l_item&lt;mpl_::long_&lt;6l&gt;, ECTransaction::TouchOp, boost::mpl::l_item&lt;mpl_::long_&lt;5l&gt;, ECTransaction::RemoveOp, boost::mpl::l_item&lt;mpl_::long_&lt;4l&gt;, ECTransaction::SetAttrsOp, boost::mpl::l_item&lt;mpl_::long_&lt;3l&gt;, ECTransaction::RmAttrOp, boost::mpl::l_item&lt;mpl_::long_&lt;2l&gt;, ECTransaction::AllocHintOp, boost::mpl::l_item&lt;mpl_::long_&lt;1l&gt;, ECTransaction::NoOp, boost::mpl::l_end&gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt; &gt;, boost::mpl::l_iter&lt;boost::mpl::l_end&gt; &gt;, boost::detail::variant::invoke_visitor&lt;TransGenerator&gt;, void const*, boost::variant&lt;ECTransaction::AppendOp, ECTransaction::CloneOp, ECTransaction::RenameOp, ECTransaction::StashOp, ECTransaction::TouchOp, ECTransaction::RemoveOp, ECTransaction::SetAttrsOp, ECTransaction::RmAttrOp, ECTransaction::AllocHintOp, ECTransaction::NoOp&gt;::has_fallback_type_&gt; (no_backup_flag=..., storage=&lt;optimized out&gt;, visitor=&lt;synthetic pointer&gt;, logical_which=&lt;optimized out&gt;, internal_which=&lt;optimized out&gt;) at /usr/include/boost/variant/detail/visitation_impl.hpp:260
#15 internal_apply_visitor_impl&lt;boost::detail::variant::invoke_visitor&lt;TransGenerator&gt;, void const*&gt; (storage=&lt;optimized out&gt;, logical_which=&lt;optimized out&gt;, visitor=&lt;synthetic pointer&gt;, internal_which=&lt;optimized out&gt;) at /usr/include/boost/variant/variant.hpp:1776
#16 internal_apply_visitor&lt;boost::detail::variant::invoke_visitor&lt;TransGenerator&gt; &gt; (visitor=&lt;synthetic pointer&gt;, this=0x30cc830) at /usr/include/boost/variant/variant.hpp:1798
#17 apply_visitor&lt;TransGenerator&gt; (this=0x30cc830, visitor=...) at /usr/include/boost/variant/variant.hpp:1820
#18 apply_visitor&lt;TransGenerator, boost::variant&lt;ECTransaction::AppendOp, ECTransaction::CloneOp, ECTransaction::RenameOp, ECTransaction::StashOp, ECTransaction::TouchOp, ECTransaction::RemoveOp, ECTransaction::SetAttrsOp, ECTransaction::RmAttrOp, ECTransaction::AllocHintOp, ECTransaction::NoOp&gt; const&gt; (visitable=..., visitor=...) at /usr/include/boost/variant/detail/apply_visitor_unary.hpp:60
#19 visit&lt;TransGenerator&gt; (vis=..., this=0x34198c0) at osd/ECTransaction.h:183
#20 ECTransaction::generate_transactions (this=0x34198c0, hash_infos=..., ecimpl=..., pgid=..., sinfo=..., transactions=0x7f4741ec8c10, temp_added=0x30c62e8, temp_removed=0x30c6318, out=0x0) at osd/ECTransaction.cc:282
#21 0x000000000091588f in ECBackend::start_write (this=0x3422b40, op=0x30c6228) at osd/ECBackend.cc:1494
#22 0x00000000009238cb in ECBackend::submit_transaction (this=0x3422b40, hoid=..., at_version=..., _t=&lt;optimized out&gt;, trim_to=..., log_entries=..., on_local_applied_sync=0x33b4300, on_all_applied=0x33cb860, on_all_commit=0x3132ee0, tid=2, reqid=..., client_op=...) at osd/ECBackend.cc:1260
#23 0x00000000007d48e7 in ReplicatedPG::issue_repop (this=0x35e0000, repop=0x3480300, now=...) at osd/ReplicatedPG.cc:6458
#24 0x000000000080f903 in ReplicatedPG::execute_ctx (this=0x35e0000, ctx=0x32bd000) at osd/ReplicatedPG.cc:1807
#25 0x000000000081adb2 in ReplicatedPG::do_op (this=0x35e0000, op=...) at osd/ReplicatedPG.cc:1464
#26 0x00000000007bbe22 in ReplicatedPG::do_request (this=0x35e0000, op=..., handle=...) at osd/ReplicatedPG.cc:1097
#27 0x0000000000619dfa in OSD::dequeue_op (this=0x3051000, pg=..., op=..., handle=...) at osd/OSD.cc:7633
#28 0x0000000000634e58 in OSD::OpWQ::_process (this=0x3051e28, pg=..., handle=...) at osd/OSD.cc:7603
#29 0x000000000067a48c in ThreadPool::WorkQueueVal&lt;std::pair&lt;boost::intrusive_ptr&lt;PG&gt;, std::tr1::shared_ptr&lt;OpRequest&gt; &gt;, boost::intrusive_ptr&lt;PG&gt; &gt;::_void_process (this=0x3051e28, handle=...) at ./common/WorkQueue.h:190
#30 0x0000000000a53476 in ThreadPool::worker (this=0x3051470, wt=0x305bac0) at common/WorkQueue.cc:125
#31 0x0000000000a55280 in ThreadPool::WorkThread::entry (this=&lt;optimized out&gt;) at common/WorkQueue.h:317
#32 0x00007f4756771e9a in start_thread (arg=0x7f4741ecb700) at pthread_create.c:308
#33 0x00007f4754d323fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#34 0x0000000000000000 in ?? ()
Updated by Sage Weil about 10 years ago
mira087 has the first 2 failures, mira089 has the earlier one. They all seem to crash in the same place:
#4 0x0000000000000000 in ?? ()
#5 0x00007f47385ae6b1 in jerasure_matrix_dotprod (k=2, w=8, matrix_row=0x31513a8, src_ids=0x0, dest_id=&lt;optimized out&gt;, data_ptrs=0x7f4741ec7a00, coding_ptrs=0x7f4741ec7a10, size=2048) at erasure-code/jerasure/jerasure/src/jerasure.c:607
#6 0x00007f47385ae7d6 in jerasure_matrix_encode (k=2, m=1, w=8, matrix=&lt;optimized out&gt;, data_ptrs=0x7f4741ec7a00, coding_ptrs=0x7f4741ec7a10, size=2048) at erasure-code/jerasure/jerasure/src/jerasure.c:310
Updated by Loïc Dachary about 10 years ago
(gdb) frame 5
(gdb) p *(gf_internal_t*)gfp_array[32].scratch
$2 = {mult_type = 0x0, region_type = 0x0, divide_type = 0x0, w = 0x20, prim_poly = 0x400007, free_me = 0x1, arg1 = 0x0, arg2 = 0x0, base_gf = 0x0, private = 0x33b92b8}
Updated by Loïc Dachary about 10 years ago
Compiled TestErasureCodeJerasure.cc with m=1 (https://github.com/ceph/ceph/blob/master/src/test/erasure-code/TestErasureCodeJerasure.cc#L48) and all optimizations, then set a breakpoint where the code path is supposed to lead, to verify it actually does. This is the same configuration as in the core. The ${INTEL_FLAGS} have to be added to https://github.com/ceph/ceph/blob/master/src/test/erasure-code/Makefile.am#L89 and https://github.com/ceph/ceph/blob/master/src/test/erasure-code/Makefile.am#L92 to activate the optimizations.
(gdb) bt
#0 gf_multby_one (src=0x1ce50a0, dest=0x1ce5140, bytes=160, xor=1) at erasure-code/jerasure/gf-complete/src/gf.c:921
#1 0x0000000000877c12 in gf_w32_split_4_32_lazy_sse_multiply_region (gf=0x1cbc370, src=0x1ce50a0, dest=0x1ce5140, val=1, bytes=160, xor=1) at erasure-code/jerasure/gf-complete/src/gf_w32.c:1895
#2 0x000000000084fb38 in galois_w32_region_xor (src=0x1ce50a0, dest=0x1ce5140, nbytes=160) at erasure-code/jerasure/jerasure/src/galois.c:332
#3 0x000000000084fb69 in galois_region_xor (src=0x1ce50a0 "klmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789", dest=0x1ce5140 "*..**66::&gt;&gt;::66*acagacaoac", ' ' &lt;repeats 26 times&gt;, "QSQWQSQ_QS*.CDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghij", nbytes=160) at erasure-code/jerasure/jerasure/src/galois.c:338
#4 0x00000000008518cc in jerasure_matrix_dotprod (k=2, w=8, matrix_row=0x1caa148, src_ids=0x0, dest_id=2, data_ptrs=0x7fffffffdb40, coding_ptrs=0x7fffffffdb50, size=160) at erasure-code/jerasure/jerasure/src/jerasure.c:607
#5 0x0000000000850767 in jerasure_matrix_encode (k=2, m=1, w=8, matrix=0x1caa148, data_ptrs=0x7fffffffdb40, coding_ptrs=0x7fffffffdb50, size=160) at erasure-code/jerasure/jerasure/src/jerasure.c:310
#6 0x00000000008b793b in ErasureCodeJerasureReedSolomonVandermonde::jerasure_encode (this=0x7fffffffe080, data=0x7fffffffdb40, coding=0x7fffffffdb50, blocksize=160) at erasure-code/jerasure/ErasureCodeJerasure.cc:230
#7 0x00000000008b6aad in ErasureCodeJerasure::encode (this=0x7fffffffe080, want_to_encode=..., in=..., encoded=0x7fffffffdfc0) at erasure-code/jerasure/ErasureCodeJerasure.cc:129
#8 0x000000000084bf1a in ErasureCodeTest_encode_decode_Test&lt;ErasureCodeJerasureReedSolomonVandermonde&gt;::TestBody (this=0x1cae0d0) at test/erasure-code/TestErasureCodeJerasure.cc:62
#9 0x00000000008bfec6 in testing::Test::Run (this=0x1cae0d0) at ./src/gtest.cc:2095
#10 0x00000000008c0432 in testing::internal::TestInfoImpl::Run (this=0x1cc6000) at ./src/gtest.cc:2314
#11 0x00000000008c0945 in testing::TestCase::Run (this=0x1cce000) at ./src/gtest.cc:2420
#12 0x00000000008c4ee6 in testing::internal::UnitTestImpl::RunAllTests (this=0x1cca000) at ./src/gtest.cc:4024
#13 0x00000000008c3e32 in testing::UnitTest::Run (this=0xd02c20 &lt;testing::UnitTest::GetInstance()::instance&gt;) at ./src/gtest.cc:3687
#14 0x000000000083fd90 in main (argc=1, argv=0x7fffffffe3a8) at test/erasure-code/TestErasureCodeJerasure.cc:344
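The Makefile.am change mentioned above amounts to appending ${INTEL_FLAGS} to the per-target CXXFLAGS of the affected tests, roughly along these lines (the target variable name below is illustrative; the exact names are those at the linked Makefile.am lines of that period):

```make
# hypothetical fragment: append the SIMD flags so the intrinsics-guarded
# code paths get compiled into the test binary
unittest_erasure_code_jerasure_CXXFLAGS = $(AM_CXXFLAGS) $(INTEL_FLAGS)
```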
Updated by Loïc Dachary about 10 years ago
To find the failed ones:
grep -l ec_pool /a/teuthology-2014-04-02_*rados-master*/*/config.yaml | \
    xargs grep -l radosbench | \
    sed -e 's/config.yaml/teuthology.log/' > /tmp/files
tail -n1 $(cat /tmp/files)
/a/teuthology-2014-04-02_02:30:02-rados-master-testing-basic-plana has no core
Updated by Sage Weil about 10 years ago
This didn't trigger at all in yesterday's run. :/
Updated by Sage Weil about 10 years ago
ubuntu@teuthology:/var/lib/teuthworker/archive/gregf-2014-04-04_22:05:49-rados-wip-7994-testing-basic-plana/170763
Updated by Loïc Dachary about 10 years ago
/var/lib/teuthworker/archive/gregf-2014-04-04_22:05:49-rados-wip-7994-testing-basic-plana/170763/remote/ubuntu@plana08.front.sepia.ceph.com/log/ceph-osd.0.log.gz shows:
*** Caught signal (Illegal instruction) **
 in thread 7fc8de7a1700
 ceph version 0.78-568-g232ac1a (232ac1a52a322d163d8d8dbc4a7da4b6a9acb709)
 1: ceph-osd() [0x982c2a]
 2: (()+0x364a0) [0x7fc8f1d4b4a0]
 3: [0x1901a20]
Updated by Loïc Dachary about 10 years ago
SSE4 plugin selected and ECX = 029ee3ff (this is consistent)
ubuntu@plana08:~$ cpuid
 eax in    eax      ebx      ecx      edx
00000000 0000000b 756e6547 6c65746e 49656e69
00000001 000206c2 21200800 029ee3ff bfebfbff
00000002 55035a01 00f0b2ff 00000000 00ca0000
00000003 00000000 00000000 00000000 00000000
00000004 3c004122 00c0003f 0000007f 00000000
00000005 00000040 00000040 00000003 00001120
00000006 00000007 00000002 00000001 00000000
00000007 00000000 00000000 00000000 00000000
00000008 00000000 00000000 00000000 00000000
00000009 00000000 00000000 00000000 00000000
0000000a 07300403 00000004 00000000 00000603
0000000b 00000005 00000008 00000201 00000021
80000000 80000008 00000000 00000000 00000000
80000001 00000000 00000000 00000001 2c100800
80000002 65746e49 2952286c 6f655820 2952286e
80000003 55504320 20202020 20202020 45202020
80000004 30323635 20402020 30342e32 007a4847
80000005 00000000 00000000 00000000 00000000
80000006 00000000 00000000 01006040 00000000
80000007 00000000 00000000 00000000 00000100
80000008 00003028 00000000 00000000 00000000
Vendor ID: "GenuineIntel"; CPUID level 11

Intel-specific functions:
Version 000206c2:
Type 0 - Original OEM
Family 6 - Pentium Pro
Model 12 - Stepping 2
Reserved 8
Extended brand string: "Intel(R) Xeon(R) CPU E5620 @ 2.40GHz"
CLFLUSH instruction cache line size: 8
Initial APIC ID: 33
Hyper threading siblings: 32

Feature flags bfebfbff:
FPU    Floating Point Unit
VME    Virtual 8086 Mode Enhancements
DE     Debugging Extensions
PSE    Page Size Extensions
TSC    Time Stamp Counter
MSR    Model Specific Registers
PAE    Physical Address Extension
MCE    Machine Check Exception
CX8    COMPXCHG8B Instruction
APIC   On-chip Advanced Programmable Interrupt Controller present and enabled
SEP    Fast System Call
MTRR   Memory Type Range Registers
PGE    PTE Global Flag
MCA    Machine Check Architecture
CMOV   Conditional Move and Compare Instructions
FGPAT  Page Attribute Table
PSE-36 36-bit Page Size Extension
CLFSH  CFLUSH instruction
DS     Debug store
ACPI   Thermal Monitor and Clock Ctrl
MMX    MMX instruction set
FXSR   Fast FP/MMX Streaming SIMD Extensions save/restore
SSE    Streaming SIMD Extensions instruction set
SSE2   SSE2 extensions
SS     Self Snoop
HT     Hyper Threading
TM     Thermal monitor
31     reserved

TLB and cache info:
5a: unknown TLB/cache descriptor
03: Data TLB: 4KB pages, 4-way set assoc, 64 entries
55: unknown TLB/cache descriptor
ff: unknown TLB/cache descriptor
b2: unknown TLB/cache descriptor
f0: unknown TLB/cache descriptor
ca: unknown TLB/cache descriptor
Processor serial: 0002-06C2-0000-0000-0000-0000
Updated by Loïc Dachary about 10 years ago
- File ceph-osd.0.log.gz ceph-osd.0.log.gz added
The plana08 machine has been recycled and there is no core archived. See the ceph-osd.0.log.gz for the full logs. The commit (232ac1a52a322d163d8d8dbc4a7da4b6a9acb709) on which it ran is roughly up to date with firefly (modulo 24h).
Updated by Loïc Dachary about 10 years ago
The teuthology config.yaml
description: rados/thrash/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml thrashers/default.yaml workloads/ec-radosbench.yaml}
last_in_suite: false
nuke-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: wip-7994
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
        osd op thread timeout: 60
        osd sloppy crc: true
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 232ac1a52a322d163d8d8dbc4a7da4b6a9acb709
  ceph-deploy:
    branch:
      dev: wip-7994
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 232ac1a52a322d163d8d8dbc4a7da4b6a9acb709
  s3tests:
    branch: master
  workunit:
    sha1: 232ac1a52a322d163d8d8dbc4a7da4b6a9acb709
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
  - client.0
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.1
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    conf:
      osd:
        osd debug reject backfill probability: 0.3
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- radosbench:
    clients:
    - client.0
    ec_pool: true
    time: 1800
    unique_pool: true
teuthology_branch: master
verbose: true
Updated by Loïc Dachary about 10 years ago
The workload is slightly different from the previous one:
--- c.yaml	2014-04-06 12:02:28.606673232 +0200
+++ config.yaml	2014-04-06 12:02:29.082677334 +0200
@@ -17,13 +17,15 @@
         debug journal: 20
         debug ms: 1
         debug osd: 20
-    fs: ext4
+    osd op thread timeout: 60
+    osd sloppy crc: true
+    fs: btrfs
     log-whitelist:
     - slow request
@@ -57,11 +59,14 @@
 - clock.check: null
 - install: null
 - ceph:
+    conf:
+      osd:
+        osd debug reject backfill probability: 0.3
     log-whitelist:
     - wrongly marked me down
     - objects unfound and apparently lost
 - thrashosds:
-    chance_pgnum_grow: 2
+    chance_pgnum_grow: 1
     chance_pgpnum_fix: 1
     timeout: 1200
 - radosbench:
Running 30 of them, hoping for a higher probability of triggering the problem.
Updated by Loïc Dachary about 10 years ago
Ran 27 times without triggering the problem. Will keep running it 100 times more.
Updated by Sage Weil about 10 years ago
0> 2014-04-07 01:31:19.678469 7ff4d4cc5700 -1 *** Caught signal (Segmentation fault) **
 in thread 7ff4d4cc5700
 ceph version 0.78-600-g19f50b9 (19f50b9d7bbbb2cce3b599f3ed8a9fa32c3d4e53)
 1: ceph-osd() [0x984aea]
 2: (()+0xfcb0) [0x7ff4e9a7bcb0]
 3: (galois_w32_region_xor()+0x4a) [0x7ff4db4fc2ea]
 4: (jerasure_matrix_dotprod()+0x1d1) [0x7ff4db4fd6b1]
 5: (jerasure_matrix_decode()+0x237) [0x7ff4db4fda57]
 6: (ErasureCodeJerasureReedSolomonVandermonde::jerasure_decode(int*, char**, char**, int)+0x36) [0x7ff4db51d826]
 7: (ErasureCodeJerasure::decode(std::set&lt;int, std::less&lt;int&gt;, std::allocator&lt;int&gt; &gt; const&amp;, std::map&lt;int, ceph::buffer::list, std::less&lt;int&gt;, std::allocator&lt;std::pair&lt;int const, ceph::buffer::list&gt; &gt; &gt; const&amp;, std::map&lt;int, ceph::buffer::list, std::less&lt;int&gt;, std::allocator&lt;std::pair&lt;int const, ceph::buffer::list&gt; &gt; &gt;*)+0x1369) [0x7ff4db521d69]
 8: (ECUtil::decode(ECUtil::stripe_info_t const&amp;, std::tr1::shared_ptr&lt;ceph::ErasureCodeInterface&gt;&amp;, std::map&lt;int, ceph::buffer::list, std::less&lt;int&gt;, std::allocator&lt;std::pair&lt;int const, ceph::buffer::list&gt; &gt; &gt;&amp;, std::map&lt;int, ceph::buffer::list*, std::less&lt;int&gt;, std::allocator&lt;std::pair&lt;int const, ceph::buffer::list*&gt; &gt; &gt;&amp;)+0x534) [0x940de4]
 9: (ECBackend::handle_recovery_read_complete(hobject_t const&amp;, boost::tuples::tuple&lt;unsigned long, unsigned long, std::map&lt;pg_shard_t, ceph::buffer::list, std::less&lt;pg_shard_t&gt;, std::allocator&lt;std::pair&lt;pg_shard_t const, ceph::buffer::list&gt; &gt; &gt;, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type&gt;&amp;, boost::optional&lt;std::map&lt;std::string, ceph::buffer::list, std::less&lt;std::string&gt;, std::allocator&lt;std::pair&lt;std::string const, ceph::buffer::list&gt; &gt; &gt; &gt;, RecoveryMessages*)+0x78e) [0x91fb1e]
 10: (OnRecoveryReadComplete::finish(std::pair&lt;RecoveryMessages*, ECBackend::read_result_t&amp;&gt;&amp;)+0x86) [0x933e86]
 11: (GenContext&lt;std::pair&lt;RecoveryMessages*, ECBackend::read_result_t&amp;&gt;&amp;&gt;::complete(std::pair&lt;RecoveryMessages*, ECBackend::read_result_t&amp;&gt;&amp;)+0x9) [0x926ff9]
 12: (ECBackend::complete_read_op(ECBackend::ReadOp&amp;, RecoveryMessages*)+0x5b) [0x916a2b]
 13: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&amp;, RecoveryMessages*)+0xcd5) [0x920f15]
 14: (ECBackend::handle_message(std::tr1::shared_ptr&lt;OpRequest&gt;)+0x506) [0x924c16]
 15: (ReplicatedPG::do_request(std::tr1::shared_ptr&lt;OpRequest&gt;, ThreadPool::TPHandle&amp;)+0x1ee) [0x7bc5ae]
 16: (OSD::dequeue_op(boost::intrusive_ptr&lt;PG&gt;, std::tr1::shared_ptr&lt;OpRequest&gt;, ThreadPool::TPHandle&amp;)+0x34a) [0x61a1aa]
 17: (OSD::OpWQ::_process(boost::intrusive_ptr&lt;PG&gt;, ThreadPool::TPHandle&amp;)+0x628) [0x635288]
 18: (ThreadPool::WorkQueueVal&lt;std::pair&lt;boost::intrusive_ptr&lt;PG&gt;, std::tr1::shared_ptr&lt;OpRequest&gt; &gt;, boost::intrusive_ptr&lt;PG&gt; &gt;::_void_process(void*, ThreadPool::TPHandle&amp;)+0x9c) [0x67a88c]
 19: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0xa56db6]
 20: (ThreadPool::WorkThread::entry()+0x10) [0xa58bc0]
 21: (()+0x7e9a) [0x7ff4e9a73e9a]
 22: (clone()+0x6d) [0x7ff4e80343fd]

ubuntu@teuthology:/var/lib/teuthworker/archive/samuelj-2014-04-06_19:34:24-rados:thrash-wip-sam-testing-testing-basic-plana/175182
I didn't see the 'encode adjusted buffer length' this time, btw.
Updated by Loïc Dachary about 10 years ago
Interesting: the crash is in decode this time, not encode.
Updated by Loïc Dachary about 10 years ago
gfp_array is a global variable. If galois_w16_region_xor is called from two different threads, there is a race condition.
Updated by Loïc Dachary about 10 years ago
Kevin Greenan writes:
Using the doc and a few special functions in jerasure, you can ensure that the underlying GF implementations are thread safe. This can be done in jerasure using galois_init_field() and galois_change_technique(). If these are explicitly used before starting any encoding/decoding (i.e. during init), there should never be a race condition.
http://web.eecs.utk.edu/~plank/plank/papers/GF-Complete-Manual-1.02.pdf
Updated by Loïc Dachary about 10 years ago
- % Done changed from 10 to 50
Updated by Loïc Dachary about 10 years ago
- Status changed from In Progress to Fix Under Review
Updated by Loïc Dachary about 10 years ago
Running tests against the proposed change, using the previous config.yaml. Not really hoping that it will fail. Just hoping that it won't fail ;-)
description: rados/thrash/{clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml
  thrashers/default.yaml workloads/ec-radosbench.yaml}
last_in_suite: false
nuke-on-error: true
os_type: ubuntu
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
        osd op thread timeout: 60
        osd sloppy crc: true
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 2ba00bd2e8d9919cefb069cfd9dd7bdd3117e8b1
  ceph-deploy:
    branch:
      dev: wip-7914
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 2ba00bd2e8d9919cefb069cfd9dd7bdd3117e8b1
  s3tests:
    branch: master
  workunit:
    sha1: 2ba00bd2e8d9919cefb069cfd9dd7bdd3117e8b1
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
  - client.0
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.1
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    conf:
      osd:
        osd debug reject backfill probability: 0.3
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- radosbench:
    clients:
    - client.0
    ec_pool: true
    time: 1800
    unique_pool: true
teuthology_branch: master
verbose: true
Updated by Loïc Dachary about 10 years ago
the above config.yaml ran 6 times without failing
Updated by Sage Weil about 10 years ago
- Status changed from Fix Under Review to 7
Updated by Loïc Dachary about 10 years ago
Kevin Greenan proposes to expose a static function to reduce the amount of code required from the plugin: https://bitbucket.org/jimplank/jerasure/pull-request/17/make-galois_init_default_field-int-w
Updated by Loïc Dachary about 10 years ago
- % Done changed from 50 to 90
The above workload ran 26 times without crashing. rados suite did not run yet on the latest master.
Updated by Loïc Dachary about 10 years ago
The above workload ran 51 times without crashing.
Updated by Loïc Dachary about 10 years ago
The above workload ran 82 times without crashing. The loop stopped because a DNS resolution failure made it impossible to retrieve the Ceph packages. The loop was restarted.
Updated by Loïc Dachary about 10 years ago
All changes have been accepted upstream (jerasure / gf-complete), backported, and copied into http://github.com/ceph/{jerasure,gf-complete}
- jerasure v2-ceph branch is now exactly the same as the stable upstream
- gf-complete v1-ceph branch is now exactly the same as the stable upstream
Updated by Sage Weil about 10 years ago
We haven't seen this in several days; I'm ready to call it resolved!
Updated by Loïc Dachary about 10 years ago
- Status changed from 7 to Resolved
- % Done changed from 90 to 100
The workload ran 30 more times without errors. The loop was interrupted and the plana machines released.