Project

General

Profile

Bug #11914

"Caught signal" in teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi run

Added by Yuri Weinstein almost 9 years ago. Updated over 8 years ago.

Status:
Can't reproduce
Priority:
High
Assignee:
David Zafman
Category:
-
Target version:
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rados
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is part of firefly v0.80.10 validation.

NOTE: this test passed on rerun (http://pulpito.ceph.com/teuthology-2015-06-06_08:25:53-rados-firefly---basic-multi/)

Run: http://pulpito.ceph.com/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/
Job: 919773
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/

/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920325613-2015-06-04 18:52:51.123994 7f7f51ed1700 10 journal room 93831167 max_size 104857600 pos 57552896 header.start 46530560 top 4096
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920325741-2015-06-04 18:52:51.123998 7f7f51ed1700 10 journal check_for_full at 57552896 : 4096 < 93831167
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920325837-2015-06-04 18:52:51.124000 7f7f51ed1700 15 journal prepare_single_write 1 will write 57552896 : seq 53747 len 3508 -> 4096 (head 40 pre_pad 0 ebl 3508 post_pad 508 tail 40) (ebl alignment -1)
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326029-2015-06-04 18:52:51.124013 7f7f51ed1700 20 journal prepare_multi_write queue_pos now 57556992
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326123-2015-06-04 18:52:51.124015 7f7f51ed1700 15 journal do_write writing 57552896~4096
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326205-2015-06-04 18:52:51.132097 7f7f46784700 -1 *** Caught signal (Aborted) **
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326279- in thread 7f7f46784700
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326303-
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326304: ceph version 0.80.9-255-gd0f9c5f (d0f9c5f47024f53b4eccea2e0fde9b7844746362)
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326381- 1: ceph-osd() [0x98376f]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326407- 2: (()+0x10340) [0x7f7f5c06a340]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326441- 3: (gsignal()+0x39) [0x7f7f5a711bb9]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326479- 4: (abort()+0x148) [0x7f7f5a714fc8]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326516- 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f7f5b01d6b5]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326586- 6: (()+0x5e836) [0x7f7f5b01b836]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326620- 7: (()+0x5e863) [0x7f7f5b01b863]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326654- 8: (()+0x5eaa2) [0x7f7f5b01baa2]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326688- 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1f2) [0xa67222]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326780- 10: (PG::update_snap_map(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&, ObjectStore::Transaction&)+0x434) [0x738cd4]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920326914- 11: (PG::append_log(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&, eversion_t, eversion_t, ObjectStore::Transaction&, bool)+0x55b) [0x73f3fb]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920327073- 12: (ReplicatedPG::log_operation(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&, boost::optional<pg_hit_set_history_t>&, eversion_t const&, eversion_t const&, bool, ObjectStore::Transaction*)+0xba) [0x81e91a]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920327298- 13: (ECBackend::handle_sub_write(pg_shard_t, std::tr1::shared_ptr<OpRequest>, ECSubWrite&, Context*)+0x407) [0x9149f7]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920327418- 14: (ECBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x2ca) [0x920a4a]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920327501- 15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x2db) [0x7a2bcb]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920327606- 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x371) [0x5ffcb1]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920327728- 17: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x1f4) [0x61b0b4]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920327821- 18: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x65dd0c]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920328012- 19: (ThreadPool::worker(ThreadPool::WorkThread*)+0xb01) [0xa580d1]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920328080- 20: (ThreadPool::WorkThread::entry()+0x10) [0xa58fc0]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920328135- 21: (()+0x8182) [0x7f7f5c062182]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920328169- 22: (clone()+0x6d) [0x7f7f5a7d5fbd]
/a/teuthology-2015-06-04_11:23:35-rados-firefly---basic-multi/919773/remote/plana29/log/ceph-osd.4.log.gz:920328206- NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

History

#1 Updated by Samuel Just almost 9 years ago

  • Assignee set to David Zafman
  • Priority changed from Normal to Urgent

A bit before the crash, ceph_objectstore_tool import 3.31s2 into osd 4.

2015-06-04T18:51:49.005 INFO:teuthology.orchestra.run.plana29.stdout:Remove successful
2015-06-04T18:51:49.083 INFO:teuthology.orchestra.run.plana29:Running: 'sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --log-file=/var/log/ceph/objectstore_tool.\\$pid.log --op list-pgs'
2015-06-04T18:51:50.009 INFO:tasks.rados.rados.0.burnupi52.stdout:2219: done (8 left)
2015-06-04T18:51:50.009 INFO:tasks.rados.rados.0.burnupi52.stdout:2220: copy_from oid 38 from oid 49 current snap is 258
2015-06-04T18:51:50.009 INFO:tasks.rados.rados.0.burnupi52.stdout:append oid 21 current snap is 258
2015-06-04T18:51:50.009 INFO:tasks.rados.rados.0.burnupi52.stdout:2221: seq_num 732 ranges {0=655360,655360=679936,1335296=712704,2048000=212992}
2015-06-04T18:51:50.030 INFO:tasks.rados.rados.0.burnupi52.stdout:2221: writing burnupi5261987-21 from 0 to 655360 tid 1
2015-06-04T18:51:50.052 INFO:tasks.rados.rados.0.burnupi52.stdout:2221: writing burnupi5261987-21 from 655360 to 1335296 tid 2
2015-06-04T18:51:50.077 INFO:tasks.rados.rados.0.burnupi52.stdout:2221: writing burnupi5261987-21 from 1335296 to 2048000 tid 3
2015-06-04T18:51:50.084 INFO:tasks.rados.rados.0.burnupi52.stdout:2221: writing burnupi5261987-21 from 2048000 to 2260992 tid 4
2015-06-04T18:51:50.085 INFO:tasks.rados.rados.0.burnupi52.stdout:append oid 37 current snap is 258
2015-06-04T18:51:50.085 INFO:tasks.rados.rados.0.burnupi52.stdout:2222: seq_num 733 ranges {3133440=528384,3661824=614400,4276224=786432,5062656=204800}
2015-06-04T18:51:50.126 INFO:tasks.rados.rados.0.burnupi52.stdout:2222: writing burnupi5261987-37 from 3133440 to 3661824 tid 1
2015-06-04T18:51:50.147 INFO:tasks.rados.rados.0.burnupi52.stdout:2222: writing burnupi5261987-37 from 3661824 to 4276224 tid 2
2015-06-04T18:51:50.177 INFO:tasks.rados.rados.0.burnupi52.stdout:2222: writing burnupi5261987-37 from 4276224 to 5062656 tid 3
2015-06-04T18:51:50.184 INFO:tasks.rados.rados.0.burnupi52.stdout:2222: writing burnupi5261987-37 from 5062656 to 5267456 tid 4
2015-06-04T18:51:50.185 INFO:tasks.rados.rados.0.burnupi52.stdout:2223: read oid 30 snap -1
2015-06-04T18:51:50.185 INFO:tasks.rados.rados.0.burnupi52.stdout:2224: delete oid 50 current snap is 258
2015-06-04T18:51:51.239 INFO:tasks.thrashosds.thrasher:Moving pg 3.31s2 from osd.5 to osd.4
2015-06-04T18:51:51.240 INFO:teuthology.orchestra.run.plana29:Running: 'sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --log-file=/var/log/ceph/objectstore_tool.\\$pid.log --op import --file /home/ubuntu/cephtest/data/exp.3.31s2.5'
2015-06-04T18:51:51.414 INFO:teuthology.orchestra.run.plana29.stdout:finish_remove_pgs 3.11s1_TEMP clearing temp
2015-06-04T18:51:51.434 INFO:teuthology.orchestra.run.plana29.stdout:finish_remove_pgs 3.7s0_TEMP clearing temp
2015-06-04T18:51:51.442 INFO:teuthology.orchestra.run.plana29.stdout:finish_remove_pgs 3.2cs1_TEMP clearing temp
2015-06-04T18:51:51.451 INFO:teuthology.orchestra.run.plana29.stdout:finish_remove_pgs 3.9s0_TEMP clearing temp
2015-06-04T18:51:51.459 INFO:teuthology.orchestra.run.plana29.stdout:finish_remove_pgs 3.40s0_TEMP clearing temp
2015-06-04T18:51:51.467 INFO:teuthology.orchestra.run.plana29.stdout:finish_remove_pgs 3.32s2_TEMP clearing temp
2015-06-04T18:51:51.476 INFO:teuthology.orchestra.run.plana29.stdout:finish_remove_pgs 3.30s1_TEMP clearing temp
2015-06-04T18:51:51.484 INFO:teuthology.orchestra.run.plana29.stdout:finish_remove_pgs 3.1s0_TEMP clearing temp
2015-06-04T18:51:51.492 INFO:teuthology.orchestra.run.plana29.stdout:finish_remove_pgs 3.5s2_TEMP clearing temp

#2 Updated by David Zafman almost 9 years ago

  • Priority changed from Urgent to High

I dropped the priority because it didn't happen on a second run. This is in firefly and may have been fixed in newer releases.

I haven't determined what caused this assert from my examination of the logs.

#3 Updated by David Zafman almost 9 years ago

  • Affected Versions v0.80.10 added
  • Affected Versions deleted (0.80)

#4 Updated by Samuel Just over 8 years ago

  • Status changed from New to Can't reproduce

Also available in: Atom PDF