Bug #2999
osd: msgr crash in OSD::complete_notify
Status: Resolved
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Description
Logs: ubuntu@teuthology:/a/teuthology-2012-08-17_19:00:07-regression-master-testing-gcov/3549
0> 2012-08-18 00:09:48.191296 7fbbd527b700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fbbd527b700

 ceph version 0.50-267-gae57db0 (commit:ae57db03a9287257f3034fb045c30d5b7edff468)
 1: /tmp/cephtest/binary/usr/local/bin/ceph-osd() [0x81e0da]
 2: (()+0xfcb0) [0x7fbbe630acb0]
 3: (SimpleMessenger::_send_message(Message*, Connection*, bool)+0x264) [0x8e0a64]
 4: (SimpleMessenger::send_message(Message*, Connection*)+0x13) [0x8e5173]
 5: (OSD::complete_notify(void*, void*)+0x14d) [0x61802d]
 6: (OSD::ack_notification(entity_name_t&, void*, void*, ReplicatedPG*)+0x76) [0x6183f6]
 7: (ReplicatedPG::do_osd_op_effects(ReplicatedPG::OpContext*)+0x247e) [0x579d4e]
 8: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x6e3) [0x5b1f53]
 9: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x4650) [0x5b8a60]
 10: (PG::do_request(std::tr1::shared_ptr<OpRequest>)+0x484) [0x6f4474]
 11: (OSD::dequeue_op(PG*)+0x304) [0x611aa4]
 12: (OSD::OpWQ::_process(PG*)+0x15) [0x677ef5]
 13: (ThreadPool::WorkQueue<PG>::_void_process(void*)+0x12) [0x66e4e2]
 14: (ThreadPool::worker()+0x4db) [0x8f376b]
 15: (ThreadPool::WorkThread::entry()+0x15) [0x66fa15]
 16: (Thread::_entry_func(void*)+0x12) [0x8e6212]
 17: (()+0x7e9a) [0x7fbbe6302e9a]
 18: (clone()+0x6d) [0x7fbbe46a64bd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
ubuntu@teuthology:/a/teuthology-2012-08-17_19:00:07-regression-master-testing-gcov/3549$ cat config.yaml
kernel: &id001
  kdb: true
  sha1: 1fe5e9932156f6122c3b1ff6ba7541c27c86718c
nuke-on-error: true
overrides:
  ceph:
    conf:
      client:
        rbd cache: true
        rbd cache max dirty: 0
    coverage: true
    fs: ext4
    log-whitelist:
    - slow request
    sha1: ae57db03a9287257f3034fb045c30d5b7edff468
  workunit:
    sha1: ae57db03a9287257f3034fb045c30d5b7edff468
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana48.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC1ScpN+amKWlGYSJOt8sqzLTCXYaU5QTqB3bbss463/VRcDQwabXs6tib54CO6FzWaloSqMaPZchA4HeNXOwdQ9daBnG1b0vrqs0B5jZnCVzvK9AoWGDg0tm68PWYr1AtJcNyIsutywVRdzjA2nzSioKU59dKugVog/+pkoB0hGYvXo72pePMV00IrgMr9FSbDnxi3L9iJvi2LD0Pnecx6DMnaDob/T+X5y0piap5esjwIIq7wqvXuEJE9jdmxPfHo3ise2j9UA2SGI5b7HL3YOemo0zic7ukCMvlc8Ag5dQnjANTcj2eUJyagvzJgxGqhoxyAp/WpmaHZkvf0RasB
  ubuntu@plana62.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC+nI5/l38Kdw2W/qbEKrVMcnVdIxJG7hNnD7nnS3+Zx/uPiWrds26ZPrM5IY7D8Mf7sjBzUYbqsX9xGYMLLTQaeDwsZn/7RjjSg8zOS1aMP5F/AJzSQx4Nt37eLUsRHX3yA30/OQcl6sBgDjHyhSPcSuHWSnMmoy4pkDo3xpQMQMtxDG8gWq+to1hZwJbsiK9FdutEgPJg3inWM1WVc5L6NmRN2WQNEGT8HvtlBCWqX6/H/hLujQlbgyJAbeG4BriMV3gCIccJE833f/fN9KIzaMlD7qHTgWcaGk+LY84nUdNlTkNoX+L4m6WRY8/Pt9om2dOocsXyCwYLIS4heIDT
  ubuntu@plana65.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDTwfkF9asvpySXF/DOk10UkRDNtRwgGgLww/I/3E2r+JpsfYtW62TA1HMXjtB1g7SrcIolqCiiMd+5MIURIND94n76JiZ2o4DplLKIqUB6ys46gro7mwoeFnZNOuwdAA5bO4dfgeQ3yPtfIqpWTejkCB7ai/kG04C4ekz6EgplwtqWIfvXnij4fNaqvm3s/IxGhnO40DOGNwsAldEJo2fuJN8KHnYzsU/Dx5kJ85jQl2eQJI74VpMoh2Ge7+n9Q8rJhegfcHYPLJsX/Uyrf7Rtk1RfeTyZbIOSJIQDbQNepu278kvc9IEnFg3WfvWespfrUExVgnXq53xd1RIFcx6L
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- rbd_fsx:
    clients:
    - client.0
    ops: 20000

ubuntu@teuthology:/a/teuthology-2012-08-17_19:00:07-regression-master-testing-gcov/3549$ cat summary.yaml
ceph-sha1: ae57db03a9287257f3034fb045c30d5b7edff468
client.0-kernel-sha1: 1fe5e9932156f6122c3b1ff6ba7541c27c86718c
description: collection:rbd-thrash clusters:6-osd-3-machine.yaml fs:ext4.yaml thrashers:default.yaml workloads:rbd_fsx_cache_writethrough.yaml
duration: 7051.4660429954529
failure_reason: 'Command failed with status 1: ''/tmp/cephtest/enable-coredump /tmp/cephtest/binary/usr/local/bin/ceph-coverage /tmp/cephtest/archive/coverage /tmp/cephtest/daemon-helper term /tmp/cephtest/binary/usr/local/bin/ceph-osd -f -i 2 -c /tmp/cephtest/ceph.conf'''
flavor: gcov
mds.a-kernel-sha1: 1fe5e9932156f6122c3b1ff6ba7541c27c86718c
mon.a-kernel-sha1: 1fe5e9932156f6122c3b1ff6ba7541c27c86718c
owner: scheduled_teuthology@teuthology
success: false
Updated by Sage Weil over 11 years ago
- Subject changed from osd crash to osd: msgr crash in OSD::complete_notify
- Priority changed from Normal to High