Bug #2820

osd: crash in handle_osd_ping

Added by Sage Weil over 11 years ago. Updated over 11 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
% Done:
0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description


     0> 2012-07-20 18:32:54.154595 7fa60cfd0700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fa60cfd0700

 ceph version 0.48argonaut-477-ge84486d (commit:e84486d6962ae2b7604f2c36196dab190699b495)
 1: /tmp/cephtest/binary/usr/local/bin/ceph-osd() [0x71c49a]
 2: (()+0xfcb0) [0x7fa61bd49cb0]
 3: (OSD::handle_osd_ping(MOSDPing*)+0x73d) [0x5dc41d]
 4: (OSD::heartbeat_dispatch(Message*)+0x22b) [0x5dcd3b]
 5: (DispatchQueue::entry()+0x6b1) [0x84f901]
 6: (DispatchQueue::DispatchThread::entry()+0xd) [0x7ae87d]
 7: (()+0x7e9a) [0x7fa61bd41e9a]
 8: (clone()+0x6d) [0x7fa61a2f64bd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
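
As the NOTE says, the raw frames need the executable to be interpreted. A minimal sketch, assuming the frame layout shown in this backtrace (`N: (symbol+0xoffset) [0xaddress]`), of pulling out symbols, offsets, and addresses so they can be looked up with a tool such as `addr2line` or `objdump`. The names `parse_frames`, `FRAME_RE`, and `SYMBOL_RE` are hypothetical helpers, not part of Ceph:

```python
import re

# Frame lines look like:
#   3: (OSD::handle_osd_ping(MOSDPing*)+0x73d) [0x5dc41d]
#   1: /tmp/cephtest/binary/usr/local/bin/ceph-osd() [0x71c49a]
FRAME_RE = re.compile(r"^\s*(\d+):\s+(.*?)\s+\[(0x[0-9a-fA-F]+)\]\s*$")
# The "(symbol+0xoffset)" form; frames without it keep symbol/offset as None.
SYMBOL_RE = re.compile(r"^\((.*)\+(0x[0-9a-fA-F]+)\)$")


def parse_frames(backtrace):
    """Return one dict per frame line; non-frame lines are skipped."""
    frames = []
    for line in backtrace.splitlines():
        m = FRAME_RE.match(line)
        if not m:
            continue
        index, location, address = m.groups()
        sym = SYMBOL_RE.match(location)
        frames.append({
            "index": int(index),
            "location": location,
            "symbol": sym.group(1) if sym else None,
            "offset": int(sym.group(2), 16) if sym else None,
            "address": int(address, 16),
        })
    return frames


# Three frames copied from the backtrace in this report.
SAMPLE = """\
 1: /tmp/cephtest/binary/usr/local/bin/ceph-osd() [0x71c49a]
 3: (OSD::handle_osd_ping(MOSDPing*)+0x73d) [0x5dc41d]
 4: (OSD::heartbeat_dispatch(Message*)+0x22b) [0x5dcd3b]
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        print(frame["index"], frame["symbol"], hex(frame["address"]))
```

The addresses extracted this way could then be passed to something like `addr2line -e <executable> -f -C 0x5dc41d`, assuming a matching binary with debug symbols is available.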

This was with socket failure injection enabled (ms inject socket failures: 5000) on my testing branch.

ubuntu@teuthology:/a/sage-2012-07-20_18:17:10-regression-wip-msgr-cleanup-testing-basic/14856$   cat config.yaml 
kernel: &id001
  kdb: true
  sha1: 77dca1ac33894de22b1740bb9cf6b8ef6429c700
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: e84486d6962ae2b7604f2c36196dab190699b495
  workunit:
    sha1: e84486d6962ae2b7604f2c36196dab190699b495
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana86.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDUMsCERn/r+eXKzlu9W/n6BFak26ReRaNUaccpQTrqOqhm3M7LiIGAeBo6JsygU9Kdtsm3115P4odiDbQLuNm0gKPw5DY7zDVqV+YRqe3kOrgIL/rxs6l6Y7htSRvzGhz7RsaG1fQH/BuaD18+s2WguKWwgRuuU1bvRYEHu0Y7qYYqUy+xJd4CGX+LMuAJzDYj5R7xIEYlJBln/c47He8q53cUUw5w48Hm47O8xo9ov7CqHCpOTZixeseY+zEm9skoLsUpBDIOmT+xUh7sOyKFhj+CnETpv3DocxGtoUgkx42GtkDhFK/dVV95Q6EOBseyYIUvFoHBKB2WV32xyxB5
  ubuntu@plana87.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDLSD++iJuPg8Qqyu57Q8iCNDXVnDtN2jhfRu7X8IwluVNmlpZCFiPJ0EFu+qQmBlJ8cmcQ+GXfdmrMOivFkXTzPOM1DeFxhwZTMr5VB4wInOgxAlCSJBGFQ4jxU98K0bTutd2IVvzUknB0oaMWxOn+4fxowFimaWe1gdx0v8Anarvo2SoqXMzZ4t8OKkGAUf9J7PPSmjWGMxH0fpQK7U+K/RuonMT3LGioi6bE9hbMPkk8iB/gpxmWma1UpIpMeIq7VDyp3IEe7U+CRdihLGjv722m0v9ul6lNOzwc1RBo6BexF6zVyepkOuu9gzrHRF80cnAkM0konp6RVAZkoyOX
  ubuntu@plana89.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC8bUAPl49ehcm4WuMAuuopncl9r8nQQEslbavP46ezYvI/hVQKSERVod7L7iK3pZ9iqQUyJUCkuFIR4/b2EQ8hQ+scv12K4n45DHTS8jHOCablg3vGkLPJmvKKedVYL5KPhCj6je3f0lVc+TXOGkJJvopDwhHnrh2TmRkaGOA7svsxknvT3Qx683XlNbi3S64YmU0us7IOCbXwZTEdZphzoqHHWLe0ZKkLNL72eDujtFi6LHspUoEmHBk46RrsGbhYFwlYTq47NGbj5g5ix/iS6C7bRNNCRD9GWciKqpAvZk88Zg/LbCzsV8DK4shnJuNqhBJPgKflUuqpYVZ8SuCJ
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph: null
- ceph-fuse: null
- workunit:
    clients:
      all:
      - rados/load-gen-mostlyread.sh

gdb strangely puts us at OSD.cc:5426 (end of file) but says handle_osd_ping().  :/

History

#1 Updated by Sage Weil over 11 years ago

Also seen in ubuntu@teuthology:/a/sage-2012-07-20_18:17:10-regression-wip-msgr-cleanup-testing-basic/14885

- chef: null
- clock: null
- ceph: null
- qemu:
    all:
      test: https://raw.github.com/ceph/ceph/master/qa/workunits/suites/bonnie.sh

#2 Updated by Sage Weil over 11 years ago

  • Status changed from New to Resolved
