Bug #5303
OSD segfaults on SIGINT
Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Severity: 4 - irritation
Description
This is not the first time I have seen this: interrupting the OSD with SIGINT (Ctrl+C) causes a segmentation fault.
Cuttlefish 0.61.3
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffeeffd700 (LWP 4205)]
0x00000000006fb348 in exists (osd=4, this=0x0) at ./osd/OSDMap.h:301
301     return osd >= 0 && osd < max_osd && (osd_state[osd] & CEPH_OSD_EXISTS);
(gdb) bt
#0  0x00000000006fb348 in exists (osd=4, this=0x0) at ./osd/OSDMap.h:301
#1  is_up (osd=4, this=0x0) at ./osd/OSDMap.h:305
#2  OSDMap::get_inst (this=0x0, osd=4) at ./osd/OSDMap.h:351
#3  0x00000000006b238b in OSDService::prepare_to_stop (this=this@entry=0x14d57b8) at osd/OSD.cc:4087
#4  0x00000000006b83b8 in OSD::shutdown (this=this@entry=0x14d41e0) at osd/OSD.cc:1307
#5  0x00000000006b9d4b in OSD::handle_signal (this=0x14d41e0, signum=<optimized out>) at osd/OSD.cc:952
#6  0x0000000000842438 in SignalHandler::entry (this=0x14d6760) at global/signal_handler.cc:224
#7  0x00007ffff79c2f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8  0x00007ffff5ee2e1d in clone () from /lib/x86_64-linux-gnu/libc.so.6
(gdb)
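Note on the backtrace above: frames #0 to #2 all show this=0x0, so OSDService::prepare_to_stop is calling into an OSDMap pointer that is still null when the signal arrives. The following is only a minimal sketch of the kind of null guard that avoids such a dereference. MiniOSDMap, MiniService, kExists and kUp are hypothetical stand-ins invented for illustration; this is not Ceph code and not necessarily what the backported fix does.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for an OSD map; NOT the real Ceph OSDMap class.
struct MiniOSDMap {
  enum : int { kExists = 1, kUp = 2 };  // illustrative state bits
  int max_osd = 0;
  std::vector<int> osd_state;

  bool exists(int osd) const {
    return osd >= 0 && osd < max_osd && (osd_state[osd] & kExists);
  }
  bool is_up(int osd) const {
    return exists(osd) && (osd_state[osd] & kUp);
  }
};

// Hypothetical stand-in for the service object. The map pointer stays null
// until startup has loaded a map, which is the window in which a SIGINT
// would otherwise hit a null dereference.
struct MiniService {
  std::shared_ptr<MiniOSDMap> osdmap;  // null during early startup
  int whoami = 0;

  // Returns true if peers should be told we are going down, false if there
  // is nothing to announce (no map yet, or we are not marked up).
  bool prepare_to_stop() const {
    if (!osdmap) {
      return false;  // guard: shutdown requested before any map was loaded
    }
    return osdmap->is_up(whoami);
  }
};
```

With this guard, calling prepare_to_stop() on a MiniService whose osdmap has not been set simply returns false instead of crashing.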
Updated by Jérôme Poulin almost 11 years ago
Without debugger:
root@Ceph4:~# ceph-osd -i 4 -d --debug_ms=20 --debug_osd=20 --debug_journal=20 --debug_filestore 20
2013-06-11 15:38:46.246008 7fa18439e7c0 0 ceph version 0.61.3 (92b1e398576d55df8e5888dd1a9545ed3fd99532), process ceph-osd, pid 4238
starting osd.4 at :/0 osd_data /var/lib/ceph/osd/ceph-4 /dev/vgDebian/journal
2013-06-11 15:38:46.246791 7fa18439e7c0 10 -- :/0 rank.bind :/0
2013-06-11 15:38:46.246808 7fa18439e7c0 10 accepter.accepter.bind
2013-06-11 15:38:46.246836 7fa18439e7c0 10 accepter.accepter.bind bound on random port 0.0.0.0:6800/0
2013-06-11 15:38:46.246861 7fa18439e7c0 10 accepter.accepter.bind bound to 0.0.0.0:6800/0
2013-06-11 15:38:46.246878 7fa18439e7c0 1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6800/4238 need_addr=1
2013-06-11 15:38:46.246891 7fa18439e7c0 10 -- :/0 rank.bind :/0
2013-06-11 15:38:46.246895 7fa18439e7c0 10 accepter.accepter.bind
2013-06-11 15:38:46.246904 7fa18439e7c0 10 accepter.accepter.bind bound on random port 0.0.0.0:6801/0
2013-06-11 15:38:46.246909 7fa18439e7c0 10 accepter.accepter.bind bound to 0.0.0.0:6801/0
2013-06-11 15:38:46.246918 7fa18439e7c0 1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6801/4238 need_addr=1
2013-06-11 15:38:46.246924 7fa18439e7c0 10 -- :/0 rank.bind :/0
2013-06-11 15:38:46.246927 7fa18439e7c0 10 accepter.accepter.bind
2013-06-11 15:38:46.246935 7fa18439e7c0 10 accepter.accepter.bind bound on random port 0.0.0.0:6802/0
2013-06-11 15:38:46.246940 7fa18439e7c0 10 accepter.accepter.bind bound to 0.0.0.0:6802/0
2013-06-11 15:38:46.246949 7fa18439e7c0 1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6802/4238 need_addr=1
2013-06-11 15:38:46.251148 7fa18439e7c0 5 filestore(/var/lib/ceph/osd/ceph-4) basedir /var/lib/ceph/osd/ceph-4 journal /dev/vgDebian/journal
2013-06-11 15:38:46.251217 7fa18439e7c0 10 filestore(/var/lib/ceph/osd/ceph-4) mount fsid is 52443e60-7471-402a-b46f-b905843c816f
2013-06-11 15:38:46.315732 7fa18439e7c0 0 filestore(/var/lib/ceph/osd/ceph-4) mount FIEMAP ioctl is supported and appears to work
2013-06-11 15:38:46.315756 7fa18439e7c0 0 filestore(/var/lib/ceph/osd/ceph-4) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
2013-06-11 15:38:46.316136 7fa18439e7c0 0 filestore(/var/lib/ceph/osd/ceph-4) mount detected btrfs
2013-06-11 15:38:46.316153 7fa18439e7c0 20 filestore(/var/lib/ceph/osd/ceph-4) _do_clone_range 0~1 to 0
2013-06-11 15:38:46.316163 7fa18439e7c0 0 filestore(/var/lib/ceph/osd/ceph-4) mount btrfs CLONE_RANGE ioctl is supported
2013-06-11 15:38:46.604933 7fa18439e7c0 0 filestore(/var/lib/ceph/osd/ceph-4) mount btrfs SNAP_CREATE is supported
2013-06-11 15:38:46.605147 7fa18439e7c0 0 filestore(/var/lib/ceph/osd/ceph-4) mount btrfs SNAP_DESTROY is supported
2013-06-11 15:38:46.605504 7fa18439e7c0 0 filestore(/var/lib/ceph/osd/ceph-4) mount btrfs START_SYNC is supported (transid 164747)
2013-06-11 15:38:46.746549 7fa18439e7c0 0 filestore(/var/lib/ceph/osd/ceph-4) mount btrfs WAIT_SYNC is supported
2013-06-11 15:38:46.756082 7fa18439e7c0 0 filestore(/var/lib/ceph/osd/ceph-4) mount btrfs SNAP_CREATE_V2 is supported
2013-06-11 15:38:47.040322 7fa18439e7c0 0 filestore(/var/lib/ceph/osd/ceph-4) mount syncfs(2) syscall fully supported (by glibc and kernel)
2013-06-11 15:38:47.040476 7fa18439e7c0 0 filestore(/var/lib/ceph/osd/ceph-4) mount found snaps <3185973,3185974,3185975>
2013-06-11 15:38:47.040519 7fa18439e7c0 10 filestore(/var/lib/ceph/osd/ceph-4) current/ seq was 3185975
2013-06-11 15:38:47.040526 7fa18439e7c0 10 filestore(/var/lib/ceph/osd/ceph-4) most recent snap from <3185973,3185974,3185975> is 3185975
2013-06-11 15:38:47.040543 7fa18439e7c0 10 filestore(/var/lib/ceph/osd/ceph-4) mount rolling back to consistent snap 3185975
2013-06-11 15:38:47.215428 7fa18439e7c0 5 filestore(/var/lib/ceph/osd/ceph-4) mount op_seq is 3185975
2013-06-11 15:38:47.345216 7fa18439e7c0 20 filestore (init)dbobjectmap: seq is 136166
2013-06-11 15:38:47.345262 7fa18439e7c0 10 filestore(/var/lib/ceph/osd/ceph-4) open_journal at /dev/vgDebian/journal
2013-06-11 15:38:47.345303 7fa18439e7c0 0 filestore(/var/lib/ceph/osd/ceph-4) mount: PARALLEL journal mode explicitly enabled in conf
2013-06-11 15:38:47.345310 7fa18439e7c0 10 filestore(/var/lib/ceph/osd/ceph-4) list_collections
2013-06-11 15:38:47.364347 7fa18439e7c0 10 journal journal_replay fs op_seq 3185975
2013-06-11 15:38:47.364359 7fa18439e7c0 2 journal open /dev/vgDebian/journal fsid 52443e60-7471-402a-b46f-b905843c816f fs_op_seq 3185975
2013-06-11 15:38:47.364402 7fa18439e7c0 10 journal _open_block_device: ignoring osd journal size. We'll use the entire block device (size: 5368709120)
2013-06-11 15:38:47.364437 7fa17eea2700 20 filestore(/var/lib/ceph/osd/ceph-4) sync_entry waiting for max_interval 300.000000
sh: 1: /sbin/hdparm: not found
2013-06-11 15:38:47.366527 7fa18439e7c0 -1 journal _check_disk_write_cache: fclose error: (61) No data available
2013-06-11 15:38:47.366573 7fa18439e7c0 1 journal _open /dev/vgDebian/journal fd 17: 5368709120 bytes, block size 4096 bytes, directio = 0, aio = 0
2013-06-11 15:38:47.366586 7fa18439e7c0 10 journal read_header
2013-06-11 15:38:47.367168 7fa18439e7c0 10 journal header: block_size 4096 alignment 4096 max_size 5368709120
2013-06-11 15:38:47.367180 7fa18439e7c0 10 journal header: start 1175015424
2013-06-11 15:38:47.367182 7fa18439e7c0 10 journal write_pos 4096
2013-06-11 15:38:47.367185 7fa18439e7c0 10 journal open header.fsid = 52443e60-7471-402a-b46f-b905843c816f
2013-06-11 15:38:47.367495 7fa18439e7c0 2 journal read_entry 1175019520 : seq 3185975 57 bytes
2013-06-11 15:38:47.367773 7fa18439e7c0 2 journal No further valid entries found, journal is most likely valid
2013-06-11 15:38:47.367782 7fa18439e7c0 10 journal open reached end of journal.
2013-06-11 15:38:47.367810 7fa18439e7c0 2 journal No further valid entries found, journal is most likely valid
2013-06-11 15:38:47.367814 7fa18439e7c0 3 journal journal_replay: end of journal, done.
2013-06-11 15:38:47.367894 7fa18439e7c0 10 journal _open_block_device: ignoring osd journal size. We'll use the entire block device (size: 5368709120)
sh: 1: /sbin/hdparm: not found
2013-06-11 15:38:47.369835 7fa18439e7c0 -1 journal _check_disk_write_cache: fclose error: (61) No data available
2013-06-11 15:38:47.369884 7fa18439e7c0 1 journal _open /dev/vgDebian/journal fd 17: 5368709120 bytes, block size 4096 bytes, directio = 0, aio = 0
2013-06-11 15:38:47.370010 7fa18439e7c0 10 journal journal_start
2013-06-11 15:38:47.369997 7fa17e6a1700 10 journal write_thread_entry start
2013-06-11 15:38:47.370011 7fa17dea0700 10 journal write_finish_thread_entry enter
2013-06-11 15:38:47.370025 7fa17e6a1700 20 journal write_thread_entry going to sleep
2013-06-11 15:38:47.370027 7fa17dea0700 20 journal write_finish_thread_entry sleeping
2013-06-11 15:38:47.370282 7fa17be9c700 20 filestore(/var/lib/ceph/osd/ceph-4) flusher_entry start
2013-06-11 15:38:47.370312 7fa17be9c700 20 filestore(/var/lib/ceph/osd/ceph-4) flusher_entry sleeping
2013-06-11 15:38:47.370379 7fa18439e7c0 5 filestore(/var/lib/ceph/osd/ceph-4) umount /var/lib/ceph/osd/ceph-4
2013-06-11 15:38:47.370416 7fa17eea2700 20 filestore(/var/lib/ceph/osd/ceph-4) sync_entry force_sync set
2013-06-11 15:38:47.370446 7fa17eea2700 10 journal commit_start max_applied_seq 3185975, open_ops 0
2013-06-11 15:38:47.370450 7fa17eea2700 10 journal commit_start blocked, all open_ops have completed
2013-06-11 15:38:47.370452 7fa17eea2700 10 journal commit_start nothing to do
2013-06-11 15:38:47.370454 7fa17eea2700 10 journal commit_start
2013-06-11 15:38:47.370460 7fa17be9c700 20 filestore(/var/lib/ceph/osd/ceph-4) flusher_entry awoke
2013-06-11 15:38:47.370468 7fa17be9c700 20 filestore(/var/lib/ceph/osd/ceph-4) flusher_entry finish
2013-06-11 15:38:47.370638 7fa18439e7c0 10 journal journal_stop
2013-06-11 15:38:47.370696 7fa18439e7c0 1 journal close /dev/vgDebian/journal
2013-06-11 15:38:47.370711 7fa17e6a1700 20 journal write_thread_entry woke up
2013-06-11 15:38:47.370719 7fa17e6a1700 10 journal write_thread_entry finish
2013-06-11 15:38:47.370732 7fa17dea0700 10 journal write_finish_thread_entry exit
2013-06-11 15:38:47.372679 7fa18439e7c0 5 filestore(/var/lib/ceph/osd/ceph-4) test_mount basedir /var/lib/ceph/osd/ceph-4 journal /dev/vgDebian/journal
^CSegmentation fault (core dumped)
root@Ceph4:~#
Updated by Sage Weil almost 11 years ago
- Status changed from New to Resolved
This was a missed backport for an old fix. I pushed it to the cuttlefish branch and it will be included in 0.61.4. Thanks!