Bug #5303

closed

OSD segfaults on SIGINT

Added by Jérôme Poulin almost 11 years ago. Updated almost 11 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
Backport:
Regression:
Severity:
4 - irritation
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is not the first time I have seen this: interrupting the OSD with SIGINT (Ctrl+C) causes a segmentation fault.

Cuttlefish 0.61.3

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffeeffd700 (LWP 4205)]
0x00000000006fb348 in exists (osd=4, this=0x0) at ./osd/OSDMap.h:301
301        return osd >= 0 && osd < max_osd && (osd_state[osd] & CEPH_OSD_EXISTS);
(gdb) bt
#0  0x00000000006fb348 in exists (osd=4, this=0x0) at ./osd/OSDMap.h:301
#1  is_up (osd=4, this=0x0) at ./osd/OSDMap.h:305
#2  OSDMap::get_inst (this=0x0, osd=4) at ./osd/OSDMap.h:351
#3  0x00000000006b238b in OSDService::prepare_to_stop (this=this@entry=0x14d57b8) at osd/OSD.cc:4087
#4  0x00000000006b83b8 in OSD::shutdown (this=this@entry=0x14d41e0) at osd/OSD.cc:1307
#5  0x00000000006b9d4b in OSD::handle_signal (this=0x14d41e0, signum=<optimized out>) at osd/OSD.cc:952
#6  0x0000000000842438 in SignalHandler::entry (this=0x14d6760) at global/signal_handler.cc:224
#7  0x00007ffff79c2f8e in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8  0x00007ffff5ee2e1d in clone () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) 
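
In frames #0 to #2 the OSDMap `this` pointer is 0x0, and the call comes from OSDService::prepare_to_stop. That suggests the shutdown path ran before any map had been loaded. A minimal sketch of the kind of guard that avoids this class of crash follows. It is not the actual Ceph fix: MiniMap, MiniService and should_notify_on_stop are hypothetical stand-ins invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for OSDMap: only records which OSD ids are up.
struct MiniMap {
    std::vector<bool> up;  // up[i] is true when osd i is marked up

    bool is_up(int osd) const {
        return osd >= 0 && osd < static_cast<int>(up.size()) && up[osd];
    }
};

// Hypothetical stand-in for OSDService. The map pointer stays null
// until the daemon has loaded its first map during startup.
struct MiniService {
    int whoami;
    std::shared_ptr<const MiniMap> osdmap;

    // Decide whether the stop path should consult the map at all.
    // With no map loaded there is nothing to look up, so return false
    // instead of dereferencing a null pointer.
    bool should_notify_on_stop() const {
        if (!osdmap) {
            return false;  // signal arrived before a map was available
        }
        return osdmap->is_up(whoami);
    }
};
```

With a null map, `MiniService{4, nullptr}.should_notify_on_stop()` returns false rather than crashing. Once a map that marks osd 4 as up is attached, it returns true.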

Actions #1

Updated by Jérôme Poulin almost 11 years ago

Without debugger:

root@Ceph4:~# ceph-osd -i 4 -d --debug_ms=20 --debug_osd=20 --debug_journal=20 --debug_filestore 20
2013-06-11 15:38:46.246008 7fa18439e7c0  0 ceph version 0.61.3 (92b1e398576d55df8e5888dd1a9545ed3fd99532), process ceph-osd, pid 4238
starting osd.4 at :/0 osd_data /var/lib/ceph/osd/ceph-4 /dev/vgDebian/journal
2013-06-11 15:38:46.246791 7fa18439e7c0 10 -- :/0 rank.bind :/0
2013-06-11 15:38:46.246808 7fa18439e7c0 10 accepter.accepter.bind
2013-06-11 15:38:46.246836 7fa18439e7c0 10 accepter.accepter.bind bound on random port 0.0.0.0:6800/0
2013-06-11 15:38:46.246861 7fa18439e7c0 10 accepter.accepter.bind bound to 0.0.0.0:6800/0
2013-06-11 15:38:46.246878 7fa18439e7c0  1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6800/4238 need_addr=1
2013-06-11 15:38:46.246891 7fa18439e7c0 10 -- :/0 rank.bind :/0
2013-06-11 15:38:46.246895 7fa18439e7c0 10 accepter.accepter.bind
2013-06-11 15:38:46.246904 7fa18439e7c0 10 accepter.accepter.bind bound on random port 0.0.0.0:6801/0
2013-06-11 15:38:46.246909 7fa18439e7c0 10 accepter.accepter.bind bound to 0.0.0.0:6801/0
2013-06-11 15:38:46.246918 7fa18439e7c0  1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6801/4238 need_addr=1
2013-06-11 15:38:46.246924 7fa18439e7c0 10 -- :/0 rank.bind :/0
2013-06-11 15:38:46.246927 7fa18439e7c0 10 accepter.accepter.bind
2013-06-11 15:38:46.246935 7fa18439e7c0 10 accepter.accepter.bind bound on random port 0.0.0.0:6802/0
2013-06-11 15:38:46.246940 7fa18439e7c0 10 accepter.accepter.bind bound to 0.0.0.0:6802/0
2013-06-11 15:38:46.246949 7fa18439e7c0  1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6802/4238 need_addr=1
2013-06-11 15:38:46.251148 7fa18439e7c0  5 filestore(/var/lib/ceph/osd/ceph-4) basedir /var/lib/ceph/osd/ceph-4 journal /dev/vgDebian/journal
2013-06-11 15:38:46.251217 7fa18439e7c0 10 filestore(/var/lib/ceph/osd/ceph-4) mount fsid is 52443e60-7471-402a-b46f-b905843c816f
2013-06-11 15:38:46.315732 7fa18439e7c0  0 filestore(/var/lib/ceph/osd/ceph-4) mount FIEMAP ioctl is supported and appears to work
2013-06-11 15:38:46.315756 7fa18439e7c0  0 filestore(/var/lib/ceph/osd/ceph-4) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
2013-06-11 15:38:46.316136 7fa18439e7c0  0 filestore(/var/lib/ceph/osd/ceph-4) mount detected btrfs
2013-06-11 15:38:46.316153 7fa18439e7c0 20 filestore(/var/lib/ceph/osd/ceph-4) _do_clone_range 0~1 to 0
2013-06-11 15:38:46.316163 7fa18439e7c0  0 filestore(/var/lib/ceph/osd/ceph-4) mount btrfs CLONE_RANGE ioctl is supported
2013-06-11 15:38:46.604933 7fa18439e7c0  0 filestore(/var/lib/ceph/osd/ceph-4) mount btrfs SNAP_CREATE is supported
2013-06-11 15:38:46.605147 7fa18439e7c0  0 filestore(/var/lib/ceph/osd/ceph-4) mount btrfs SNAP_DESTROY is supported
2013-06-11 15:38:46.605504 7fa18439e7c0  0 filestore(/var/lib/ceph/osd/ceph-4) mount btrfs START_SYNC is supported (transid 164747)
2013-06-11 15:38:46.746549 7fa18439e7c0  0 filestore(/var/lib/ceph/osd/ceph-4) mount btrfs WAIT_SYNC is supported
2013-06-11 15:38:46.756082 7fa18439e7c0  0 filestore(/var/lib/ceph/osd/ceph-4) mount btrfs SNAP_CREATE_V2 is supported
2013-06-11 15:38:47.040322 7fa18439e7c0  0 filestore(/var/lib/ceph/osd/ceph-4) mount syncfs(2) syscall fully supported (by glibc and kernel)
2013-06-11 15:38:47.040476 7fa18439e7c0  0 filestore(/var/lib/ceph/osd/ceph-4) mount found snaps <3185973,3185974,3185975>
2013-06-11 15:38:47.040519 7fa18439e7c0 10 filestore(/var/lib/ceph/osd/ceph-4)  current/ seq was 3185975
2013-06-11 15:38:47.040526 7fa18439e7c0 10 filestore(/var/lib/ceph/osd/ceph-4)  most recent snap from <3185973,3185974,3185975> is 3185975
2013-06-11 15:38:47.040543 7fa18439e7c0 10 filestore(/var/lib/ceph/osd/ceph-4) mount rolling back to consistent snap 3185975
2013-06-11 15:38:47.215428 7fa18439e7c0  5 filestore(/var/lib/ceph/osd/ceph-4) mount op_seq is 3185975
2013-06-11 15:38:47.345216 7fa18439e7c0 20 filestore (init)dbobjectmap: seq is 136166
2013-06-11 15:38:47.345262 7fa18439e7c0 10 filestore(/var/lib/ceph/osd/ceph-4) open_journal at /dev/vgDebian/journal
2013-06-11 15:38:47.345303 7fa18439e7c0  0 filestore(/var/lib/ceph/osd/ceph-4) mount: PARALLEL journal mode explicitly enabled in conf
2013-06-11 15:38:47.345310 7fa18439e7c0 10 filestore(/var/lib/ceph/osd/ceph-4) list_collections
2013-06-11 15:38:47.364347 7fa18439e7c0 10 journal journal_replay fs op_seq 3185975
2013-06-11 15:38:47.364359 7fa18439e7c0  2 journal open /dev/vgDebian/journal fsid 52443e60-7471-402a-b46f-b905843c816f fs_op_seq 3185975
2013-06-11 15:38:47.364402 7fa18439e7c0 10 journal _open_block_device: ignoring osd journal size. We'll use the entire block device (size: 5368709120)
2013-06-11 15:38:47.364437 7fa17eea2700 20 filestore(/var/lib/ceph/osd/ceph-4) sync_entry waiting for max_interval 300.000000
sh: 1: /sbin/hdparm: not found
2013-06-11 15:38:47.366527 7fa18439e7c0 -1 journal _check_disk_write_cache: fclose error: (61) No data available
2013-06-11 15:38:47.366573 7fa18439e7c0  1 journal _open /dev/vgDebian/journal fd 17: 5368709120 bytes, block size 4096 bytes, directio = 0, aio = 0
2013-06-11 15:38:47.366586 7fa18439e7c0 10 journal read_header
2013-06-11 15:38:47.367168 7fa18439e7c0 10 journal header: block_size 4096 alignment 4096 max_size 5368709120
2013-06-11 15:38:47.367180 7fa18439e7c0 10 journal header: start 1175015424
2013-06-11 15:38:47.367182 7fa18439e7c0 10 journal  write_pos 4096
2013-06-11 15:38:47.367185 7fa18439e7c0 10 journal open header.fsid = 52443e60-7471-402a-b46f-b905843c816f
2013-06-11 15:38:47.367495 7fa18439e7c0  2 journal read_entry 1175019520 : seq 3185975 57 bytes
2013-06-11 15:38:47.367773 7fa18439e7c0  2 journal No further valid entries found, journal is most likely valid
2013-06-11 15:38:47.367782 7fa18439e7c0 10 journal open reached end of journal.
2013-06-11 15:38:47.367810 7fa18439e7c0  2 journal No further valid entries found, journal is most likely valid
2013-06-11 15:38:47.367814 7fa18439e7c0  3 journal journal_replay: end of journal, done.
2013-06-11 15:38:47.367894 7fa18439e7c0 10 journal _open_block_device: ignoring osd journal size. We'll use the entire block device (size: 5368709120)
sh: 1: /sbin/hdparm: not found
2013-06-11 15:38:47.369835 7fa18439e7c0 -1 journal _check_disk_write_cache: fclose error: (61) No data available
2013-06-11 15:38:47.369884 7fa18439e7c0  1 journal _open /dev/vgDebian/journal fd 17: 5368709120 bytes, block size 4096 bytes, directio = 0, aio = 0
2013-06-11 15:38:47.370010 7fa18439e7c0 10 journal journal_start
2013-06-11 15:38:47.369997 7fa17e6a1700 10 journal write_thread_entry start
2013-06-11 15:38:47.370011 7fa17dea0700 10 journal write_finish_thread_entry enter
2013-06-11 15:38:47.370025 7fa17e6a1700 20 journal write_thread_entry going to sleep
2013-06-11 15:38:47.370027 7fa17dea0700 20 journal write_finish_thread_entry sleeping
2013-06-11 15:38:47.370282 7fa17be9c700 20 filestore(/var/lib/ceph/osd/ceph-4) flusher_entry start
2013-06-11 15:38:47.370312 7fa17be9c700 20 filestore(/var/lib/ceph/osd/ceph-4) flusher_entry sleeping
2013-06-11 15:38:47.370379 7fa18439e7c0  5 filestore(/var/lib/ceph/osd/ceph-4) umount /var/lib/ceph/osd/ceph-4
2013-06-11 15:38:47.370416 7fa17eea2700 20 filestore(/var/lib/ceph/osd/ceph-4) sync_entry force_sync set
2013-06-11 15:38:47.370446 7fa17eea2700 10 journal commit_start max_applied_seq 3185975, open_ops 0
2013-06-11 15:38:47.370450 7fa17eea2700 10 journal commit_start blocked, all open_ops have completed
2013-06-11 15:38:47.370452 7fa17eea2700 10 journal commit_start nothing to do
2013-06-11 15:38:47.370454 7fa17eea2700 10 journal commit_start
2013-06-11 15:38:47.370460 7fa17be9c700 20 filestore(/var/lib/ceph/osd/ceph-4) flusher_entry awoke
2013-06-11 15:38:47.370468 7fa17be9c700 20 filestore(/var/lib/ceph/osd/ceph-4) flusher_entry finish
2013-06-11 15:38:47.370638 7fa18439e7c0 10 journal journal_stop
2013-06-11 15:38:47.370696 7fa18439e7c0  1 journal close /dev/vgDebian/journal
2013-06-11 15:38:47.370711 7fa17e6a1700 20 journal write_thread_entry woke up
2013-06-11 15:38:47.370719 7fa17e6a1700 10 journal write_thread_entry finish
2013-06-11 15:38:47.370732 7fa17dea0700 10 journal write_finish_thread_entry exit
2013-06-11 15:38:47.372679 7fa18439e7c0  5 filestore(/var/lib/ceph/osd/ceph-4) test_mount basedir /var/lib/ceph/osd/ceph-4 journal /dev/vgDebian/journal^CSegmentation fault (core dumped)
root@Ceph4:~#

Actions #2

Updated by Sage Weil almost 11 years ago

  • Status changed from New to Resolved

This was a missed backport of an old fix. I pushed it to the cuttlefish branch, and it will be included in 0.61.4. Thanks!
