Bug #10294

closed

ceph-osd --mkfs segfaults when journal dev is a symlink

Added by Hector Martin over 9 years ago. Updated over 9 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
OSD
Target version:
-
% Done:
0%

Source:
other
Tags:
Backport:
Regression:
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Version: git master.

This happens when activating an OSD normally with ceph-disk (ceph-disk activate /dev/sdd1).

ceph-osd command line invoked by ceph-disk:
/usr/bin/ceph-osd --cluster ceph --mkfs --mkkey -i 8 --monmap /var/lib/ceph/tmp/mnt.iuEXBB/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.iuEXBB --osd-journal /var/lib/ceph/tmp/mnt.iuEXBB/journal --osd-uuid f643e2ff-bec1-49ae-996a-fa85b40a2cbd --keyring /var/lib/ceph/tmp/mnt.iuEXBB/keyring

Valgrind output:
==20972== Invalid read of size 1
==20972==    at 0x4C2E781: __strncpy_sse2_unaligned (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==20972==    by 0xB88679: block_device_support_discard(char const*) (in /usr/bin/ceph-osd)
==20972==    by 0xA4611A: FileJournal::_open_block_device() (in /usr/bin/ceph-osd)
==20972==    by 0xA47429: FileJournal::_open(bool, bool) (in /usr/bin/ceph-osd)
==20972==    by 0xA4D44F: FileJournal::check() (in /usr/bin/ceph-osd)
==20972==    by 0x900640: FileStore::mkjournal() (in /usr/bin/ceph-osd)
==20972==    by 0x8EFCFE: FileStore::mkfs() (in /usr/bin/ceph-osd)
==20972==    by 0x67C7E9: OSD::mkfs(CephContext*, ObjectStore*, std::string const&, uuid_d, int) (in /usr/bin/ceph-osd)
==20972==    by 0x63F78E: main (in /usr/bin/ceph-osd)
==20972==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==20972==
*** Caught signal (Segmentation fault) **

The bug is here:
https://github.com/ceph/ceph/blob/7d299528b54d3beaf14f12f7592c1a29b6cad6f0/src/common/blkdev.cc#L28

If devname is a symlink whose path does not contain "sd" (as with the journal path above), strstr returns NULL, and strncpy then segfaults reading from the NULL pointer.
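
To illustrate the failure mode, here is a minimal sketch of the same strstr/strncpy pattern with the missing NULL check added. The helper name name_from_sd and the 32-byte buffer are assumptions for illustration, not the code in blkdev.cc, and the guard only avoids the crash; it does not make the filename-matching approach correct.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper (not the Ceph function): returns the part of `devname`
// starting at the first "sd", or an empty string when there is no "sd".
std::string name_from_sd(const char *devname)
{
  if (devname == nullptr)
    return std::string();

  const char *p = std::strstr(devname, "sd");
  if (p == nullptr)  // no "sd" in the path, e.g. a symlink named "journal"
    return std::string();

  char name[32];  // buffer size is an assumption for this sketch
  std::strncpy(name, p, sizeof(name) - 1);
  name[sizeof(name) - 1] = '\0';  // strncpy does not always terminate
  return std::string(name);
}
```

With this guard, "/dev/sdd1" yields "sdd1", while the journal symlink path from the command line above yields an empty string instead of crashing.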

This is the wrong approach; munging filenames to try to guess the right path under /sys/block/ is not a good idea. It makes a lot more sense to stat() the device and look it up in /sys/dev/block/<major>:<minor>/. It should either try both /sys/dev/block/<major>:<minor>/queue/discard_granularity and /sys/dev/block/<major>:<minor>/../queue/discard_granularity (the latter for partitions), or look for the presence of a /sys/dev/block/<major>:<minor>/partition file first to decide which path to try.
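
A rough sketch of the first variant (try both paths) might look like the following. It is not the fix that went into master. All function names here are hypothetical, it assumes Linux with glibc's major()/minor() macros from <sys/sysmacros.h>, and treating a non-zero discard_granularity as "discard supported" is an assumption of this sketch.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

// Hypothetical helper: the two candidate sysfs paths for a device number.
// The second one ("/../queue") is the one that applies to partitions.
std::vector<std::string> discard_granularity_candidates(unsigned maj, unsigned mnr)
{
  const std::string base =
      "/sys/dev/block/" + std::to_string(maj) + ":" + std::to_string(mnr);
  return {base + "/queue/discard_granularity",
          base + "/../queue/discard_granularity"};
}

// Hypothetical replacement sketch: stat() follows symlinks, so a journal
// symlink resolves to the underlying block device without parsing its name.
bool block_device_support_discard_sketch(const char *path)
{
  struct stat st;
  if (path == nullptr || ::stat(path, &st) != 0)
    return false;
  if (!S_ISBLK(st.st_mode))  // regular file or directory: not a block device
    return false;

  const unsigned maj = major(st.st_rdev);
  const unsigned mnr = minor(st.st_rdev);
  for (const std::string &candidate : discard_granularity_candidates(maj, mnr)) {
    std::ifstream f(candidate);
    unsigned long long granularity = 0;
    if (f >> granularity)      // first readable candidate decides
      return granularity > 0;  // assumption: 0 means no discard support
  }
  return false;
}
```

Because the lookup is keyed on the device number rather than the file name, it behaves the same for /dev/sdd1, a symlink to it, or a device whose name does not contain "sd".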

Actions #1

Updated by Sage Weil over 9 years ago

  • Status changed from New to Resolved

fixed in master
