Bug #8912
Closed
librbd segfaults when creating new image (rbd-ephemeral-clone-stable-icehouse)
Description
Background:
Installed OpenStack 2014.1.1 on Ubuntu 14.04 (after apt-get update and upgrade)
- glance with ceph
- nova with ceph
- installed without cinder
I can upload images to glance, and I can also boot new instances from ceph.
Then I deleted /usr/lib/python2.7/dist-packages/nova and replaced it with the nova folder from the angdraug git repo (branch rbd-ephemeral-clone-stable-icehouse).
After rebooting the hypervisor:
- Existing instances (backed by ceph) start, but new instances cannot be launched
nova-compute dies with:
nova-compute[18324]: segfault at 125 ip 00007fc1502c27d1 sp 00007fc13effcd58 error 4 in librbd.so.1.0.0[7fc150290000+e8000]
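For anyone triaging a similar crash, the kernel segfault line above can be decoded into an offset inside librbd.so.1.0.0 and a fault type. The following is only a sketch: the helper name decode_segfault and the regular expression are illustrative, not part of any Ceph or nova tooling, and the error-code bit meanings assume the x86 page-fault convention (bit 0: page present, bit 1: write access, bit 2: user mode).

```python
import re

# Pattern for a Linux x86 kernel segfault line. Only the part from
# "segfault at" onward is matched, so the process name and pid prefix
# may be rendered in any form.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+?)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split a segfault line into useful fields."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    err = int(m.group("err"), 16)    # the kernel prints the code in hex
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)  # load address of the mapping
    return {
        "library": m.group("lib"),
        "fault_address": int(m.group("addr"), 16),
        "offset_in_library": ip - base,
        "page_present": bool(err & 1),  # 0: page was not mapped
        "write_access": bool(err & 2),  # 0: the access was a read
        "user_mode": bool(err & 4),     # 1: fault raised in user space
    }


line = ("segfault at 125 ip 00007fc1502c27d1 sp 00007fc13effcd58 "
        "error 4 in librbd.so.1.0.0[7fc150290000+e8000]")
info = decode_segfault(line)
print(info["library"], hex(info["offset_in_library"]))
```

For the line in this report, the instruction pointer is at offset 0x327d1 inside librbd.so.1.0.0, and error 4 indicates a user-mode read of an unmapped page at address 0x125, which is consistent with a pointer that is invalid or near null. With matching debug symbols installed, that offset could be passed to a tool such as addr2line to find the source line.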
Backtrace:
http://paste.openstack.org/show/86443/
Ceph versions:
- default Ubuntu 14.04 ceph version
- 0.80.2
- 0.80.4
Debug:
I have reproduced the issue outside nova-compute here: https://github.com/Lupul/nova-ceph-ephemeral-clone-test
(The crash happens when opening a non-existent rbd image; with an existing image there is no problem.)
- branch: test-ok never faults
- branch: test-fault always faults
librbd fails here:
/usr/lib/python2.7/dist-packages/rbd.py line 364:
ret = self.librbd.rbd_open_read_only(ioctx.io, c_char_p(name), byref(self.image), c_char_p(snapshot))
Updated by Josh Durgin over 9 years ago
- Status changed from New to In Progress
Excellent report, your reproducer causes the same crash for me.
Updated by Josh Durgin over 9 years ago
- Backport set to dumpling, firefly
Looks like it was a race condition in a previously little-used error path.
Updated by Josh Durgin over 9 years ago
- Status changed from In Progress to Fix Under Review
Updated by Sage Weil over 9 years ago
- Status changed from Fix Under Review to Pending Backport
- Source changed from other to Community (user)
Updated by Sage Weil over 9 years ago
- Status changed from Pending Backport to Resolved
Updated by Josh Durgin over 9 years ago
For better searchability, the backtrace for this crash is:
#0 ceph::log::SubsystemMap::should_gather (this=0x4ad5170, sub=<optimized out>, level=11) at ./log/SubsystemMap.h:63
#1 0x00007f985659b2db in ObjectCacher::flusher_entry (this=0x65a3da0) at osdc/ObjectCacher.cc:1431
#2 0x00007f98565ab4cd in ObjectCacher::FlusherThread::entry (this=<optimized out>) at osdc/ObjectCacher.h:358
#3 0x00007f98d45fa182 in start_thread (arg=0x7f9855d0f700) at pthread_create.c:312
#4 0x00007f98d432730d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111