Bug #22253

"rbd info" crashed: stack smashing detected

Added by Sebastian Wagner almost 3 years ago. Updated almost 3 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Target version:
-
% Done:
0%
Source:
Community (dev)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

Environment: quite small vstart cluster.
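
For reference, a vstart cluster like this is normally brought up from the ceph build directory roughly as follows; the daemon counts and flags here are illustrative assumptions, not taken from this report:

$ cd ceph/build
$ MON=1 OSD=3 MGR=1 ../src/vstart.sh -n -d -x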

This is the stack trace:

#3  0x00007fffed44711c in __GI___fortify_fail (msg=<optimized out>, msg@entry=0x7fffed4bd441 "stack smashing detected") at fortify_fail.c:37
#4  0x00007fffed4470c0 in __stack_chk_fail () at stack_chk_fail.c:28
#5  0x00007ffff78f0beb in librbd::ImageCtx::perf_start (this=this@entry=0x555555b7bf70, name="librbd-8c39e2ae8944a-rbd-huge2") at /home/sebastian/Repos/ceph/src/librbd/ImageCtx.cc:397
#6  0x00007ffff78f3cb4 in librbd::ImageCtx::init (this=0x555555b7bf70) at /home/sebastian/Repos/ceph/src/librbd/ImageCtx.cc:275
#7  0x00007ffff799dacd in librbd::image::OpenRequest<librbd::ImageCtx>::send_register_watch (this=this@entry=0x555555b7fe60) at /home/sebastian/Repos/ceph/src/librbd/image/OpenRequest.cc:477
#8  0x00007ffff79a3102 in librbd::image::OpenRequest<librbd::ImageCtx>::handle_v2_apply_metadata (this=this@entry=0x555555b7fe60, result=result@entry=0x7fffb77fa374) at /home/sebastian/Repos/ceph/src/librbd/image/OpenRequest.cc:471
#9  0x00007ffff79a351f in librbd::util::detail::rados_state_callback<librbd::image::OpenRequest<librbd::ImageCtx>, &librbd::image::OpenRequest<librbd::ImageCtx>::handle_v2_apply_metadata, true> (c=<optimized out>, arg=0x555555b7fe60) at /home/sebastian/Repos/ceph/src/librbd/Utils.h:39
#10 0x00007ffff75d678d in librados::C_AioComplete::finish (this=0x7fffd0001b60, r=<optimized out>) at /home/sebastian/Repos/ceph/src/librados/AioCompletionImpl.h:169
#11 0x0000555555613949 in Context::complete (this=0x7fffd0001b60, r=<optimized out>) at /home/sebastian/Repos/ceph/src/include/Context.h:70
#12 0x00007fffeeab6010 in Finisher::finisher_thread_entry (this=0x555555acb3e8) at /home/sebastian/Repos/ceph/src/common/Finisher.cc:72
#13 0x00007fffee3a86ba in start_thread (arg=0x7fffb77fe700) at pthread_create.c:333
#14 0x00007fffed4353dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
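
Judging from the attached gdb_session.log, the trace was captured under gdb. A minimal way to reproduce such a backtrace, assuming the affected image is named huge2 in the default rbd pool (inferred from the perf counter name librbd-...-rbd-huge2 in frame #5), would be:

$ gdb --args rbd info huge2
(gdb) run
(gdb) bt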

gdb_session.log (55.5 KB) Sebastian Wagner, 11/27/2017 02:31 PM

rbd_info_out.log (59.7 KB) Sebastian Wagner, 11/27/2017 02:33 PM

client.admin.29881.log (58.7 KB) Sebastian Wagner, 11/27/2017 02:34 PM

rbd_ls_out.log (19.4 KB) Sebastian Wagner, 11/27/2017 02:35 PM

ceph.conf (5.66 KB) Sebastian Wagner, 11/27/2017 02:37 PM

valgrind_out.log (18.8 KB) Sebastian Wagner, 11/27/2017 03:29 PM

History

#1 Updated by Joao Eduardo Luis almost 3 years ago

  • Description updated (diff)

#2 Updated by Sebastian Wagner almost 3 years ago

My exact version number is:

$ git describe
v12.2.0-1124-g5e519ae

#3 Updated by Jason Dillaman almost 3 years ago

  • Status changed from New to Need More Info

@Sebastian: please retest on the latest available version. Your line numbers do not align with v12.2.0.

#4 Updated by Sebastian Wagner almost 3 years ago

I don't think it will be easy to reproduce, because:

  • That RBD is untouched: no data was ever written to it, no snapshots were created, it was never resized, etc.
  • I have restarted the cluster many times since this RBD was created.

#5 Updated by Jason Dillaman almost 3 years ago

@Sebastian: without line numbers that actually align to the code at the mentioned version, there really isn't much I can do to assist. Perhaps attempt to run it through "valgrind --tool=memcheck rbd XYZ".
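
A concrete invocation along those lines, assuming the image name from the backtrace and writing the output to a log file, could be:

$ valgrind --tool=memcheck --log-file=valgrind_out.log rbd info huge2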

#6 Updated by Sebastian Wagner almost 3 years ago

Added valgrind output.

@Jason, should I recompile and retest on the latest luminous branch or on v12.2.1?

#7 Updated by Jason Dillaman almost 3 years ago

@Sebastian: that Valgrind output doesn't help since it failed on an "unknown instruction" error. Can you reproduce on distro or Ceph-provided packages instead of your home-grown build?

#8 Updated by Sebastian Wagner almost 3 years ago

Jason Dillaman wrote:

Can you reproduce on distro or Ceph-provided packages instead of your home-grown build?

If there is documentation on how to run a cluster created by vstart.sh with distro packages, I'd give it a try.

#9 Updated by Jason Dillaman almost 3 years ago

@Sebastian: vstart is for development. Just install the Ceph client packages on a VM, copy the vstart-generated ceph.conf to that host, and retest.
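
Sketched concretely on the VM (the package name, host name, and paths are assumptions; the vstart-generated admin keyring would also need to be copied so the client can authenticate):

$ sudo apt install ceph-common                          # provides the rbd CLI; or: sudo yum install ceph-common
$ scp devhost:~/Repos/ceph/build/ceph.conf /etc/ceph/ceph.conf
$ scp devhost:~/Repos/ceph/build/keyring /etc/ceph/keyring
$ rbd info huge2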

#10 Updated by Sebastian Wagner almost 3 years ago

Jason Dillaman wrote:

@Sebastian: vstart is for development. Just install the Ceph client packages on a VM, copy the vstart-generated ceph.conf to that host, and retest.

OK, but I also have to make sure the daemons (e.g. the OSDs) keep using the existing data.

#11 Updated by Jason Dillaman almost 3 years ago

@Sebastian: yup, that's why you would copy the "ceph.conf" so that the VM can connect to your vstart-created cluster.

#12 Updated by Sebastian Wagner almost 3 years ago

So, I recompiled 12.2.1 and can no longer reproduce this one. It seems to be gone now.

#13 Updated by Jason Dillaman almost 3 years ago

  • Status changed from Need More Info to Can't reproduce
