Bug #11895 (closed)

Core dump while trying to look at an image whose parent is in a pool to which the user does not have access

Added by Sam Matzek almost 9 years ago. Updated almost 9 years ago.

Status:
Duplicate
Priority:
High
Assignee:
-
Target version:
-
% Done:
0%
Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rbd
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Core dump while trying to look at an image whose parent is in a pool to which the user does not have access.
Affected Ceph version: 0.80.7.

I hit this error while testing the RBD snapshot of VM ephemeral disk code; I had accidentally set the Ceph 'glance' user to have access only to the vms pool and no access to the images pool.

When Glance hits this it kills the Glance worker process. In the unlikely scenario where someone has images with parents in other pools, and Glance has access to the pool an image is in but not to its parent's pool, this could be used as a denial-of-service attack against Glance. A user could repeatedly create or update a Glance image, specifying the RBD location URL for it, to keep killing all of Glance's worker processes. It may also be possible to do this fast enough that Glance keeps spinning up replacement worker processes and eventually fills the system's process table.

I recreated this outside of OpenStack to narrow down the exact scenario and symptom. The standalone Python program used in the recreation steps below is attached to this bug.

Note that sometimes when I ran the recreation program it did not dump the native stack but simply printed 'Segmentation Fault'. The stack shown here is how it normally failed for me.
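
Since the attachment isn't inlined here, below is a minimal sketch of what testPerms.py likely does, assuming the standard python-rados/python-rbd bindings; the conffile path and the 'glance' client name are assumptions for illustration, not taken from the attachment.

    import sys

    import rados
    import rbd

    # Arguments: pool name, image name, and an optional snapshot name.
    pool_name, image_name = sys.argv[1], sys.argv[2]
    snap_name = sys.argv[3] if len(sys.argv) > 3 else None
    print('Using pool %s, image %s, snapshot %s' % (pool_name, image_name, snap_name))

    # Connect as the restricted client; '/etc/ceph/ceph.conf' and the
    # client name 'glance' are assumptions.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf', rados_id='glance')
    cluster.connect()
    ioctx = cluster.open_ioctx(pool_name)
    try:
        # With 0.80.7, opening a clone whose parent pool is inaccessible
        # aborts the whole process inside librbd's open_parent() path.
        image = rbd.Image(ioctx, image_name, snapshot=snap_name)
        image.close()
        print('made it to the end')
    finally:
        ioctx.close()
        cluster.shutdown()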

Recreation steps:
  1. rbd -p vms ls --long
    NAME SIZE PARENT FMT PROT LOCK
    89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk 5120M images/5f428789-8a98-4407-b9a7-4a82d9f36c08@snap 2
    89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk@013f37c0413d4493bae831c26a3d1b34_to_be_deleted_by_glance 5120M images/5f428789-8a98-4407-b9a7-4a82d9f36c08@snap 2 yes
    89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk@58ad872dad7c4f368bee3c450ce82703_to_be_deleted_by_glance 5120M images/5f428789-8a98-4407-b9a7-4a82d9f36c08@snap 2 yes
    89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk_clone_013f37c0413d4493bae831c26a3d1b34 5120M vms/89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk@013f37c0413d4493bae831c26a3d1b34_to_be_deleted_by_glance 2
    89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk_clone_013f37c0413d4493bae831c26a3d1b34@snap 5120M vms/89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk@013f37c0413d4493bae831c26a3d1b34_to_be_deleted_by_glance 2 yes
    89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk_clone_58ad872dad7c4f368bee3c450ce82703 5120M vms/89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk@58ad872dad7c4f368bee3c450ce82703_to_be_deleted_by_glance 2
    89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk_clone_58ad872dad7c4f368bee3c450ce82703@snap 5120M vms/89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk@58ad872dad7c4f368bee3c450ce82703_to_be_deleted_by_glance 2 yes
  2. ceph auth caps client.glance mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=vms'
    updated caps for client.glance
  3. python testPerms.py vms 89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk_clone_013f37c0413d4493bae831c26a3d1b34 snap
    Using pool vms, image 89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk_clone_013f37c0413d4493bae831c26a3d1b34, snapshot snap
    ./log/SubsystemMap.h: In function 'bool ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread 3fff9d1b5e40 time 2015-06-05 17:36:20.909427
    ./log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
    ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
    1: (()+0x42344) [0x3fff90272344]
    2: /lib64/librbd.so.1(_ZN6librbd11close_imageEPNS_8ImageCtxE-0xcb2ec) [0x3fff902965a4]
    3: /lib64/librbd.so.1(_ZN6librbd11open_parentEPNS_8ImageCtxE-0xc5ab4) [0x3fff9029bedc]
    4: /lib64/librbd.so.1(_ZN6librbd14refresh_parentEPNS_8ImageCtxE-0xc54d8) [0x3fff9029c4c8]
    5: /lib64/librbd.so.1(_ZN6librbd12ictx_refreshEPNS_8ImageCtxE-0xd3518) [0x3fff9028e208]
    6: /lib64/librbd.so.1(_ZN6librbd10open_imageEPNS_8ImageCtxE-0xc65b0) [0x3fff9029b3d0]
    7: /lib64/librbd.so.1(_ZN6librbd11open_parentEPNS_8ImageCtxE-0xc6010) [0x3fff9029b980]
    8: /lib64/librbd.so.1(_ZN6librbd14refresh_parentEPNS_8ImageCtxE-0xc54d8) [0x3fff9029c4c8]
    9: /lib64/librbd.so.1(_ZN6librbd12ictx_refreshEPNS_8ImageCtxE-0xd3518) [0x3fff9028e208]
    10: /lib64/librbd.so.1(_ZN6librbd10open_imageEPNS_8ImageCtxE-0xc65b0) [0x3fff9029b3d0]
    11: /lib64/librbd.so.1(rbd_open-0xea0b0) [0x3fff902765c0]
    12: (()+0x80cc) [0x3fff960080cc]
    13: /lib64/libffi.so.6(ffi_call-0x18a80) [0x3fff960077e8]
    14: /usr/lib64/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc-0x28bfc) [0x3fff9604b0b4]
    15: (()+0xd994) [0x3fff9603d994]
    16: /lib64/libpython2.7.so.1.0(PyObject_Call-0x22570c) [0x3fff9cef4cf4]
    17: /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx-0x129680) [0x3fff9cff8630]
    18: /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx-0x1249fc) [0x3fff9cffd2d4]
    19: (()+0xd0654) [0x3fff9cf40654]
    20: /lib64/libpython2.7.so.1.0(PyObject_Call-0x22570c) [0x3fff9cef4cf4]
    21: (()+0x9aa90) [0x3fff9cf0aa90]
    22: /lib64/libpython2.7.so.1.0(PyObject_Call-0x22570c) [0x3fff9cef4cf4]
    23: (()+0x127d5c) [0x3fff9cf97d5c]
    24: (()+0x125d68) [0x3fff9cf95d68]
    25: /lib64/libpython2.7.so.1.0(PyObject_Call-0x22570c) [0x3fff9cef4cf4]
    26: /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx-0x129680) [0x3fff9cff8630]
    27: /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx-0x1249fc) [0x3fff9cffd2d4]
    28: /lib64/libpython2.7.so.1.0(PyEval_EvalCode-0x1248f4) [0x3fff9cffd3ec]
    29: /lib64/libpython2.7.so.1.0(PyRun_FileExFlags-0xf18f0) [0x3fff9d031350]
    30: /lib64/libpython2.7.so.1.0(PyRun_SimpleFileExFlags-0xef7ec) [0x3fff9d033574]
    31: /lib64/libpython2.7.so.1.0(PyRun_AnyFileExFlags-0xef0a4) [0x3fff9d033cec]
    32: /lib64/libpython2.7.so.1.0(Py_Main-0xd424c) [0x3fff9d04f514]
    33: python(main-0x1f958) [0x10000720]
    34: (()+0x444ec) [0x3fff9cb344ec]
    35: /lib64/libc.so.6(__libc_start_main-0x19ceb4) [0x3fff9cb34714]
    NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
    terminate called after throwing an instance of 'ceph::FailedAssertion'
    Aborted
  4. python testPerms.py vms 89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk
    Using pool vms, image 89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk, snapshot None
    ./log/SubsystemMap.h: In function 'bool ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread 3fff83775e40 time 2015-06-05 17:36:29.885671
    ./log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
    ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
    1: (()+0x42344) [0x3fff71702344]
    2: /lib64/librbd.so.1(_ZN6librbd11close_imageEPNS_8ImageCtxE-0xcb2ec) [0x3fff717265a4]
    3: /lib64/librbd.so.1(_ZN6librbd11open_parentEPNS_8ImageCtxE-0xc5ab4) [0x3fff7172bedc]
    4: /lib64/librbd.so.1(_ZN6librbd14refresh_parentEPNS_8ImageCtxE-0xc54d8) [0x3fff7172c4c8]
    5: /lib64/librbd.so.1(_ZN6librbd12ictx_refreshEPNS_8ImageCtxE-0xd3518) [0x3fff7171e208]
    6: /lib64/librbd.so.1(_ZN6librbd10open_imageEPNS_8ImageCtxE-0xc65b0) [0x3fff7172b3d0]
    7: /lib64/librbd.so.1(rbd_open-0xea0b0) [0x3fff717065c0]
    8: (()+0x80cc) [0x3fff7c5c80cc]
    9: /lib64/libffi.so.6(ffi_call-0x18a80) [0x3fff7c5c77e8]
    10: /usr/lib64/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc-0x28bfc) [0x3fff7c60b0b4]
    11: (()+0xd994) [0x3fff7c5fd994]
    12: /lib64/libpython2.7.so.1.0(PyObject_Call-0x22570c) [0x3fff834b4cf4]
    13: /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx-0x129680) [0x3fff835b8630]
    14: /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx-0x1249fc) [0x3fff835bd2d4]
    15: (()+0xd0654) [0x3fff83500654]
    16: /lib64/libpython2.7.so.1.0(PyObject_Call-0x22570c) [0x3fff834b4cf4]
    17: (()+0x9aa90) [0x3fff834caa90]
    18: /lib64/libpython2.7.so.1.0(PyObject_Call-0x22570c) [0x3fff834b4cf4]
    19: (()+0x127d5c) [0x3fff83557d5c]
    20: (()+0x125d68) [0x3fff83555d68]
    21: /lib64/libpython2.7.so.1.0(PyObject_Call-0x22570c) [0x3fff834b4cf4]
    22: /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx-0x129680) [0x3fff835b8630]
    23: /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx-0x1249fc) [0x3fff835bd2d4]
    24: /lib64/libpython2.7.so.1.0(PyEval_EvalCode-0x1248f4) [0x3fff835bd3ec]
    25: /lib64/libpython2.7.so.1.0(PyRun_FileExFlags-0xf18f0) [0x3fff835f1350]
    26: /lib64/libpython2.7.so.1.0(PyRun_SimpleFileExFlags-0xef7ec) [0x3fff835f3574]
    27: /lib64/libpython2.7.so.1.0(PyRun_AnyFileExFlags-0xef0a4) [0x3fff835f3cec]
    28: /lib64/libpython2.7.so.1.0(Py_Main-0xd424c) [0x3fff8360f514]
    29: python(main-0x1f958) [0x10000720]
    30: (()+0x444ec) [0x3fff830f44ec]
    31: /lib64/libc.so.6(__libc_start_main-0x19ceb4) [0x3fff830f4714]
    NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
    terminate called after throwing an instance of 'ceph::FailedAssertion'
    Aborted
  5. ceph auth caps client.glance mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=vms, allow rwx pool=images'
    updated caps for client.glance
  6. python testPerms.py vms 89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk
    Using pool vms, image 89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk, snapshot None
    made it to the end
  7. python testPerms.py vms 89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk_clone_013f37c0413d4493bae831c26a3d1b34 snap
    Using pool vms, image 89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk_clone_013f37c0413d4493bae831c26a3d1b34, snapshot snap
    made it to the end

Files

testPerms.py (512 Bytes), Sam Matzek, 06/05/2015 02:59 PM

Related issues (1): 0 open, 1 closed

Is duplicate of rbd - Bug #10030: Crash when attempting to open non-existent parent image (Resolved, Jason Dillaman, 11/07/2014)

Actions #1

Updated by Sage Weil almost 9 years ago

  • Priority changed from Normal to High
Actions #2

Updated by Jason Dillaman almost 9 years ago

  • Status changed from New to Need More Info

I believe this issue has since been addressed by #10030 in the v0.80.8 release of Firefly. Do you still see the issue on the most recent version of Firefly?

Actions #3

Updated by Sam Matzek almost 9 years ago

I'm currently on RHEL 7.0 ppc64 BE and I don't believe I can get v0.80.8 there. I may try installing the Fedora ppc64 0.80.8 build from http://ppc.koji.fedoraproject.org/koji/buildinfo?buildID=297951, but as that could be very disruptive to my cluster I can't do that in the next few days.

Is this a client-side fix only, such that I could try 0.80.8 on a client and leave my mon and OSDs where they are? If so, I could more easily attempt a recreation using 0.80.8.

Actions #4

Updated by Jason Dillaman almost 9 years ago

It's a client-only fix. I would skip v0.80.8, since it contained a performance regression for RBD, and test against v0.80.9: http://ppc.koji.fedoraproject.org/koji/buildinfo?buildID=298878

Actions #5

Updated by Sam Matzek almost 9 years ago

I tried putting 0.80.9 on RHEL 7.1 ppc64 using the Fedora RPMs above, but it pulled in too many package dependencies that couldn't be resolved and conflicted with the RHEL packages. So unfortunately I can't try out v0.80.9 right now.

Actions #6

Updated by Sam Matzek almost 9 years ago

I recreated this issue on RHEL 7.1 x86 with ceph 0.80.7.
I then upgraded ceph to 0.80.9.

The issue did not reproduce with 0.80.9; instead it properly failed with an rbd exception like this:
rbd.PermissionError: error opening image 89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk at snapshot None
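
For reference, with this fixed behavior a caller such as a Glance worker can trap the failure instead of dying; a minimal sketch using the python bindings (again, the conffile path and the 'glance' client name are assumptions):

    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf', rados_id='glance')
    cluster.connect()
    ioctx = cluster.open_ioctx('vms')
    try:
        try:
            image = rbd.Image(ioctx, '89026895-c8b1-42c5-b3f6-fb40b3e8202f_disk')
            image.close()
        except rbd.PermissionError as e:
            # librbd now returns an error instead of asserting, so the
            # process survives and can report the failure.
            print('open failed cleanly: %s' % e)
    finally:
        ioctx.close()
        cluster.shutdown()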

I think this bug can be closed as a duplicate / already fixed in 0.80.9.

Thanks again.

Actions #7

Updated by Josh Durgin almost 9 years ago

  • Status changed from Need More Info to Duplicate

Thanks for checking!
