Backport #17445

closed

jewel: list-snap cache tier missing promotion logic (was: rbd cli segfault when doing diff on an image (cache tiering))

Added by Bart Vanbrabant over 7 years ago. Updated about 6 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
Release:
jewel
Pull request ID:
Crash signature (v1):
Crash signature (v2):


Files

difflog.txt (151 KB) Bart Vanbrabant, 09/30/2016 08:09 AM
core.lzma (514 KB) Bart Vanbrabant, 09/30/2016 01:50 PM
Actions #1

Updated by Jason Dillaman over 7 years ago

  • Status changed from New to Need More Info
  • Priority changed from Normal to High

@Bart: would it be possible for you to install the debug packages (librados2-dbg, librbd1-dbg, ceph-common-dbg) and recreate? Due to address space randomization, I won't be able to determine where this crashed without a resolved backtrace. If you have the coredump, you could use ceph-post-file to submit it to us for analysis (just provide your .deb version information and the unique id assigned to your posting). Thanks.

Actions #2

Updated by Bart Vanbrabant over 7 years ago

Jason Dillaman wrote:

@Bart: would it be possible for you to install the debug packages (librados2-dbg, librbd1-dbg, ceph-common-dbg) and recreate? Due to address space randomization, I won't be able to determine where this crashed without a resolved backtrace. If you have the coredump, you could use ceph-post-file to submit it to us for analysis (just provide your .deb version information and the unique id assigned to your posting). Thanks.

I installed the dbg packages but my backtrace still does not contain symbol names. I attached a compressed core dump to this comment. The core dump is from an Ubuntu 14.04 host with the Mitaka cloud archive. All debs have this version: 10.2.2-0ubuntu0.16.04.2~cloud0

Actions #3

Updated by Jason Dillaman over 7 years ago

#0  0x00007f96d72fa1fb in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
37    ../nptl/sysdeps/unix/sysv/linux/pt-raise.c: No such file or directory.
(gdb) bt
#0  0x00007f96d72fa1fb in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1  0x00007f96ea94f6c5 in reraise_fatal (signum=11) at global/signal_handler.cc:71
#2  handle_fatal_signal (signum=11) at global/signal_handler.cc:133
#3  <signal handler called>
#4  calc_snap_set_diff (cct=<optimized out>, snap_set=..., start=<optimized out>, end=<optimized out>, diff=0x7ffc8ec97ab0, 
    end_size=<optimized out>, end_exists=0x7ffc8ec97a8f) at librados/snap_set_diff.cc:41
#5  0x00007f96e0cb5d81 in compute_diffs (diffs=0x7ffc8ec97aa0, this=0x7f96f49518f0) at librbd/DiffIterate.cc:132
#6  librbd::(anonymous namespace)::C_DiffObject::finish (this=0x7f96f49518f0, r=0) at librbd/DiffIterate.cc:89
#7  0x00007f96e0c9b0f9 in Context::complete (this=0x7f96f49518f0, r=<optimized out>) at ./include/Context.h:64
#8  0x00007f96e0e82013 in OrderedThrottle::complete_pending_ops (this=this@entry=0x7ffc8ec98010) at common/Throttle.cc:550
#9  0x00007f96e0e82721 in OrderedThrottle::start_op (this=0x7ffc8ec98010, on_finish=0x7f96f48ad0a0) at common/Throttle.cc:488
#10 0x00007f96e0cb7645 in send (this=0x7f96f48ad0a0) at librbd/DiffIterate.cc:64
#11 librbd::DiffIterate::execute (this=this@entry=0x7ffc8ec98180) at librbd/DiffIterate.cc:319
#12 0x00007f96e0ce28b0 in librbd::diff_iterate (ictx=0x7f96f490ca50, fromsnapname=fromsnapname@entry=0x0, off=off@entry=0, 
    len=107374182400, include_parent=include_parent@entry=true, whole_object=whole_object@entry=false, 
    cb=0x7f96ea8bcfe0 <rbd::action::diff::diff_cb(uint64_t, size_t, int, void*)>, arg=0x7ffc8ec983e0) at librbd/internal.cc:2664
#13 0x00007f96e0c95d20 in librbd::Image::diff_iterate2 (this=this@entry=0x7ffc8ec98380, fromsnapname=fromsnapname@entry=0x0, 
    ofs=ofs@entry=0, len=<optimized out>, include_parent=include_parent@entry=true, whole_object=whole_object@entry=false, 
    cb=cb@entry=0x7f96ea8bcfe0 <rbd::action::diff::diff_cb(uint64_t, size_t, int, void*)>, arg=arg@entry=0x7ffc8ec983e0)
    at librbd/librbd.cc:1066
#14 0x00007f96ea8bc720 in do_diff (f=0x0, whole_object=false, fromsnapname=0x0, image=...) at tools/rbd/action/Diff.cc:66
#15 rbd::action::diff::execute (vm=...) at tools/rbd/action/Diff.cc:125
#16 0x00007f96ea8a6bb7 in rbd::Shell::execute (this=<optimized out>, cmdline_arguments=...) at tools/rbd/Shell.cc:163
#17 0x00007f96ea8781f2 in main (argc=<optimized out>, argv=<optimized out>) at tools/rbd/rbd.cc:19
Actions #4

Updated by Jason Dillaman over 7 years ago

The "snaps" vector in "librados::clone_info_t" is unexpectedly empty for cloneid 20 and 26.

{
  clones = std::vector of length 3, capacity 4 = {{
      cloneid = 20, 
      snaps = std::vector of length 0, capacity 0, 
      overlap = std::vector of length 0, capacity 0, 
      size = 8388608
    }, {
      cloneid = 26, 
      snaps = std::vector of length 0, capacity 0, 
      overlap = std::vector of length 1, capacity 1 = {{
          first = 499712, 
          second = 548864
        }}, 
      size = 8388608
    }, {
      cloneid = 18446744073709551614, 
      snaps = std::vector of length 0, capacity 0, 
      overlap = std::vector of length 0, capacity 0, 
      size = 8388608
    }}, 
  seq = 26
}
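
For reference, this is fatal because calc_snap_set_diff reads the first and last snapshot id of every non-HEAD clone without checking whether the vector is empty. A minimal sketch of that pattern (simplified types assumed for illustration; not the verbatim jewel source):

#include <cstdint>
#include <iostream>
#include <vector>

// Simplified stand-in for librados::clone_info_t: every non-HEAD clone
// returned by list-snaps is expected to carry at least one snapshot id.
struct clone_info {
  uint64_t cloneid;
  std::vector<uint64_t> snaps;  // snapshot ids this clone exists in
};

int main() {
  clone_info r{20, {}};  // clone 20 from the dump above: snaps unexpectedly empty
  if (r.snaps.empty()) {
    std::cout << "empty snaps vector: snaps[0] would be undefined behaviour\n";
    return 1;
  }
  uint64_t a = r.snaps[0];      // the crashing access (a = r->snaps[0])
  uint64_t b = r.snaps.back();  // b = r->snaps[r->snaps.size()-1]
  std::cout << "clone covers snaps [" << a << ", " << b << "]\n";
  return 0;
}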

@Bart: did this image have two snapshots on it that have since been deleted? I wonder if this is just a race condition between deleting a snapshot and the OSD snap trimmer cleaning up.

Actions #5

Updated by Bart Vanbrabant over 7 years ago

Jason Dillaman wrote:

The "snaps" vector in "librados::clone_info_t" is unexpectedly empty for cloneid 20 and 26.

[...]

@Bart: did this image have two snapshots on it that have since been deleted? I wonder if this is just a race condition between deleting a snapshot and the OSD snap trimmer cleaning up.

There are other images that also have this. The images are managed by OpenStack, so I'm not sure how they are used and what snapshots were there. rbd info gives me this:

rbd info volumes/volume-c1261c3c-b781-4707-b31d-9ac9dd188744
rbd image 'volume-c1261c3c-b781-4707-b31d-9ac9dd188744':
size 102400 MB in 12800 objects
order 23 (8192 kB objects)
block_name_prefix: rbd_data.1bec958e548a6
format: 2
features: layering, striping
flags:
parent: images/18315266-7077-424e-a1fd-341d4f5fbafd@snap
overlap: 2252 MB
stripe unit: 8192 kB
stripe count: 1

rbd snap list shows two snapshots:

rbd snap list volumes/volume-c1261c3c-b781-4707-b31d-9ac9dd188744
SNAPID NAME SIZE
20 snapshot-76556812-02f9-4aa3-a237-e1dc4e18e302 102400 MB
26 snapshot-76b6fcc9-3d62-4c9a-8206-7cbe95a29338 102400 MB

Actions #6

Updated by Jason Dillaman over 7 years ago

  • Project changed from rbd to Ceph
  • Status changed from Need More Info to 12

Moving to core team to verify if the empty "snaps" vector in "librados::clone_info_t" is a newly introduced issue in the OSDs.

Actions #7

Updated by Samuel Just over 7 years ago

What ceph version is running on the osds?

Actions #8

Updated by Bart Vanbrabant over 7 years ago

Samuel Just wrote:

What ceph version is running on the osds?

ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)

Actions #9

Updated by Sage Weil over 7 years ago

  • Priority changed from High to Urgent
Actions #10

Updated by Kjetil Joergensen about 7 years ago

I suspect we're hitting the same.

Thread 1 "rbd.original" received signal SIGSEGV, Segmentation fault.
calc_snap_set_diff (cct=<optimized out>, snap_set=..., start=<optimized out>, end=<optimized out>, diff=0x7fffffffc100, end_size=<optimized out>, end_exists=0x7fffffffc0ac) at librados/snap_set_diff.cc:41
41    in librados/snap_set_diff.cc
(gdb) bt
#0  calc_snap_set_diff (cct=<optimized out>, snap_set=..., start=<optimized out>, end=<optimized out>, diff=0x7fffffffc100, end_size=<optimized out>, end_exists=0x7fffffffc0ac) at librados/snap_set_diff.cc:41
#1  0x00007fffee54d43d in librbd::(anonymous namespace)::C_DiffObject::compute_diffs (diffs=0x7fffffffc0e0, this=0x55555ebf29e0) at librbd/DiffIterate.cc:130
#2  librbd::(anonymous namespace)::C_DiffObject::finish (this=0x55555ebf29e0, r=0) at librbd/DiffIterate.cc:87
#3  0x00007fffee52f7b9 in Context::complete (this=0x55555ebf29e0, r=<optimized out>) at ./include/Context.h:64
#4  0x00007fffee74f5b3 in OrderedThrottle::complete_pending_ops (this=0x7fffffffc760) at common/Throttle.cc:550
#5  0x00007fffee74fdbd in OrderedThrottle::start_op (this=0x7fffffffc760, on_finish=<optimized out>) at common/Throttle.cc:491
#6  0x00007fffee54c5c0 in librbd::(anonymous namespace)::C_DiffObject::send (this=0x55555ec2edd0) at librbd/DiffIterate.cc:62
#7  librbd::DiffIterate::execute (this=this@entry=0x7fffffffc8f0) at librbd/DiffIterate.cc:317
#8  0x00007fffee57f4f4 in librbd::diff_iterate (ictx=ictx@entry=0x55555ebcff60, fromsnapname=fromsnapname@entry=0x0, off=off@entry=0, len=len@entry=74444701696, include_parent=<optimized out>, whole_object=<optimized out>, cb=0x5555556cf0e0 <rbd::action::disk_usage::disk_usage_callback(uint64_t, size_t, int, void*)>, arg=0x7fffffffd4a0)
    at librbd/internal.cc:2760
#9  0x00007fffee51d465 in librbd::Image::diff_iterate2 (this=this@entry=0x7fffffffd490, fromsnapname=fromsnapname@entry=0x0, ofs=ofs@entry=0, len=len@entry=74444701696, include_parent=include_parent@entry=false, whole_object=whole_object@entry=true, cb=0x5555556cf0e0 <rbd::action::disk_usage::disk_usage_callback(uint64_t, size_t, int, void*)>,
    arg=0x7fffffffd4a0) at librbd/librbd.cc:1075
#10 0x00005555556cf477 in rbd::action::disk_usage::compute_image_disk_usage (name="_qa-staging_report1_partial_db", snap_name="2017-01-20.001", from_snap_name="", image=..., size=74444701696, tbl=..., f=0x0, used_size=0x7fffffffd4a0) at tools/rbd/action/DiskUsage.cc:59
#11 0x00005555556d2770 in rbd::action::disk_usage::do_disk_usage (f=<optimized out>, from_snapname=<optimized out>, snapname=0x0, imgname=0x55555eb65720 "_qa-staging_report1_partial_db", io_ctx=..., rbd=...) at tools/rbd/action/DiskUsage.cc:176
#12 rbd::action::disk_usage::execute (vm=...) at tools/rbd/action/DiskUsage.cc:278
#13 0x00005555556b45a8 in rbd::Shell::execute (this=<optimized out>, cmdline_arguments=...) at tools/rbd/Shell.cc:163
#14 0x0000555555683d31 in main (argc=<optimized out>, argv=<optimized out>) at tools/rbd/rbd.cc:19
(gdb) print r
$4 = {cloneid = 1871, snaps = std::vector of length 0, capacity 0, overlap = std::vector of length 0, capacity 0, size = 1798144}

In terms of versioning: the cluster has gone through hammer, jewel (various versions), and jewel v10.2.5. I couldn't tell you which specific version of jewel this was introduced in. Clients/osds/monitors are on v10.2.5, and generally follow in relatively close lock-step.

In terms of deleted snapshots - absolutely: we essentially have a process which snapshots/munges on a timer, and some snapshots end up having images cloned from them. Snapshots will be retired after a while, and cloned images will be retired after a while. ("A while" may be a long duration.)

Actions #11

Updated by Kjetil Joergensen about 7 years ago

This particular snapshot was created on the 20th of January, and I'm relatively certain clients/osds/monitors/etc. were running v10.2.5 at the time.

Actions #12

Updated by Kjetil Joergensen about 7 years ago

FWIW: in our case, the rbd pool is tiered in write-back mode.

Actions #13

Updated by Sage Weil almost 7 years ago

  • Project changed from Ceph to rbd
Actions #14

Updated by Sage Weil almost 7 years ago

  • Project changed from rbd to Ceph
Actions #15

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category set to Snapshots
  • Component(RADOS) OSD, librados added

Note the second reporter confirms this is with cache tiering. Rather suspect that's got more to do with it than snapshots, but not sure since we don't have that data point on the OP.

Actions #16

Updated by Bart Vanbrabant almost 7 years ago

Greg Farnum wrote:

Note the second reporter confirms this is with cache tiering. Rather suspect that's got more to do with it than snapshots, but not sure since we don't have that data point on the OP.

I can confirm that for us this is also on a tiered pool.

Actions #17

Updated by Sage Weil almost 7 years ago

  • Subject changed from Segfault when doing diff on an image to rbd cli segfault when doing diff on an image (cache tiering)
  • Priority changed from Urgent to High
Actions #18

Updated by Xuehan Xu almost 7 years ago

I encountered the same problem.
I debugged a little and found that this might have something to do with the "cache evict" and "list-snaps" operations.

I debugged the core file of the process with gdb and confirmed that the object that caused the segmentation fault is rbd_data.d18d71b948ac7.000000000000062e, just as the following logs indicate:

(gdb) f 4
#4 calc_snap_set_diff (cct=<optimized out>, snap_set=..., start=<optimized out>, end=<optimized out>, diff=0x7ffed23a4640, end_size=<optimized out>, end_exists=0x7ffed23a461f)
at librados/snap_set_diff.cc:41
41        a = r->snaps[0];
(gdb) p r
$1 = {cloneid = 22, snaps = std::vector of length 0, capacity 0, overlap = std::vector of length 2, capacity 2 = {{first = 0, second = 786432}, {first = 1523712, second = 2670592}},
size = 4194304}
(gdb) f 5
#5 0x00007fa87a4359c4 in compute_diffs (diffs=0x7ffed23a4630, this=0x7fa88f196820) at librbd/DiffIterate.cc:130
130 &end_exists);
(gdb) p m_oid
$2 = "rbd_data.d18d71b948ac7.", '0' <repeats 13 times>, "62e"

Then we checked the cache tier osd's log:

2017-07-14 18:27:11.122472 7f91a365f700 10 osd.58.objecter ms_dispatch 0x7f91e2a9c140 osd_op_reply(2877166 rbd_data.d18d71b948ac7.000000000000062e [copy-get max 8388608] v0'0 uv47138 ondisk = 0) v7
2017-07-14 18:27:11.122514 7f91b395d700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164121 (2133'161077,2160'164121] local-les=1977 n=81 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164119 lcod 2160'164120 mlcod 2160'164120 active+clean] process_copy_chunk 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 tid 2877166 (0) Success
2017-07-14 18:27:11.129590 7f91b395d700 10 osd.58.objecter _op_submit oid rbd_data.d18d71b948ac7.000000000000062e '@8' '@8' [assert-version v47138,copy-get max 8388608] tid 2877168 osd.0
2017-07-14 18:27:11.129602 7f91b395d700 1 -- 10.142.121.179:0/24945 --> 10.142.121.142:6824/6246 -- osd_op(osd.58.789:2877168 8.ce3acb8b rbd_data.d18d71b948ac7.000000000000062e [assert-version v47138,copy-get max 8388608] snapc 0=[] ack+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e2160) v7 -- ?+0 0x7f91ee305180 con 0x7f921046e880
2017-07-14 18:27:11.133206 7f91a365f700 1 -- 10.142.121.179:0/24945 <== osd.0 10.142.121.142:6824/6246 149 ==== osd_op_reply(2877168 rbd_data.d18d71b948ac7.000000000000062e [assert-version v47138,copy-get max 8388608] v0'0 uv47138 ondisk = 0) v7 ==== 201+0+119 (2793013310 0 570526743) 0x7f91fb306680 con 0x7f921046e880
2017-07-14 18:27:11.133220 7f91a365f700 10 osd.58.objecter ms_dispatch 0x7f91e2a9c140 osd_op_reply(2877168 rbd_data.d18d71b948ac7.000000000000062e [assert-version v47138,copy-get max 8388608] v0'0 uv47138 ondisk = 0) v7
2017-07-14 18:27:11.133264 7f91b395d700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164121 (2133'161077,2160'164121] local-les=1977 n=81 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164120 lcod 2160'164120 mlcod 2160'164120 active+clean] process_copy_chunk 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 tid 2877168 (0) Success
2017-07-14 18:27:11.133475 7f91b395d700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164121 (2133'161077,2160'164121] local-les=1977 n=81 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164120 lcod 2160'164120 mlcod 2160'164120 active+clean] finish_promote 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 r=0 uv47138
2017-07-14 18:27:11.133495 7f91b395d700 20 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164121 (2133'161077,2160'164121] local-les=1977 n=81 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164120 lcod 2160'164120 mlcod 2160'164120 active+clean] simple_opc_create 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16
2017-07-14 18:27:11.133529 7f91b395d700 20 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164121 (2133'161077,2160'164121] local-les=1977 n=81 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164120 lcod 2160'164120 mlcod 2160'164120 active+clean] finish_ctx 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 0x7f91eb158000 op promote
2017-07-14 18:27:11.133612 7f91b395d700 7 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164121 (2133'161077,2160'164121] local-les=1977 n=82 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164120 lcod 2160'164120 mlcod 2160'164120 active+clean] issue_repop rep_tid 29670336 o 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16
2017-07-14 18:27:11.133722 7f91b395d700 1 -- 10.143.208.51:6802/3024945 --> 10.143.208.16:6819/4176877 -- osd_repop(osd.58.0:29670336 6.38b 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 v 2160'164122) v1 -- ?+676 0x7f91e84e6600 con 0x7f91f58c2580
2017-07-14 18:27:11.133770 7f91b395d700 1 -- 10.143.208.51:6802/3024945 --> 10.143.208.50:6800/2039335 -- osd_repop(osd.58.0:29670336 6.38b 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 v 2160'164122) v1 -- ?+676 0x7f91e84e8a00 con 0x7f91ebf70280
2017-07-14 18:27:11.133959 7f91b395d700 20 snap_mapper.add_oid 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 b,16
.
.
.
.
.
.
.
.
.
.
.
2017-07-14 18:27:14.583134 7f91b215a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164124 (2133'161077,2160'164124] local-les=1977 n=82 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 luod=2160'164122 lua=2160'164122 crt=2160'164121 lcod 2160'164121 mlcod 2160'164121 active+clean] agent_maybe_evict evicting 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16(2160'164122 osd.58.0:29670336 [16,b] data_digest s 4194304 uv 47138 dd f8fc1a50)
2017-07-14 18:27:14.583143 7f91b215a700 20 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164124 (2133'161077,2160'164124] local-les=1977 n=82 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 luod=2160'164122 lua=2160'164122 crt=2160'164121 lcod 2160'164121 mlcod 2160'164121 active+clean] simple_opc_create 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16
2017-07-14 18:27:14.583209 7f91b215a700 20 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164124 (2133'161077,2160'164124] local-les=1977 n=82 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 luod=2160'164122 lua=2160'164122 crt=2160'164121 lcod 2160'164121 mlcod 2160'164121 active+clean] finish_ctx 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 0x7f91edf66000 op delete
2017-07-14 18:27:14.583242 7f91b215a700 7 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164124 (2133'161077,2160'164124] local-les=1977 n=81 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 luod=2160'164122 lua=2160'164122 crt=2160'164121 lcod 2160'164121 mlcod 2160'164121 active+clean] issue_repop rep_tid 29670402 o 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16
2017-07-14 18:27:14.583270 7f91b215a700 1 -- 10.143.208.51:6802/3024945 --> 10.143.208.16:6819/4176877 -- osd_repop(osd.58.0:29670402 6.38b 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 v 2160'164125) v1 -- ?+248 0x7f91e4b6ac00 con 0x7f91f58c2580
2017-07-14 18:27:14.583390 7f91b215a700 1 -- 10.143.208.51:6802/3024945 --> 10.143.208.50:6800/2039335 -- osd_repop(osd.58.0:29670402 6.38b 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 v 2160'164125) v1 -- ?+248 0x7f91e98c1000 con 0x7f91ebf70280
2017-07-14 18:27:14.583399 7f91b215a700 20 snap_mapper.remove_oid 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16
2017-07-14 18:27:14.583562 7f91b215a700 20 snap_mapper.get_snaps 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 b,16
.
.
.
.
.
.
.
.
.
.
2017-07-14 18:28:24.616859 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] do_op osd_op(client.3535109.0:1996 6.ce3acb8b rbd_data.d18d71b948ac7.000000000000062e [list-snaps] snapc 0=[] ack+read+known_if_redirected e2160) v7 may_read -> read-ordered flags ack+read+known_if_redirected
2017-07-14 18:28:24.616876 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] get_object_context: found obc in cache: 0x7f91f94dc280
2017-07-14 18:28:24.616883 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] get_object_context: 0x7f91f94dc280 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:head rwstate(none n=0 w=0) oi: 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:head(2117'157947 client.596209.0:3574705 dirty s 4194304 uv 157947) ssc: 0x7f92032ca140 snapset: 16=[16,b]:[16]+head
2017-07-14 18:28:24.616894 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] find_object_context 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:snapdir @snapdir oi=6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:head(2117'157947 client.596209.0:3574705 dirty s 4194304 uv 157947)
2017-07-14 18:28:24.616905 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] agent_choose_mode flush_mode: idle evict_mode: idle num_objects: 76 num_bytes: 296751474 num_objects_dirty: 38 num_objects_omap: 0 num_dirty: 38 num_user_objects: 72 num_user_bytes: 296751104 pool.info.target_max_bytes: 400000000000 pool.info.target_max_objects: 1000000
2017-07-14 18:28:24.616912 7f91ba16a700 20 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] agent_choose_mode dirty 0.400943 full 0.759682
2017-07-14 18:28:24.616925 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] get_object_context: found obc in cache: 0x7f91f94dc500
2017-07-14 18:28:24.616931 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] get_object_context: 0x7f91f94dc500 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 rwstate(none n=0 w=0) oi: 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16(0'0 unknown.0.0:0 [] s 0 uv 0) ssc: 0x7f92032ca140 snapset: 16=[16,b]:[16]+head
2017-07-14 18:28:24.616941 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] clone_oid 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16 obc 0x7f91f94dc500
2017-07-14 18:28:24.617015 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] execute_ctx 0x7f91e323ea00
2017-07-14 18:28:24.617023 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] do_op 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:head [list-snaps] ov 2117'157947
2017-07-14 18:28:24.617030 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] taking ondisk_read_lock
2017-07-14 18:28:24.617036 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] taking ondisk_read_lock for src 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16
2017-07-14 18:28:24.617053 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] do_osd_op 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:head [list-snaps]
2017-07-14 18:28:24.617060 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] do_osd_op list-snaps
2017-07-14 18:28:24.617071 7f91ba16a700 20 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] clone 16 snaps []
2017-07-14 18:28:24.617079 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] dropping ondisk_read_lock
2017-07-14 18:28:24.617182 7f91ba16a700 10 osd.58 pg_epoch: 2160 pg[6.38b( v 2160'164136 (2133'161077,2160'164136] local-les=1977 n=76 ec=279 les/c/f 1977/1977/0 1975/1976/789) [58,46,35] r=0 lpr=1976 crt=2160'164134 lcod 2160'164135 mlcod 2160'164135 active+clean] dropping ondisk_read_lock for src 6:d1d35c73:::rbd_data.d18d71b948ac7.000000000000062e:16

It showed that rbd_data.d18d71b948ac7.000000000000062e:16 got promoted at about 2017-07-14 18:27:11, at which time the "snaps" field of its object context was still [b,16]. Then, at about 2017-07-14 18:27:14, it got evicted.
And then, at about 2017-07-14 18:28:24, a "list-snaps" request came in and got empty snaps for rbd_data.d18d71b948ac7.000000000000062e:16.

I read the cache tier source code of the jewel version and found that when an object gets evicted from the cache, its obc isn't removed from "object_contexts"; only its obs.oi is emptied. When a "list-snaps" request comes in, the "obs.oi" field isn't checked, and the "snaps" field of the "list-snaps" reply is taken from "obc->obs.oi.snaps". So I think that when a "list-snaps" request comes in right after its target object's clones have been evicted from the cache, while their obc hasn't yet been removed from "object_contexts", an empty "snaps" for those clones is returned to the client.

Could this be right?
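
To illustrate the sequence described above, here is a heavily reduced sketch of the suspected pattern (hypothetical types and function names; the real jewel code lives in the ReplicatedPG cache-tier eviction and list-snaps paths):

#include <cstdint>
#include <vector>

// Reduced stand-ins for object_info_t / ObjectState / ObjectContext.
struct object_info_t {
  std::vector<uint64_t> snaps;  // per-clone snapshot ids
};

struct ObjectState {
  object_info_t oi;
  bool exists = true;
};

struct ObjectContext {
  ObjectState obs;
};

// Cache-tier eviction as described above: the obc stays registered in
// object_contexts, but its obs.oi is emptied.
void evict_clone(ObjectContext &obc) {
  obc.obs.oi = object_info_t{};
  obc.obs.exists = false;
}

// list-snaps reply for one clone: the snaps come straight from
// obc->obs.oi.snaps with no check of obs.oi, so a clone that was just
// evicted reports an empty snaps vector instead of being promoted or
// read from the base tier first.
std::vector<uint64_t> list_snaps_for_clone(const ObjectContext &obc) {
  return obc.obs.oi.snaps;
}

int main() {
  ObjectContext obc;
  obc.obs.oi.snaps = {0xb, 0x16};  // [b,16], as seen at promotion time
  evict_clone(obc);
  // After eviction the cached obc yields no snaps for the clone.
  return list_snaps_for_clone(obc).empty() ? 0 : 1;
}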

Actions #20

Updated by Jason Dillaman almost 7 years ago

Definitely sounds like it could be the root cause to me. Thanks for the investigation help.

Actions #21

Updated by Xuehan Xu almost 7 years ago

It seems that the ReplicatedPG::do_op code in the "master" branch has been completely restructured, so I submitted a pull request directly against the jewel branch: https://github.com/ceph/ceph/pull/16355

Actions #22

Updated by Kefu Chai almost 7 years ago

  • Status changed from 12 to Fix Under Review
  • Release set to jewel
Actions #23

Updated by Nathan Cutler almost 7 years ago

Thanks for the jewel-specific fix. Has the bug been declared fixed in master, though?

Actions #24

Updated by Xuehan Xu almost 7 years ago

Sorry, as the related source code has been restructured and I haven't tested this on the master branch, I can't judge whether this bug exists in master.

Actions #25

Updated by Xuehan Xu almost 7 years ago

But I'm working on it.

Actions #26

Updated by Xuehan Xu almost 7 years ago

Uh, I don't think the master branch has this problem. Since the "list-snaps" result has been moved from ObjectContext::obs.oi.snaps to ObjectContext::ssc.snapset.clone_snaps, and cache eviction only empties ObjectContext::obs.oi, I think that even if "list-snaps" gets an object context whose oi field is empty, its snapset is still valid.
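
As a rough contrast, a similarly reduced sketch of the master-style layout described above (hypothetical types; the clone-to-snaps mapping lives on the SnapSet of the head/snapdir object, so it is unaffected by eviction of individual clones):

#include <cstdint>
#include <map>
#include <vector>

// Reduced stand-ins for SnapSet / SnapSetContext.
struct SnapSet {
  std::map<uint64_t, std::vector<uint64_t>> clone_snaps;  // clone id -> snaps
};

struct SnapSetContext {
  SnapSet snapset;
};

// list-snaps in master (sketch): the reply is built from the snapset,
// which belongs to the head object rather than to each clone's oi.
std::vector<uint64_t> list_snaps_for_clone(const SnapSetContext &ssc,
                                           uint64_t cloneid) {
  auto it = ssc.snapset.clone_snaps.find(cloneid);
  return it == ssc.snapset.clone_snaps.end() ? std::vector<uint64_t>{}
                                             : it->second;
}

int main() {
  SnapSetContext ssc;
  ssc.snapset.clone_snaps[0x16] = {0xb, 0x16};
  return list_snaps_for_clone(ssc, 0x16).empty() ? 1 : 0;
}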

Actions #27

Updated by Greg Farnum almost 7 years ago

  • Subject changed from rbd cli segfault when doing diff on an image (cache tiering) to jewel: list-snap cache tier missing promotion logic (was: rbd cli segfault when doing diff on an image (cache tiering))
  • Release set to kraken
Actions #28

Updated by Xuehan Xu over 6 years ago

Hi, everyone.

I've just added a new list-snaps test, #17017, which can test whether this problem exists in the master branch. The result shows that master doesn't have the same bug.

Sorry for taking so long. I've been very busy in the past month. Please forgive me.

Actions #30

Updated by Nathan Cutler over 6 years ago

  • Tracker changed from Bug to Backport
  • Description updated (diff)
  • Status changed from Fix Under Review to In Progress
  • Priority changed from High to Normal

description

In our Ceph cluster some rbd images (created by OpenStack) make rbd segfault. This is on Ubuntu 14.04 with the Mitaka cloud archive. There are some Xenial (16.04) machines in the cluster with the same Ceph version; on these machines rbd also segfaults.

$ rbd diff volumes/volume-c1261c3c-b781-4707-b31d-9ac9dd188744
    *** Caught signal (Segmentation fault) **
    in thread 7fc76755dd80 thread_name:rbd
    ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
    1: (()+0x1f8602) [0x7fc7671c6602]
    2: (()+0x10330) [0x7fc753b71330]
    3: (()+0x4de0c8) [0x7fc75d9920c8]
    4: (()+0x78d81) [0x7fc75d52cd81]
    5: (()+0x5e0f9) [0x7fc75d5120f9]
    6: (()+0x245013) [0x7fc75d6f9013]
    7: (()+0x2457bc) [0x7fc75d6f97bc]
    8: (()+0x7a645) [0x7fc75d52e645]
    9: (()+0xa58b0) [0x7fc75d5598b0]
    10: (rbd::action::diff::execute(boost::program_options::variables_map const&)+0x2e0) [0x7fc767133720]
    11: (rbd::Shell::execute(std::vector<char const*, std::allocator<char const*> > const&)+0x857) [0x7fc76711dbb7]
    12: (main()+0x62) [0x7fc7670ef1f2]
    13: (__libc_start_main()+0xf5) [0x7fc751435f45]
    14: (()+0x13a937) [0x7fc767108937]
    2016-09-30 10:03:25.448690 7fc76755dd80 -1 *** Caught signal (Segmentation fault) **
    in thread 7fc76755dd80 thread_name:rbd
ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
1: (()+0x1f8602) [0x7fc7671c6602]
2: (()+0x10330) [0x7fc753b71330]
3: (()+0x4de0c8) [0x7fc75d9920c8]
4: (()+0x78d81) [0x7fc75d52cd81]
5: (()+0x5e0f9) [0x7fc75d5120f9]
6: (()+0x245013) [0x7fc75d6f9013]
7: (()+0x2457bc) [0x7fc75d6f97bc]
8: (()+0x7a645) [0x7fc75d52e645]
9: (()+0xa58b0) [0x7fc75d5598b0]
10: (rbd::action::diff::execute(boost::program_options::variables_map const&)+0x2e0) [0x7fc767133720]
11: (rbd::Shell::execute(std::vector<char const*, std::allocator<char const*> > const&)+0x857) [0x7fc76711dbb7]
12: (main()+0x62) [0x7fc7670ef1f2]
13: (__libc_start_main()+0xf5) [0x7fc751435f45]
14: (()+0x13a937) [0x7fc767108937]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

The objdump output is too large for this bug tracker; I can provide it by email. The full segfault message is attached to this issue.

Actions #31

Updated by Nathan Cutler over 6 years ago

  • Release deleted (kraken)
Actions #32

Updated by Nathan Cutler over 6 years ago

Note: bug is not present in master, as demonstrated by https://github.com/ceph/ceph/pull/17017

Actions #34

Updated by Nathan Cutler over 6 years ago

  • Description updated (diff)
Actions #35

Updated by Nathan Cutler over 6 years ago

  • Assignee set to Sage Weil
Actions #36

Updated by Jason Dillaman over 6 years ago

Actions #37

Updated by Nathan Cutler about 6 years ago

  • Status changed from In Progress to Resolved
  • Target version set to v10.2.11