Bug #16211

Some rbd images inaccessible after upgrade to jewel (error reading immutable metadata)

Added by David Hedberg almost 8 years ago. Updated almost 6 years ago.

Status:
Won't Fix
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We're running a small cluster with three nodes. Each node has one monitor and three osds running with XFS formatted disks and a SSD backed journal (~1GB each).

I recently upgraded these nodes from (Ubuntu) 12.04/0.94.6-1precise to 12.04/0.94.7-1precise, then to 14.04/0.94.7-1trusty, and finally to 14.04/10.2.1-1trusty, which left me with a bunch of inaccessible rbd images. "rbd info" fails on 16 out of ~80 images (some of them presumably due to their unreadable parent).

Example:
-------

# rbd info ubuntu1204_base
2016-06-09 10:16:26.376164 7ffb8b3bb700 -1 librbd::image::OpenRequest: failed to retreive immutable metadata: (2) No such file or directory
rbd: error opening image ubuntu1204_base: (2) No such file or directory

# rbd info ubuntu1204_base --debug-ms 1
...
2016-06-09 10:16:57.720301 7fa6a2dfb700  1 -- 10.56.5.19:0/1307685403 <== osd.4 10.56.5.19:6802/5976 1 ==== osd_op_reply(3 rbd_header.4080512ae8944a [call,call] v0'0 uv156486030 ondisk = -2 ((2) No such file or directory)) v7 ==== 187+0+0 (1510277227 0 0) 0x7fa680000a60 con 0x7fa6880060e0
...

# rados -p rbd listomapvals rbd_header.4080512ae8944a
(no output)

Details:
-------

I suspect the upgrade to jewel is the source of the problem. All upgrades were rolling, performed by installing the new packages, restarting all the monitors one by one and then restarting all the osds, node by node. osd noout was set throughout all the upgrades. When upgrading from 12.04 to 14.04 I installed the trusty packages after finishing the distribution upgrade, before rebooting the servers.

(When upgrading from 0.94.6-1precise to 0.94.7-1precise I hit a presumably unrelated bug that kept segfaulting the monitors. This seems to have been related to an old version of radosgw running on another server. The monitors went stable when I turned it off, but now radosgw itself seems to segfault in the same way and won't start. I have not looked into this further.)

When upgrading to jewel there was a significant period of time when both hammer and jewel osds were active, as I took some time to run chown on the files. I don't know when the problem started, but I think that the number of inaccessible rbd images went up during this procedure.

The images are all used in virtual machines (qemu-kvm) and the clients should all be firefly, hammer or jewel.

The tunables were set to argonaut (if I understood the output correctly), but I later updated them to firefly. I did this after the problem had occurred, however.

Issue #15561 might be related.


Files

rbd_debug.txt (6.88 KB) rbd_debug.txt rbd info ubuntu1204_base --debug-rbd=20 --debug-ms=1 David Hedberg, 06/09/2016 01:04 PM
emptyomapvals.txt (2.64 KB) emptyomapvals.txt David Hedberg, 06/09/2016 01:44 PM
Actions #1

Updated by Jason Dillaman almost 8 years ago

  • Status changed from New to Need More Info

@David: can you re-run the following and provide the debug log output:

rbd info ubuntu1204_base --debug-rbd=20 --debug-ms=1
Actions #2

Updated by David Hedberg almost 8 years ago

Output attached.

Actions #3

Updated by Jason Dillaman almost 8 years ago

@David: can you run "rados -p rbd ls | grep rbd_header. | xargs -n 1 bash -c 'echo OBJECT: $0 ; rados -p rbd listomapvals $0'". I am wondering how many RBD images have zero omap key/values.
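That pipeline can be wrapped in a short filter that prints only the headers with no omap data. A sketch, not part of the original request; it assumes the same "OBJECT: <name>" dump format the command above produces, and the name filter_empty is illustrative:

```shell
#!/bin/bash
# filter_empty reads the "OBJECT: <name>" dump produced by the pipeline
# above and prints only the object names that had no omap lines after them.
filter_empty() {
    awk '/^OBJECT: / { if (prev != "") print prev; prev = $2; next }
         { prev = "" }
         END { if (prev != "") print prev }'
}

# Cluster-side usage (requires a running cluster):
#   rados -p rbd ls | grep '^rbd_header\.' \
#     | xargs -n 1 bash -c 'echo OBJECT: $0; rados -p rbd listomapvals $0' \
#     | filter_empty
```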

Actions #4

Updated by David Hedberg almost 8 years ago

I filtered it through "grep '^\(OBJECT\|features\)'" and got the attached output.

I manually reran listomapvals for all headers lacking a corresponding feature-line (I count 14) and got no output for any of them.

One of the images I tried to analyze yesterday failed, as far as I could tell, when trying to look up information about its parent. I guess that could explain the discrepancy (16 failing images).

Actions #5

Updated by Jason Dillaman almost 8 years ago

@David: how's your cluster's health? Can you run "ceph osd map rbd <header object>" against the 14 headers with missing omap values? Is there any pattern in which PGs have missing omap values versus those that don't?
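Running that against all the suspect headers is a one-line loop. A sketch; the map_headers name and the CEPH override are illustrative, not from the original comment:

```shell
#!/bin/bash
# map_headers reads header object names, one per line on stdin, and prints
# the PG/OSD mapping for each. CEPH is overridable for dry runs.
CEPH="${CEPH:-ceph}"

map_headers() {
    while read -r hdr; do
        "$CEPH" osd map rbd "$hdr"
    done
}

# Usage on the cluster, with one broken header per line in a file:
#   map_headers < broken_headers.txt
```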

Actions #6

Updated by Jason Dillaman almost 8 years ago

@David: any chance you are using cache tiering?

Actions #7

Updated by Samuel Just almost 8 years ago

At what version did this cluster start? Jason, I think the problem is that the FLAG_OMAP on the object info is set to false erroneously, and we recently started checking it in the omap read path. We can force it to true by writing a single omap value to the object. Is there a harmless key you can write that won't collide with anything used for rbd?

Actions #8

Updated by David Hedberg almost 8 years ago

Jason Dillaman wrote:

@David: how's your cluster's health?

Output from ceph status:

    cluster e280d4c9-b2b5-48f1-81c8-c55ecda20e96
     health HEALTH_WARN
            pool .rgw.buckets has many more objects per pg than average (too few pgs?)
     monmap e9: 3 mons at {k2=10.56.5.19:6789/0,k8=10.56.5.174:6789/0,k9=10.56.5.46:6789/0}
            election epoch 3978, quorum 0,1,2 k2,k9,k8
     osdmap e629434: 9 osds: 9 up, 9 in
            flags sortbitwise
      pgmap v75790167: 392 pgs, 13 pools, 1730 GB data, 562 kobjects
            5462 GB used, 11289 GB / 16752 GB avail
                 392 active+clean
  client io 0 B/s rd, 101740 B/s wr, 3 op/s rd, 10 op/s wr

Can you run "ceph osd map rbd <header object>" against the 14 headers with missing omap values? Is there any pattern in which PGs have missing omap values versus those that don't?

The OSD distribution doesn't look very random, but I'm not sure if that means anything:

rbd_header.b1fff2ae8944a:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.b1fff2ae8944a' -> pg 2.191ce0c9 (2.9) -> up ([0,8,3], p0) acting ([0,8,3], p0)
rbd_header.5353c62ae8944a:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.5353c62ae8944a' -> pg 2.5298f9ca (2.a) -> up ([5,4,6], p5) acting ([5,4,6], p5)
rbd_header.ef252ae8944a:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.ef252ae8944a' -> pg 2.f12750cc (2.c) -> up ([0,5,3], p0) acting ([0,5,3], p0)
rbd_header.5340fb2ae8944a:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.5340fb2ae8944a' -> pg 2.b2bdf6d6 (2.16) -> up ([5,6,1], p5) acting ([5,6,1], p5)
rbd_header.531cdf3d1b58ba:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.531cdf3d1b58ba' -> pg 2.e3324363 (2.23) -> up ([0,3,8], p0) acting ([0,3,8], p0)
rbd_header.5421983d1b58ba:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.5421983d1b58ba' -> pg 2.ef09f9a4 (2.24) -> up ([4,2,7], p4) acting ([4,2,7], p4)
rbd_header.44846c2ae8944a:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.44846c2ae8944a' -> pg 2.a7b459e6 (2.26) -> up ([2,0,4], p2) acting ([2,0,4], p2)
rbd_header.5103163d1b58ba:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.5103163d1b58ba' -> pg 2.f85abe30 (2.30) -> up ([2,1,6], p2) acting ([2,1,6], p2)
rbd_header.53023b2ae8944a:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.53023b2ae8944a' -> pg 2.28cd69f0 (2.30) -> up ([2,1,6], p2) acting ([2,1,6], p2)
rbd_header.b20a174b0dc51:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.b20a174b0dc51' -> pg 2.adfc4539 (2.39) -> up ([4,6,5], p4) acting ([4,6,5], p4)
rbd_header.531ce23d1b58ba:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.531ce23d1b58ba' -> pg 2.c302997b (2.3b) -> up ([4,0,5], p4) acting ([4,0,5], p4)
rbd_header.4080512ae8944a:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.4080512ae8944a' -> pg 2.bddd9d7b (2.3b) -> up ([4,0,5], p4) acting ([4,0,5], p4)
rbd_header.5421923d1b58ba:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.5421923d1b58ba' -> pg 2.7bb79d3c (2.3c) -> up ([5,7,4], p5) acting ([5,7,4], p5)
rbd_header.54c46b238e1f29:
osdmap e629434 pool 'rbd' (2) object 'rbd_header.54c46b238e1f29' -> pg 2.a2a6ea3f (2.3f) -> up ([4,0,5], p4) acting ([4,0,5], p4)

osds 1, 3, and 4 are on k2; osds 2, 5, and 8 are on k8; osds 0, 6, and 7 are on k9

@David: any chance you are using cache tiering?

No tiering enabled, all the osd disks are the same model (except the small journals).

Actions #9

Updated by Jason Dillaman almost 8 years ago

@David: as per Sam's suggestion, can you run the following on one rbd_header object to test:

rados --pool rbd setomapval <rbd_header> "dummy_key" "dummy_val"
rados --pool rbd rmomapkey <rbd_header> "dummy_key"
rados --pool rbd listomapvals <rbd_header>

Actions #10

Updated by Jason Dillaman almost 8 years ago

... and if this does fix your problem, can you leave at least one rbd_header object in a broken state so that we can pull some raw data from it to figure out the root cause?

Actions #11

Updated by Samuel Just almost 8 years ago

I think the right answer for now is to remove reliance on FLAG_OMAP for anything other than its original purpose: determining whether it's safe to demote an object to an ec backing pool. wip-16211[-jewel]

Actions #12

Updated by David Hedberg almost 8 years ago

Nice, that does seem to fix it:

# rados -p rbd listomapvals rbd_header.5421983d1b58ba
# rados --pool rbd setomapval rbd_header.5421983d1b58ba "dummy_key" "dummy_val" 
# rados --pool rbd rmomapkey rbd_header.5421983d1b58ba "dummy_key" 
# rados --pool rbd listomapvals rbd_header.5421983d1b58ba
features
value (8 bytes) :
00000000  01 00 00 00 00 00 00 00                           |........|
00000008

object_prefix
...etc

There's no real hurry to restore most of them, so tell me what you need to figure out the cause.
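The setomapval/rmomapkey sequence from Actions #9 generalizes to a small repair helper for all affected headers. A sketch, not from the original discussion; the repair_header name and the RADOS/POOL overrides are illustrative:

```shell
#!/bin/bash
# repair_header forces FLAG_OMAP back on by writing and then removing a
# throwaway omap key, per the workaround above. RADOS is overridable for
# dry runs.
RADOS="${RADOS:-rados}"
POOL="${POOL:-rbd}"

repair_header() {
    "$RADOS" -p "$POOL" setomapval "$1" dummy_key dummy_val
    "$RADOS" -p "$POOL" rmomapkey "$1" dummy_key
}

# Usage, with one broken header per line in a file:
#   while read -r h; do repair_header "$h"; done < broken_headers.txt
```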

Actions #13

Updated by David Hedberg almost 8 years ago

Samuel Just wrote:

At what version did this cluster start?

If still relevant: I don't really remember, but I found an old ceph-deploy log from 2013-07-22 mentioning cuttlefish which seems reasonable. Maybe there's a better way to find out?

Actions #14

Updated by Samuel Just almost 8 years ago

If we can get a complete log of which versions osds on this cluster have run at, that would help narrow down where the buggy object_info values came from -- I suspect they are pretty old.

Actions #15

Updated by David Hedberg almost 8 years ago

We don't have any complete logs of that kind stretching back far enough, sadly, so I can't really help.

All of the affected images could be from the first half of 2014. We probably upgraded to 0.80.x fairly early, as it shipped with Ubuntu 14.04 which we use in another cluster. It's unlikely that we ever ran anything not considered stable.

Logs from 2015 and later:

ceph-common:amd64 (0.80.8-1precise, 0.80.9-1precise)
ceph-common:amd64 (0.80.9-1precise, 0.80.10-1precise)
ceph-common:amd64 (0.80.10-1precise, 0.80.11-1precise)
ceph-common:amd64 (0.80.11-1precise, 0.94.5-1precise)
ceph-common:amd64 (0.94.5-1precise, 0.94.6-1precise)
ceph-common:amd64 (0.94.6-1precise, 0.94.7-1precise)
ceph-common:amd64 (0.94.7-1precise, 0.94.7-1trusty)
ceph-common:amd64 (0.94.7-1trusty, 10.2.1-1trusty)

Actions #16

Updated by Samuel Just almost 8 years ago

Sorry, didn't mean a complete osd log, just a record of which versions you ran so I can try to pin down which firefly (probably) point release had the bug. Mostly, I'd like to work out how widespread this is likely to be. My current plan is for the next jewel point release to go back to ignoring the flag for most purposes, and for scrub in master to start verifying it.

Actions #17

Updated by David Hedberg almost 8 years ago

Sorry for my confusing wording, what I meant is that I don't know of any useful logs at all. We never used ceph-deploy beyond setting up the cluster (if that), and our earliest apt/history.log is from September 2015. I guess ceph doesn't keep anything internally?

Actions #18

Updated by Mykola Golub almost 8 years ago

I think Sam meant the ceph-osd logs. On startup, each ceph daemon logs its version, something like the line below:

2016-06-10 15:35:00.146069 7f040de82800 0 ceph version 10.2.0-2119-g5c2ec0e (5c2ec0eb4191c1f59ae98426f6844d5d9351ca23), process ceph-osd, pid 27776
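If any rotated logs survive, those banners can be grepped out and de-duplicated. A sketch; the version_banners name is illustrative and the log path in the usage comment is the Debian/Ubuntu default, an assumption:

```shell
#!/bin/bash
# version_banners filters daemon startup version lines out of log text on
# stdin and de-duplicates them.
version_banners() {
    grep 'ceph version' | sort -u
}

# Usage against (possibly rotated) osd logs; the path is an assumption:
#   cat /var/log/ceph/ceph-osd.*.log | version_banners
#   zcat /var/log/ceph/ceph-osd.*.log.*.gz | version_banners
```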

Actions #19

Updated by David Hedberg almost 8 years ago

Sure, but thanks to the wonders of the default logrotate configuration, we only have one week's worth of osd logs. :) We do not keep a backup of the logs on these servers.

Actions #20

Updated by David Zafman almost 8 years ago

If there is an object that is still misbehaving, you can use these steps to get a binary dump of its object_info_t. We might be able to determine which version of Ceph wrote the object info.


$ gdb ceph-objectstore-tool
GNU gdb (...) #.#.#
...
Reading symbols from ./ceph-objectstore-tool...done.
(gdb) break object_info_t::dump
Breakpoint 1 at 0x63e910: file osd/osd_types.cc, line 4815.
(gdb) run --data-path ...... rbd_header.############### dump
Starting program: /src/ceph/src/ceph-objectstore-tool --data-path ...... rbd_header.############### dump
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff4496700 (LWP 41579)]
[New Thread 0x7ffff351b700 (LWP 41580)]
[New Thread 0x7ffff2d1a700 (LWP 41581)]
[New Thread 0x7ffff2519700 (LWP 41582)]
[New Thread 0x7ffff1d18700 (LWP 41583)]
[New Thread 0x7ffff1517700 (LWP 41584)]
[New Thread 0x7ffff0d16700 (LWP 41585)]
[New Thread 0x7fffdbfff700 (LWP 41586)]
[New Thread 0x7fffdb7fe700 (LWP 41587)]
[New Thread 0x7fffdaffd700 (LWP 41588)]
[New Thread 0x7fffda7fc700 (LWP 41589)]
[New Thread 0x7fffd9ffb700 (LWP 41590)]

Breakpoint 1, object_info_t::dump (this=0x7fffffffb2b0, f=0x55555e845660) at osd/osd_types.cc:4815
4815    {
(gdb) print *this
$1 = {soid = {oid = {name = "rbd_header.###############"}, snap = {val = 18446744073709551614}, hash = 3043092879, max = false, nibblewise_key_cache = 4171109979, hash_reverse_bits = 4053239469, static POOL_META = -1, static POOL_TEMP_START = -2, pool = 1, nspace = "", key = ""}, version = {version = 7, epoch = 18, __pad = 0}, prior_version = {
    version = 0, epoch = 0, __pad = 0}, user_version = 7, last_reqid = {name = {_type = 8 '\b', _num = 4125, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static NEW = -1}, tid = 41, inc = 0}, size = 4194304, mtime = {tv = {tv_sec = 1465604840, tv_nsec = 162560361}}, local_mtime = {tv = {tv_sec = 1465604840,
      tv_nsec = 714185371}}, flags = (object_info_t::FLAG_DIRTY | object_info_t::FLAG_DATA_DIGEST | object_info_t::FLAG_OMAP_DIGEST), snaps = std::vector of length 0, capacity 0, truncate_seq = 0, truncate_size = 0, watchers = std::map with 0 elements, data_digest = 2197221340, omap_digest = 4294967295}
(gdb) x/256xb this
0x7fffffffb2b0: 0xb8    0xbc    0x87    0x5e    0x55    0x55    0x00    0x00
0x7fffffffb2b8: 0xfe    0xff    0xff    0xff    0xff    0xff    0xff    0xff
0x7fffffffb2c0: 0x8f    0xe9    0x61    0xb5    0x00    0x00    0x00    0x00
0x7fffffffb2c8: 0x5b    0x16    0x9e    0xf8    0xad    0x86    0x97    0xf1
0x7fffffffb2d0: 0x01    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb2d8: 0xd8    0x53    0x8b    0xf5    0xff    0x7f    0x00    0x00
0x7fffffffb2e0: 0xd8    0x53    0x8b    0xf5    0xff    0x7f    0x00    0x00
0x7fffffffb2e8: 0x07    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb2f0: 0x12    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb2f8: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb300: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb308: 0x07    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb310: 0x08    0x00    0x00    0x00    0x01    0x00    0x00    0x00
0x7fffffffb318: 0x1d    0x10    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb320: 0x29    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb328: 0x00    0x00    0x00    0x00    0xff    0x7f    0x00    0x00
0x7fffffffb330: 0x00    0x00    0x40    0x00    0x00    0x00    0x00    0x00
0x7fffffffb338: 0xe8    0x5a    0x5b    0x57    0x69    0x79    0xb0    0x09
0x7fffffffb340: 0xe8    0x5a    0x5b    0x57    0x9b    0x9a    0x91    0x2a
0x7fffffffb348: 0x34    0x00    0x00    0x00    0xff    0x7f    0x00    0x00
0x7fffffffb350: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb358: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb360: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb368: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb370: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb378: 0x81    0xfa    0x61    0xf5    0xff    0x7f    0x00    0x00
0x7fffffffb380: 0x00    0x00    0x00    0x00    0xff    0x7f    0x00    0x00
0x7fffffffb388: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb390: 0x80    0xb3    0xff    0xff    0xff    0x7f    0x00    0x00
0x7fffffffb398: 0x80    0xb3    0xff    0xff    0xff    0x7f    0x00    0x00
0x7fffffffb3a0: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffb3a8: 0xdc    0xef    0xf6    0x82    0xff    0xff    0xff    0xff
(gdb) x/64w this
0x7fffffffb2b0: 0x5e87bcb8      0x00005555      0xfffffffe      0xffffffff
0x7fffffffb2c0: 0xb561e98f      0x00000000      0xf89e165b      0xf19786ad
0x7fffffffb2d0: 0x00000001      0x00000000      0xf58b53d8      0x00007fff
0x7fffffffb2e0: 0xf58b53d8      0x00007fff      0x00000007      0x00000000
0x7fffffffb2f0: 0x00000012      0x00000000      0x00000000      0x00000000
0x7fffffffb300: 0x00000000      0x00000000      0x00000007      0x00000000
0x7fffffffb310: 0x00000008      0x00000001      0x0000101d      0x00000000
0x7fffffffb320: 0x00000029      0x00000000      0x00000000      0x00007fff
0x7fffffffb330: 0x00400000      0x00000000      0x575b5ae8      0x09b07969
0x7fffffffb340: 0x575b5ae8      0x2a919a9b      0x00000034      0x00007fff
0x7fffffffb350: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fffffffb360: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fffffffb370: 0x00000000      0x00000000      0xf561fa81      0x00007fff
0x7fffffffb380: 0x00000000      0x00007fff      0x00000000      0x00000000
0x7fffffffb390: 0xffffb380      0x00007fff      0xffffb380      0x00007fff
0x7fffffffb3a0: 0x00000000      0x00000000      0x82f6efdc      0xffffffff

Actions #21

Updated by David Hedberg almost 8 years ago

Here is the same for one of the broken objects.

# gdb ceph-objectstore-tool
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
...
(gdb) b object_info_t::dump
Breakpoint 1 at 0x555555b83f40: file osd/osd_types.cc, line 4819.
(gdb) run --data-path /var/lib/ceph/osd/ceph-4 rbd_header.4080512ae8944a dump
Starting program: /usr/bin/ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-4 rbd_header.4080512ae8944a dump
...
Breakpoint 1, object_info_t::dump (this=0x7fffffffba30, f=0x55555fe5a000) at osd/osd_types.cc:4819
4819    osd/osd_types.cc: No such file or directory.
(gdb) print *this
$1 = {soid = {oid = {name = "rbd_header.4080512ae8944a"}, snap = {val = 18446744073709551614}, hash = 3185417595, max = false, nibblewise_key_cache = 3084508635, hash_reverse_bits = 3736714173, static POOL_META = -1,
    static POOL_TEMP_START = -2, pool = 2, nspace = "", key = ""}, version = {version = 138132925, epoch = 561883, __pad = 0}, prior_version = {version = 2483337, epoch = 14516, __pad = 0}, user_version = 2483336, last_reqid = {name = {
      _type = 4 '\004', _num = 4, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static NEW = -1}, tid = 1200185, inc = 0}, size = 0, mtime = {tv = {tv_sec = 1377260765, tv_nsec = 268995000}},
  local_mtime = {tv = {tv_sec = 0, tv_nsec = 0}}, flags = (object_info_t::FLAG_DATA_DIGEST | object_info_t::FLAG_OMAP_DIGEST), snaps = std::vector of length 0, capacity 0, truncate_seq = 0, truncate_size = 0,
  watchers = std::map with 0 elements, data_digest = 4294967295, omap_digest = 372627317}
(gdb) x/256xb this
0x7fffffffba30:    0xd8    0x3f    0xf4    0x5f    0x55    0x55    0x00    0x00
0x7fffffffba38:    0xfe    0xff    0xff    0xff    0xff    0xff    0xff    0xff
0x7fffffffba40:    0x7b    0x9d    0xdd    0xbd    0x00    0x7f    0x00    0x00
0x7fffffffba48:    0xdb    0xdd    0xd9    0xb7    0xbd    0xbb    0xb9    0xde
0x7fffffffba50:    0x02    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffba58:    0xd8    0xa3    0x85    0xf5    0xff    0x7f    0x00    0x00
0x7fffffffba60:    0xd8    0xa3    0x85    0xf5    0xff    0x7f    0x00    0x00
0x7fffffffba68:    0xbd    0xbd    0x3b    0x08    0x00    0x00    0x00    0x00
0x7fffffffba70:    0xdb    0x92    0x08    0x00    0x00    0x00    0x00    0x00
0x7fffffffba78:    0x89    0xe4    0x25    0x00    0x00    0x00    0x00    0x00
0x7fffffffba80:    0xb4    0x38    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffba88:    0x88    0xe4    0x25    0x00    0x00    0x00    0x00    0x00
0x7fffffffba90:    0x04    0x13    0x84    0xf5    0xff    0x7f    0x00    0x00
0x7fffffffba98:    0x04    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffbaa0:    0x39    0x50    0x12    0x00    0x00    0x00    0x00    0x00
0x7fffffffbaa8:    0x00    0x00    0x00    0x00    0xff    0x7f    0x00    0x00
0x7fffffffbab0:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffbab8:    0xdd    0x54    0x17    0x52    0xb8    0x89    0x08    0x10
0x7fffffffbac0:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffbac8:    0x30    0x00    0x00    0x00    0xff    0x7f    0x00    0x00
0x7fffffffbad0:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffbad8:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffbae0:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffbae8:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffbaf0:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffbaf8:    0x81    0x4a    0x5c    0xf5    0xff    0x7f    0x00    0x00
0x7fffffffbb00:    0x00    0x00    0x00    0x00    0xff    0x7f    0x00    0x00
0x7fffffffbb08:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffbb10:    0x00    0xbb    0xff    0xff    0xff    0x7f    0x00    0x00
0x7fffffffbb18:    0x00    0xbb    0xff    0xff    0xff    0x7f    0x00    0x00
0x7fffffffbb20:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffbb28:    0xff    0xff    0xff    0xff    0x75    0xd7    0x35    0x16
(gdb) x/64w this
0x7fffffffba30:    0x5ff43fd8    0x00005555    0xfffffffe    0xffffffff
0x7fffffffba40:    0xbddd9d7b    0x00007f00    0xb7d9dddb    0xdeb9bbbd
0x7fffffffba50:    0x00000002    0x00000000    0xf585a3d8    0x00007fff
0x7fffffffba60:    0xf585a3d8    0x00007fff    0x083bbdbd    0x00000000
0x7fffffffba70:    0x000892db    0x00000000    0x0025e489    0x00000000
0x7fffffffba80:    0x000038b4    0x00000000    0x0025e488    0x00000000
0x7fffffffba90:    0xf5841304    0x00007fff    0x00000004    0x00000000
0x7fffffffbaa0:    0x00125039    0x00000000    0x00000000    0x00007fff
0x7fffffffbab0:    0x00000000    0x00000000    0x521754dd    0x100889b8
0x7fffffffbac0:    0x00000000    0x00000000    0x00000030    0x00007fff
0x7fffffffbad0:    0x00000000    0x00000000    0x00000000    0x00000000
0x7fffffffbae0:    0x00000000    0x00000000    0x00000000    0x00000000
0x7fffffffbaf0:    0x00000000    0x00000000    0xf55c4a81    0x00007fff
0x7fffffffbb00:    0x00000000    0x00007fff    0x00000000    0x00000000
0x7fffffffbb10:    0xffffbb00    0x00007fff    0xffffbb00    0x00007fff
0x7fffffffbb20:    0x00000000    0x00000000    0xffffffff    0x1635d775
(gdb) c
Continuing.
{
    "id": {
        "oid": "rbd_header.4080512ae8944a",
        "key": "",
        "snapid": -2,
        "hash": 3185417595,
        "max": 0,
        "pool": 2,
        "namespace": "",
        "max": 0
    },
    "info": {
        "oid": {
            "oid": "rbd_header.4080512ae8944a",
            "key": "",
            "snapid": -2,
            "hash": 3185417595,
            "max": 0,
            "pool": 2,
            "namespace": "" 
        },
        "version": "561883'138132925",
        "prior_version": "14516'2483337",
        "last_reqid": "osd.4.0:1200185",
        "user_version": 2483336,
        "size": 0,
        "mtime": "2013-08-23 14:26:05.268995",
        "local_mtime": "0.000000",
        "lost": 0,
        "flags": 48,
        "snaps": [],
        "truncate_seq": 0,
        "truncate_size": 0,
        "data_digest": 4294967295,
        "omap_digest": 372627317,
        "watchers": {}
    },
    "stat": {
        "size": 0,
        "blksize": 4096,
        "blocks": 8,
        "nlink": 1
    },
    "SnapSet": {
        "snap_context": {
            "seq": 0,
            "snaps": []
        },
        "head_exists": 1,
        "clones": []
    }
}
Actions #22

Updated by Jason Dillaman over 7 years ago

  • Project changed from rbd to Ceph
Actions #23

Updated by Jason Dillaman over 7 years ago

  • Assignee set to Samuel Just
Actions #24

Updated by Josh Durgin almost 7 years ago

  • Status changed from Need More Info to Won't Fix

It sounds like this was caused by a firefly or older cluster, is not widespread, and has a simple workaround. It doesn't seem worth adding a workaround for this in the current code.

Actions #25

Updated by Eneko Lacunza over 6 years ago

Josh Durgin wrote:

It sounds like this was caused by a firefly or older cluster, is not widespread, and has a simple workaround. It doesn't seem worth adding a workaround for this in the current code.

We just hit this problem upgrading from jewel to luminous. Creating a dummy key/value fixes the problem.

This affected 19 of our VMs - no backups this weekend... We've been using ceph since firefly; the upgrade path was hammer, jewel and now luminous.

Actions #26

Updated by Wido den Hollander about 6 years ago

I ran into this issue as well while upgrading a cluster from Jewel to Luminous.

This cluster was initially installed with Dumpling and then upgraded to Firefly, Hammer, Jewel and now Luminous.

Some clients were still Hammer and we suddenly saw this happen.

I found a blog post which helped me (and this issue): http://fnordahl.com/2017/04/17/ceph-rbd-volume-header-recovery/

In the end I wrote a small script:

#!/bin/bash
set -e

POOL=$1
IMAGE=$2

ID=$(rados -p $POOL get rbd_id.$IMAGE -|strings)

rados -p $POOL setomapval rbd_header.${ID} "dummy_key" "dummy_val" 

rbd -p $POOL info $IMAGE

root@mon01:~# ./fix-rbd-header.sh rbd-c02-p02 fe9a265a-66e3-4594-91c1-946e37e78852
rbd image 'fe9a265a-66e3-4594-91c1-946e37e78852':
    size 20480 MB in 5120 objects
    order 22 (4096 kB objects)
    block_name_prefix: rbd_data.a479b4af43f45
    format: 2
    features: layering
    flags: 
root@mon01:~#
Actions #27

Updated by sean redmond almost 6 years ago

Just had a case of running into this as well when upgrading clients from jewel to luminous - the cluster has been upgraded through firefly, hammer, jewel and now luminous.

Actions #28

Updated by sean redmond almost 6 years ago

sean redmond wrote:

Just had a case of running into this as well when upgrading clients from jewel to luminous - the cluster has been upgraded through firefly, hammer, jewel and now luminous.

Wido, the script worked a treat ;)

Actions #29

Updated by Dan van der Ster almost 6 years ago

+1 we joined this club today. Setting then rm'ing the dummy_key revived the omap for our first such header.

I will now scrub through all of our images.
