Project

General

Profile

Actions

Bug #7577

closed

rbd info displays extra random char in block prefix

Added by Dan van der Ster about 10 years ago. Updated about 10 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The rbd cli in dumpling 0.67.7 displays an extra random char at the end of the block prefix string:

[root@p01001532021656 ~]# rbd --version
ceph version 0.67.7 (d7ab4244396b57aac8b7e80812115bbd079e6b73)
[root@p01001532021656 ~]# rbd info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944a:
format: 2
features: layering
[root@p01001532021656 ~]# rbd info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944a?
format: 2
features: layering
[root@p01001532021656 ~]# rbd info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944a?
format: 2
features: layering
[root@p01001532021656 ~]# rbd info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944aY
format: 2
features: layering
[root@p01001532021656 ~]# rbd info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944a?
format: 2
features: layering

0.67.4 doesn't have this behaviour:

dvanders@dvanders-hpi5:~$ rbd --version
ceph version 0.67.4 (ad85b8bfafea6232d64cb7ba76a8b6e8252fa0c7)
dvanders@dvanders-hpi5:~$ rbd info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944a
format: 2
features: layering
dvanders@dvanders-hpi5:~$ rbd info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944a
format: 2
features: layering
dvanders@dvanders-hpi5:~$ rbd info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944a
format: 2
features: layering

I don't when this bug started, just noticed it now.

Actions #1

Updated by Ian Colle about 10 years ago

  • Assignee set to Josh Durgin
Actions #2

Updated by Danny Al-Gaaf about 10 years ago

Tried to reproduce, but couldn't. Looks to me like missing '\0' termination. Do you get this from all of your 0.67.7 machines?

Actions #3

Updated by Dan van der Ster about 10 years ago

My earlier diagnsis 0.67.4 vs 0.67.7 was incorrect -- actually that was a el6 vs ubuntu difference.

On my ubuntu 0.67.7 client, I cannot reproduce.

On all of my el6 clients, I can reproduce.

I will work backwords now to see if this was introduced sometime in dumpling or if it's something else.

Actions #4

Updated by Dan van der Ster about 10 years ago

Dan van der Ster wrote:

My earlier diagnsis 0.67.4 vs 0.67.7 was incorrect -- actually that was a el6 vs ubuntu difference.

On my ubuntu 0.67.7 client, I cannot reproduce.

On all of my el6 0.67.7 clients, I can reproduce.

I will work backwords now to see if this was introduced sometime in dumpling or if it's something else.

Actions #5

Updated by Dan van der Ster about 10 years ago

So this is not a recent regression. I can reproduce on el6.5 with 0.67-0

[root@dvtest1 ceph]# ceph --version
ceph version 0.67 (e3b7bc5bce8ab330ec1661381072368af3c218a0)
[root@dvtest1 ceph]# rbd -m cephmon info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
2014-03-11 14:39:54.963784 7fcfa8005760 -1 did not load config file, using default settings.
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944a?
format: 2
features: layering
[root@dvtest1 ceph]# rbd -m cephmon info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
2014-03-11 14:39:55.723764 7fe36039e760 -1 did not load config file, using default settings.
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944a?
format: 2
features: layering
[root@dvtest1 ceph]# rbd -m cephmon info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
2014-03-11 14:39:56.372348 7f27c13fe760 -1 did not load config file, using default settings.
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944a'
format: 2
features: layering
[root@dvtest1 ceph]# rbd -m cephmon info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
2014-03-11 14:39:56.900712 7fe11152e760 -1 did not load config file, using default settings.
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944a?
format: 2
features: layering
[root@dvtest1 ceph]# rbd -m cephmon info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b
2014-03-11 14:39:57.491803 7fe7579f5760 -1 did not load config file, using default settings.
rbd image 'volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b':
size 5120 MB in 1280 objects
order 22 (4096 KB objects)
block_name_prefix: rbd_data.1000cf52ae8944a?
format: 2
features: layering

Actions #6

Updated by Dan van der Ster about 10 years ago

Here's a bit more with json format:

{"name":"volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b","size":5368709120,"objects":1280,"order":22,"object_size":4194304,"block_name_prefix":"rbd_data.1000cf52ae8944a?\u007f","format":2,"features":["layering","striping"]}[root@dvtest1 ~]#
[root@dvtest1 ~]# rbd -m cephmon info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b --format json {"name":"volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b","size":5368709120,"objects":1280,"order":22,"object_size":4194304,"block_name_prefix":"rbd_data.1000cf52ae8944a?\u007f","format":2,"features":["layering","striping"]}[root@dvtest1 ~]# rbd -m cephmon info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b --format json {"name":"volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b","size":5368709120,"objects":1280,"order":22,"object_size":4194304,"block_name_prefix":"rbd_data.1000cf52ae8944aP\u007f","format":2,"features":["layering","striping"]}[root@dvtest1 ~]# rbd -m cephmon info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b --format json {"name":"volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b","size":5368709120,"objects":1280,"order":22,"object_size":4194304,"block_name_prefix":"rbd_data.1000cf52ae8944a\\\u007f","format":2,"features":["layering","striping"]}[root@dvtest1 ~]# rbd -m cephmon info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b --format json {"name":"volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b","size":5368709120,"objects":1280,"order":22,"object_size":4194304,"block_name_prefix":"rbd_data.1000cf52ae8944a?\u007f","format":2,"features":["layering","striping"]}[root@dvtest1 ~]# rbd -m cephmon info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b --format json {"name":"volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b","size":5368709120,"objects":1280,"order":22,"object_size":4194304,"block_name_prefix":"rbd_data.1000cf52ae8944ae\u007f","format":2,"features":["layering","striping"]}[root@dvtest1 ~]# rbd -m cephmon info volumes/volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b --format json {"name":"volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b","size":5368709120,"objects":1280,"order":22,"object_size":4194304,"block_name_prefix":"rbd_data.1000cf52ae8944a?\u007f","format":2,"features":["layering","striping"]}[root@dvtest1 ~]#

Actions #7

Updated by Ian Colle about 10 years ago

  • Priority changed from High to Urgent
Actions #8

Updated by Dan Mick about 10 years ago

Sent this in email, but: u007f sounds like maybe a 0x7f character?.. Which should be legal 7-bit ASCII, but perhaps in some other encoding, it's not? Ceph isn't
very careful about Unicode.

Actions #9

Updated by Dan van der Ster about 10 years ago

I'm not current on my c++, but I'm pretty sure Danny's correct. Looking at the code, rbd.cc just does a cout of the 24-byte not-null-terminated char array. Perhaps that's not safe on rhel?

Actions #10

Updated by Dan Mick about 10 years ago

True, I don't see any attempt to add a nul. Oddly, it looks like there was half a plan to do so in librbd/internal.cc:image_info, but no actual NUL placed, and
it won't work for exactly-RBD_MAX_BLOCK_NAME_SIZE names anyway. blech.

Actions #11

Updated by Dan Mick about 10 years ago

Why I feel some guilt here: :)

commit 9a45ffb769c7c821196a8471009d0f9a4216c0d4
Author: Dan Mick <>

Actions #12

Updated by Dan van der Ster about 10 years ago

I just rebuilt from current git dumpling head on my machine -- cannot reproduce. So I wonder this is something subtle about your build env vs. my running env.

Actions #13

Updated by Danny Al-Gaaf about 10 years ago

Dan Mick wrote:

Why I feel some guilt here: :)

commit 9a45ffb769c7c821196a8471009d0f9a4216c0d4
Author: Dan Mick <>

I've seen this commit. At my setup it doesn't cause any trouble and it was also in before 0.67.4.

Actions #14

Updated by Dan Mick about 10 years ago

Yeah, but I think it's data-dependent; must be max len and not have a zero byte directly after...

Actions #15

Updated by Danny Al-Gaaf about 10 years ago

Do you take a look at it (or do you have already a fix?) or should I?

The problem is that block_name_prefix in rbd_image_info_t should be [RBD_MAX_BLOCK_NAME_SIZE+1] if you want to get a 24 char string as it currently is. An we need to add a '\0' if there is none copied from ictx->object_prefix.

But this would change librbd or we simply change the block_name_prefix to have max. 23 usable chars + '\0'.

Actions #16

Updated by Dan van der Ster about 10 years ago

The other solution would be to copy block_name_prefix to a local null terminated in the function where it is printed out, or?

Actions #17

Updated by Danny Al-Gaaf about 10 years ago

Dan van der Ster wrote:

The other solution would be to copy block_name_prefix to a local null terminated in the function where it is printed out, or?

You could do that, but since this is a object from the librbd function, I would prefer to fix it directly in the librbd code instead since then all existing code that uses librbd would work fine.

Actions #18

Updated by Dan van der Ster about 10 years ago

Danny Al-Gaaf wrote:

Dan van der Ster wrote:

The other solution would be to copy block_name_prefix to a local null terminated in the function where it is printed out, or?

You could do that, but since this is a object from the librbd function, I would prefer to fix it directly in the librbd code instead since then all existing code that uses librbd would work fine.

OK, I was worried that other uses of the field would possibly copy the \0 into the actual prefix to access objects.

Actions #19

Updated by Danny Al-Gaaf about 10 years ago

Dan van der Ster wrote:

Danny Al-Gaaf wrote:

Dan van der Ster wrote:

The other solution would be to copy block_name_prefix to a local null terminated in the function where it is printed out, or?

You could do that, but since this is a object from the librbd function, I would prefer to fix it directly in the librbd code instead since then all existing code that uses librbd would work fine.

OK, I was worried that other uses of the field would possibly copy the \0 into the actual prefix to access objects.

hm ... good question. I don't know what other code does here and I'm currently not so sure what you can expect from a char[] ... can you expect a \0 terminated string? RBD_MAX_BLOCK_NAME_SIZE suggests to me the name could also be less than 24 but in this case it would need to be \0 terminated to work or not?

Actions #20

Updated by Dan Mick about 10 years ago

I'm not working on this, just calling imprecations from the bleachers. :) I think the issue is that, if the string is <= 23 chars, the copy-into-output-struct copies the NUL, but if it's 24 or greater, the NUL is not copied. And you would only notice the bug if string+25 is also something other than NUL that prints in a way you'd notice, so it depends on the state of memory at the time.

The right fix is clearly to allocate 24+1 for the output struct and ensure the NUL is added on output.

Actions #21

Updated by Sage Weil about 10 years ago

  • Assignee changed from Josh Durgin to Dan Mick
Actions #22

Updated by Danny Al-Gaaf about 10 years ago

Dan Mick wrote:

I'm not working on this, just calling imprecations from the bleachers. :) I think the issue is that, if the string is <= 23 chars, the copy-into-output-struct copies the NUL, but if it's 24 or greater, the NUL is not copied. And you would only notice the bug if string+25 is also something other than NUL that prints in a way you'd notice, so it depends on the state of memory at the time.

The right fix is clearly to allocate 24+1 for the output struct and ensure the NUL is added on output.

Only to be clear: you would propose to block_name_prefix/RBD_MAX_BLOCK_NAME_SIZE in librbd.h to 25 chars. We need to check if this affects API/ABI.

Checked with abi-compliance-checker:
API: Compatible (only two low "Problems with Data Types")
ABI: Incompatible (14.2%) e.g.:
  • rbd_obj_header_ondisk: Layout of parameter's stack of several functions has been changed and therefore parameters at higher positions in the stack may be incorrectly initialized by applications.
  • ImageCtx:
  1. The class has only inline or auto-generated constructors which will be copied to applications at compile time and will allocate an older memory layout. Call of any exported method of this class may access a memory outside the allocated objects or inside the older memory structure and result in crash or incorrect behavior of applications.
  2. 2) The memory layout and size of subclasses will be changed.
  • rbd_image_info_t: Type of field block_name_prefix has been changed from char24 (24 bytes) to char25 (25 bytes). This field may be incorrectly initialized or accessed by applications.

Question: is a ABI break acceptable? Or should we change it to 23+'\0' instead?

Actions #23

Updated by Ian Colle about 10 years ago

  • Status changed from New to In Progress
Actions #24

Updated by Sage Weil about 10 years ago

  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF