Bug #9355

rbd: map fails with EINVAL inside a container

Added by Josh Durgin about 7 years ago. Updated about 7 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: rbd
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

https://lists.linuxcontainers.org/pipermail/lxc-users/2013-October/005795.html

carmstrong on IRC had the same issue with a 3.15 kernel in a container on CoreOS. With debugging enabled, the only dout() triggered was

[ 874.254853] rbd: Error adding device 172.17.8.100:6789 name=admin,key=client.admin deis db

indicating a problem very early in initialization. Userspace rbd commands worked correctly.

Disabling cephx had no effect.

History

#1 Updated by Chris Armstrong about 7 years ago

A fellow member of the CoreOS community is also running into this: https://groups.google.com/forum/#!topic/coreos-user/d-ySGISJjjc

#2 Updated by Chris Armstrong about 7 years ago

Seeing the same issue on a 3.16.2 kernel:

[ 2301.583112] rbd:  Error adding device 172.17.8.100:6789,172.17.8.101:6789,172.17.8.102:6789 name=admin,key=client.admin rbd logs

#3 Updated by Chris Armstrong about 7 years ago

For posterity, recording my conversation with Josh here. http://irclogs.ceph.widodh.nl/index.php?date=2014-09-04

[2:16] <carmstrong> well, running the container as --privileged allowed me to get back the device creation error, but now I'm getting `rbd: add failed: (22) Invalid argument`
[2:16] <carmstrong> I'm running `rbd map $pool/$name`
[2:16] <carmstrong> tried also specifying --pool separately
[3:23] <joshd> carmstrong: to get more logging out of the kernel you can 'mount -t debugfs none /sys/kernel/debug', run https://raw.githubusercontent.com/ceph/ceph/master/src/script/kcon_all.sh and then try rbd map again - it'll appear in dmesg
[4:05] <carmstrong> ok. [ 874.254853] rbd: Error adding device 172.17.8.100:6789 name=admin,key=client.admin deis db
[4:05] <carmstrong> potentially an auth issue?
[4:06] <joshd> anything before that?
[4:13] <carmstrong> joshd: some docker-related networking, but that's about it
[4:33] <joshd> carmstrong: that means it's failing very early, not even talking over the network
[4:34] <carmstrong> joshd: not a good sign :(
[4:34] <joshd> I'm suspicious it may have to do with the way auth info is passed to the kernel (maybe some extra capability is needed inside a container)
[4:34] <joshd> can you map one outside of a container with this kernel?
[4:35] <carmstrong> I'm unable to install the ceph packages in the root CoreOS machine. do you know of another way to test it?
[4:35] <joshd> which version is the kernel?
[4:37] <carmstrong> 3.15.8
[4:37] <joshd> it's worth trying in the container with cephx auth disabled ('auth supported = none' in the [global] section of /etc/ceph/ceph.conf on every node and restart the cluster)
[4:40] <carmstrong> that's just where I was headed! ok lemme try
[4:44] <joshd> carmstrong: with auth disabled it's also easy to try out on the host with echo "172.17.8.100:6789 name=admin deis db" > /sys/bus/rbd/add
[4:45] <joshd> it'll show up as /dev/rbd0 if it works, and you can remove it with echo 0 > /sys/bus/rbd/remove
[21:46] <carmstrong> joshd1: so with auth disabled, doing an echo "172.17.8.100:6789 name=admin deis test" > /sys/bus/rbd/add as root causes a hang for a few minutes, then the machine crashes and reboots
[21:46] <carmstrong> that's on the host machine
[21:50] <joshd1> carmstrong: any log of the crash in syslog or dmesg or anything?
[21:50] <carmstrong> joshd1: dmesg only has the most recent boot, nothing before. it mentioned that the system journal wasn't closed correctly because of the crash, but that's about it
[1:25] <joshd1> carmstrong: that hang on the host may have been http://tracker.ceph.com/issues/8818, which started occurring in 3.15 (fixed in the stable kernel trees now). it's certainly separate from the issue inside the container, so it seems there'd need to be extra debugging added to the rbd module to figure out what the EINVAL is coming from
[1:25] <carmstrong> joshd1: gotcha. thanks for all your help
[1:25] <carmstrong> I'm going down the route of just using the radosgw for now for blob storage
[1:25] <carmstrong> and we'll revisit the RBD volume in the future
[1:29] <joshd1> carmstrong: you're welcome, that makes sense for now. I'll add a bug about the container issue
[1:37] <joshd1> carmstrong: http://tracker.ceph.com/issues/9355
[1:39] <carmstrong> joshd1: I also commented on 8818, in case anyone else with my kernel and coreos stumbles across it
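
The logging steps Josh describes in the conversation above can be sketched roughly as follows. This is an approximation of the idea behind kcon_all.sh, not a copy of it: it assumes a kernel built with dynamic debug support and a root shell, and the function names are hypothetical.

```shell
#!/bin/sh
# Print the dynamic-debug control line that enables pr_debug output
# for one kernel module (for example libceph or rbd).
ceph_dyndbg_line() {
    printf 'module %s +p\n' "$1"
}

# Enable debug output for the modules involved in `rbd map`.
# Needs root; mounts debugfs first if the control file is missing.
enable_ceph_debug() {
    control=/sys/kernel/debug/dynamic_debug/control
    if [ ! -e "$control" ]; then
        mount -t debugfs none /sys/kernel/debug || return 1
    fi
    for mod in libceph rbd; do
        ceph_dyndbg_line "$mod" > "$control" || return 1
    done
}
```

After calling enable_ceph_debug as root, retrying rbd map should make the extra messages show up in dmesg.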

#4 Updated by Chris Armstrong about 7 years ago

Here's some debugging after disabling auth.

As root on the CoreOS host, echoing directly into the RBD bus also does not work (although I swear it did at one point):

deis-1 core # echo "172.17.8.100:6789,172.17.8.101:6789,172.17.8.102:6789 name=admin rbd logs" > /sys/bus/rbd/add
bash: echo: write error: Invalid argument
deis-1 core # echo "172.17.8.100:6789 name=admin rbd logs" > /sys/bus/rbd/add
bash: echo: write error: Invalid argument
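
For reference, the string written to /sys/bus/rbd/add in the commands above has four space-separated fields: monitor addresses, options, pool, and image. A minimal sketch of a helper that assembles it (the function name is hypothetical, and it does no validation):

```shell
#!/bin/sh
# Build the line expected by /sys/bus/rbd/add.
#   $1 = comma-separated monitor addresses
#   $2 = comma-separated options (e.g. name=admin)
#   $3 = pool name
#   $4 = image name
rbd_add_spec() {
    printf '%s %s %s %s\n' "$1" "$2" "$3" "$4"
}
```

As root, `rbd_add_spec 172.17.8.100:6789 name=admin rbd logs > /sys/bus/rbd/add` is equivalent to the second echo above.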

From within the container, I can see that the RBD was indeed created:

rbd info logs
rbd info rbd image 'logs':
    size 4096 MB in 1024 objects
    order 22 (4096 kB objects)
    block_name_prefix: rb.0.101f.2ae8944a
    format: 1

Here are the debugging logs from running echo "172.17.8.100:6789 name=admin rbd logs" > /sys/bus/rbd/add:

[ 1005.037537] libceph:  parse_options ffff88002c3bb000 options 'name=admin' dev_name '172.17.8.100:6789 name=admin rbd logs
'
[ 1005.039069] libceph:  parse_ips on '172.17.8.100:6789'
[ 1005.039651] libceph:  parse_ips got 172.17.8.100:6789
[ 1005.040145] libceph:  got string token 6 val admin
[ 1005.040862] rbd:  rbd_client_create:
[ 1005.041464] libceph:  ceph_messenger_init ffff88002be91088
[ 1005.042053] libceph:  init
[ 1005.042731] libceph:  auth_init name 'admin'
[ 1005.043180] libceph:  auth_init name admin
[ 1005.044288] libceph:  ceph_msg_new ffff88002c1f3000 front 20
[ 1005.044809] libceph:  ceph_msg_new ffff88002c1f30f8 front 96
[ 1005.045215] libceph:  ceph_msg_new ffff88002c1f31f0 front 4096
[ 1005.045496] libceph:  ceph_msg_new ffff88002c1f32e8 front 4096
[ 1005.046001] libceph:  con_init ffff88002be912c8
[ 1005.046258] libceph:  con_sock_state_init con ffff88002be912c8 sock 0 -> 1
[ 1005.046839] libceph:  init
[ 1005.047249] libceph:  msgpool osd_op init
[ 1005.048392] libceph:  ceph_msg_new ffff88002c1f3d90 front 4096
[ 1005.048954] libceph:  msgpool_alloc (null) ffff88002c1f3d90
[ 1005.049465] libceph:  ceph_msg_new ffff88002c1f3e88 front 4096
[ 1005.049971] libceph:  msgpool_alloc (null) ffff88002c1f3e88
[ 1005.050415] libceph:  ceph_msg_new ffff88002c1f33e0 front 4096
[ 1005.050965] libceph:  msgpool_alloc (null) ffff88002c1f33e0
[ 1005.051422] libceph:  ceph_msg_new ffff88002c1f34d8 front 4096
[ 1005.051928] libceph:  msgpool_alloc (null) ffff88002c1f34d8
[ 1005.052418] libceph:  ceph_msg_new ffff88002c1f35d0 front 4096
[ 1005.052918] libceph:  msgpool_alloc (null) ffff88002c1f35d0
[ 1005.053418] libceph:  ceph_msg_new ffff88002c1f36c8 front 4096
[ 1005.053928] libceph:  msgpool_alloc (null) ffff88002c1f36c8
[ 1005.054405] libceph:  ceph_msg_new ffff88002c1f37c0 front 4096
[ 1005.054905] libceph:  msgpool_alloc (null) ffff88002c1f37c0
[ 1005.055407] libceph:  ceph_msg_new ffff88002c1f38b8 front 4096
[ 1005.055935] libceph:  msgpool_alloc (null) ffff88002c1f38b8
[ 1005.056411] libceph:  ceph_msg_new ffff88002c1f39b0 front 4096
[ 1005.056913] libceph:  msgpool_alloc (null) ffff88002c1f39b0
[ 1005.057407] libceph:  ceph_msg_new ffff88002c1f3aa8 front 4096
[ 1005.057935] libceph:  msgpool_alloc (null) ffff88002c1f3aa8
[ 1005.059249] libceph:  msgpool osd_op_reply init
[ 1005.059894] libceph:  ceph_msg_new ffff88002c1f3ba0 front 512
[ 1005.060373] libceph:  msgpool_alloc (null) ffff88002c1f3ba0
[ 1005.060975] libceph:  ceph_msg_new ffff88002c1f3c98 front 512
[ 1005.061655] libceph:  msgpool_alloc (null) ffff88002c1f3c98
[ 1005.062239] libceph:  ceph_msg_new ffff88002bdf8000 front 512
[ 1005.062622] libceph:  msgpool_alloc (null) ffff88002bdf8000
[ 1005.063158] libceph:  ceph_msg_new ffff88002bdf80f8 front 512
[ 1005.064212] libceph:  msgpool_alloc (null) ffff88002bdf80f8
[ 1005.064851] libceph:  ceph_msg_new ffff88002bdf81f0 front 512
[ 1005.065390] libceph:  msgpool_alloc (null) ffff88002bdf81f0
[ 1005.065955] libceph:  ceph_msg_new ffff88002bdf82e8 front 512
[ 1005.066373] libceph:  msgpool_alloc (null) ffff88002bdf82e8
[ 1005.066902] libceph:  ceph_msg_new ffff88002bdf8ba0 front 512
[ 1005.067377] libceph:  msgpool_alloc (null) ffff88002bdf8ba0
[ 1005.067942] libceph:  ceph_msg_new ffff88002bdf8c98 front 512
[ 1005.068373] libceph:  msgpool_alloc (null) ffff88002bdf8c98
[ 1005.068901] libceph:  ceph_msg_new ffff88002bdf8d90 front 512
[ 1005.069408] libceph:  msgpool_alloc (null) ffff88002bdf8d90
[ 1005.069932] libceph:  ceph_msg_new ffff88002bdf8e88 front 512
[ 1005.070408] libceph:  msgpool_alloc (null) ffff88002bdf8e88
[ 1005.070985] libceph:  open_session start
[ 1005.071229] libceph:  open_session num=1 r=62 -> mon0
[ 1005.071548] libceph:  open_session mon0 opening
[ 1005.071983] libceph:  con_open ffff88002be912c8 172.17.8.100:6789
[ 1005.073224] libceph:  queue_con_delay ffff88002be912c8 0
[ 1005.073749] libceph:  con_work: con ffff88002be912c8 PREOPEN
[ 1005.073993] libceph:  try_read start on ffff88002be912c8 state 2
[ 1005.074559] libceph:  try_write start ffff88002be912c8 state 2
[ 1005.075064] libceph:  try_write out_kvec_bytes 0
[ 1005.075353] libceph:  prepare_read_banner ffff88002be912c8
[ 1005.075839] libceph:  try_write initiating connect on ffff88002be912c8 new state 3
[ 1005.076669] libceph:  connect 172.17.8.100:6789
[ 1005.077105] libceph:  con_sock_state_connecting con ffff88002be912c8 sock 1 -> 2
[ 1005.078948] libceph:  auth_build_hello
[ 1005.079495] libceph:  ceph_sock_state_change ffff88002be912c8 state = 3 sk_state = 1
[ 1005.080338] libceph:  ceph_sock_state_change TCP_ESTABLISHED
[ 1005.080477] libceph:  con_sock_state_connected con ffff88002be912c8 sock 2 -> 3
[ 1005.081196] libceph:  queue_con_delay ffff88002be912c8 0
[ 1005.082173] libceph:  connect 172.17.8.100:6789 EINPROGRESS sk_state = 1
[ 1005.083397] libceph:  write_partial_kvec ffff88002be912c8 145 left
[ 1005.084017] libceph:  ceph_sock_data_ready on ffff88002be912c8 state = 3, queueing work
[ 1005.084646] libceph:  queue_con_delay ffff88002be912c8 - already queued
[ 1005.085544] libceph:  ceph_sock_data_ready on ffff88002be912c8 state = 3, queueing work
[ 1005.086424] libceph:  queue_con_delay ffff88002be912c8 - already queued
[ 1005.087209] libceph:  write_partial_kvec ffff88002be912c8 0 left in 0 kvecs ret = 1
[ 1005.087992] libceph:  try_write nothing else to write.
[ 1005.088345] libceph:  try_write done on ffff88002be912c8 ret 0
[ 1005.089148] libceph:  ----- ffff88002c1f32e8 to mon0 17=auth len 60+0+0 -----
[ 1005.089417] libceph:  queue_con_delay ffff88002be912c8 - already queued
[ 1005.089959] libceph:  __schedule_delayed after 10000
[ 1005.090980] libceph:  mount waiting for mon_map
[ 1005.091373] libceph:  try_read start on ffff88002be912c8 state 3
[ 1005.091925] libceph:  try_read tag 1 in_base_pos 0
[ 1005.092335] libceph:  try_read connecting
[ 1005.092759] libceph:  read_partial_banner ffff88002be912c8 at 0
[ 1005.093267] libceph:  process_banner on ffff88002be912c8
[ 1005.093390] libceph:  prepare_write_connect ffff88002be912c8 cseq=0 gseq=1 proto=15
[ 1005.094167] libceph:  prepare_read_connect ffff88002be912c8
[ 1005.095332] libceph:  try_read done on ffff88002be912c8 ret 0
[ 1005.095879] libceph:  try_write start ffff88002be912c8 state 4
[ 1005.096404] libceph:  try_write out_kvec_bytes 33
[ 1005.096843] libceph:  write_partial_kvec ffff88002be912c8 33 left
[ 1005.097812] libceph:  write_partial_kvec ffff88002be912c8 0 left in 0 kvecs ret = 1
[ 1005.098103] libceph:  try_write nothing else to write.
[ 1005.098564] libceph:  try_write done on ffff88002be912c8 ret 0
[ 1005.099094] libceph:  ceph_sock_data_ready on ffff88002be912c8 state = 4, queueing work
[ 1005.100414] libceph:  queue_con_delay ffff88002be912c8 0
[ 1005.101446] libceph:  try_read start on ffff88002be912c8 state 4
[ 1005.101977] libceph:  try_read tag 1 in_base_pos 0
[ 1005.102351] libceph:  try_read negotiating
[ 1005.103269] libceph:  read_partial_connect ffff88002be912c8 at 0
[ 1005.103974] libceph:  read_partial_connect ffff88002be912c8 tag 1, con_seq = 1, g_seq = 22
[ 1005.104657] libceph:  process_connect on ffff88002be912c8 tag 1
[ 1005.105213] libceph:  process_connect got READY gseq 22 cseq 1 (1)
[ 1005.106046] libceph:  prepare_read_tag ffff88002be912c8
[ 1005.106597] libceph:  try_read start on ffff88002be912c8 state 5
[ 1005.107125] libceph:  try_read tag 1 in_base_pos 0
[ 1005.108116] libceph:  try_read done on ffff88002be912c8 ret 0
[ 1005.109321] libceph:  try_write start ffff88002be912c8 state 5
[ 1005.109993] libceph:  try_write out_kvec_bytes 0
[ 1005.110650] libceph:  prepare_write_message ffff88002c1f32e8 seq 1 type 17 len 60+0+0
[ 1005.111509] libceph:  prepare_write_message front_crc 3679117932 middle_crc 0
[ 1005.112414] libceph:  prepare_write_message_footer ffff88002be912c8
[ 1005.113286] libceph:  try_write out_kvec_bytes 127
[ 1005.113811] libceph:  write_partial_kvec ffff88002be912c8 127 left
[ 1005.114345] libceph:  write_partial_kvec ffff88002be912c8 0 left in 0 kvecs ret = 1
[ 1005.115032] libceph:  try_write nothing else to write.
[ 1005.115365] libceph:  try_write done on ffff88002be912c8 ret 0
[ 1005.116346] libceph:  ceph_sock_data_ready on ffff88002be912c8 state = 5, queueing work
[ 1005.117642] libceph:  queue_con_delay ffff88002be912c8 0
[ 1005.118058] libceph:  try_read start on ffff88002be912c8 state 5
[ 1005.118585] libceph:  try_read tag 1 in_base_pos 0
[ 1005.119020] libceph:  try_read got tag 8
[ 1005.119298] libceph:  prepare_read_ack ffff88002be912c8
[ 1005.119786] libceph:  got ack for seq 1 type 17 at ffff88002c1f32e8
[ 1005.120376] libceph:  prepare_read_tag ffff88002be912c8
[ 1005.120870] libceph:  try_read start on ffff88002be912c8 state 5
[ 1005.121364] libceph:  try_read tag 1 in_base_pos 0
[ 1005.121827] libceph:  try_read got tag 7
[ 1005.122492] libceph:  prepare_read_message ffff88002be912c8
[ 1005.123002] libceph:  read_partial_message con ffff88002be912c8 msg           (null)
[ 1005.124003] libceph:  ceph_sock_data_ready on ffff88002be912c8 state = 5, queueing work
[ 1005.125311] libceph:  queue_con_delay ffff88002be912c8 0
[ 1005.126473] libceph:  got hdr type 4 front 481 data 0
[ 1005.126596] libceph:  ceph_msg_new ffff88002bdf83e0 front 481
[ 1005.127047] libceph:  read_partial_message got msg ffff88002bdf83e0 481 (1798859476) + 0 (0) + 0 (0)
[ 1005.128372] libceph:  ===== ffff88002bdf83e0 1 from mon0 4=mon_map len 481+0 (1798859476 0 0) =====
[ 1005.129190] libceph:  have_debugfs_info fsid 0 globalid 0
[ 1005.129674] libceph:  handle_monmap
[ 1005.130045] libceph:  monmap_decode ffff88002bcb7c04 ffff88002bcb7de1 len 477
[ 1005.130483] libceph:  monmap_decode epoch 3, num_mon 3
[ 1005.130974] libceph:  monmap_decode  mon0 is 172.17.8.100:6789
[ 1005.132045] libceph:  monmap_decode  mon1 is 172.17.8.101:6789
[ 1005.132545] libceph:  monmap_decode  mon2 is 172.17.8.102:6789
[ 1005.133037] libceph:  have_debugfs_info fsid 1 globalid 0
[ 1005.133387] libceph:  ceph_msg_put last one on ffff88002bdf83e0
[ 1005.133904] libceph:  msg_kfree ffff88002bdf83e0
[ 1005.134514] libceph:  prepare_read_tag ffff88002be912c8
[ 1005.134991] libceph:  try_read start on ffff88002be912c8 state 5
[ 1005.135378] libceph:  try_read tag 1 in_base_pos 0
[ 1005.135896] libceph:  try_read got tag 7
[ 1005.136336] libceph:  prepare_read_message ffff88002be912c8
[ 1005.136821] libceph:  read_partial_message con ffff88002be912c8 msg           (null)
[ 1005.137719] libceph:  got hdr type 18 front 33 data 0
[ 1005.138221] libceph:  read_partial_message got msg ffff88002c1f31f0 33 (483680619) + 0 (0) + 0 (0)
[ 1005.139710] libceph:  ===== ffff88002c1f31f0 2 from mon0 18=auth_reply len 33+0 (483680619 0 0) =====
[ 1005.140608] libceph:  have_debugfs_info fsid 1 globalid 0
[ 1005.141158] libceph:  handle_auth_reply ffff88002be90000 ffff88002be90021
[ 1005.141392] libceph:   result 0 '' gid 4233 len 9
[ 1005.141885] libceph:   set global_id 0 -> 4233
[ 1005.142509] libceph:  ceph_x_init ffff880076eb7b80
[ 1005.142998] libceph: no secret set (for auth_x protocol)
[ 1005.143340] libceph: error -22 on auth protocol 2 init
[ 1005.143790] libceph:  have_debugfs_info fsid 1 globalid 4233
[ 1005.144357] libceph: client4233 fsid 2eeeae05-c64d-4648-8212-cfea0bc0a4d1
[ 1005.144892] libceph:  ceph_debugfs_client_init ffff88002be91000 2eeeae05-c64d-4648-8212-cfea0bc0a4d1.client4233
[ 1005.146403] libceph:  destroy_client ffff88002be91000
[ 1005.146935] libceph:  remove_all_osds ffff88002be91768
[ 1005.147361] libceph:  msgpool osd_op destroy
[ 1005.147765] libceph:  msgpool_release osd_op ffff88002c1f3aa8
[ 1005.148266] libceph:  ceph_msg_put last one on ffff88002c1f3aa8
[ 1005.148433] libceph:  msg_kfree ffff88002c1f3aa8
[ 1005.148856] libceph:  msgpool_release osd_op ffff88002c1f39b0
[ 1005.149372] libceph:  ceph_msg_put last one on ffff88002c1f39b0
[ 1005.149892] libceph:  msg_kfree ffff88002c1f39b0
[ 1005.150318] libceph:  msgpool_release osd_op ffff88002c1f38b8
[ 1005.150798] libceph:  ceph_msg_put last one on ffff88002c1f38b8
[ 1005.151358] libceph:  msg_kfree ffff88002c1f38b8
[ 1005.151823] libceph:  msgpool_release osd_op ffff88002c1f37c0
[ 1005.152481] libceph:  ceph_msg_put last one on ffff88002c1f37c0
[ 1005.152968] libceph:  msg_kfree ffff88002c1f37c0
[ 1005.153336] libceph:  msgpool_release osd_op ffff88002c1f36c8
[ 1005.153872] libceph:  ceph_msg_put last one on ffff88002c1f36c8
[ 1005.154352] libceph:  msg_kfree ffff88002c1f36c8
[ 1005.154771] libceph:  msgpool_release osd_op ffff88002c1f35d0
[ 1005.155246] libceph:  ceph_msg_put last one on ffff88002c1f35d0
[ 1005.156255] libceph:  msg_kfree ffff88002c1f35d0
[ 1005.156744] libceph:  msgpool_release osd_op ffff88002c1f34d8
[ 1005.156993] libceph:  ceph_msg_put last one on ffff88002c1f34d8
[ 1005.157518] libceph:  msg_kfree ffff88002c1f34d8
[ 1005.157937] libceph:  msgpool_release osd_op ffff88002c1f33e0
[ 1005.158348] libceph:  ceph_msg_put last one on ffff88002c1f33e0
[ 1005.158930] libceph:  msg_kfree ffff88002c1f33e0
[ 1005.159313] libceph:  msgpool_release osd_op ffff88002c1f3e88
[ 1005.159790] libceph:  ceph_msg_put last one on ffff88002c1f3e88
[ 1005.160355] libceph:  msg_kfree ffff88002c1f3e88
[ 1005.160785] libceph:  msgpool_release osd_op ffff88002c1f3d90
[ 1005.161281] libceph:  ceph_msg_put last one on ffff88002c1f3d90
[ 1005.161968] libceph:  msg_kfree ffff88002c1f3d90
[ 1005.162366] libceph:  msgpool osd_op_reply destroy
[ 1005.162796] libceph:  msgpool_release osd_op_reply ffff88002bdf8e88
[ 1005.163432] libceph:  ceph_msg_put last one on ffff88002bdf8e88
[ 1005.163921] libceph:  msg_kfree ffff88002bdf8e88
[ 1005.164320] libceph:  msgpool_release osd_op_reply ffff88002bdf8d90
[ 1005.164872] libceph:  ceph_msg_put last one on ffff88002bdf8d90
[ 1005.165404] libceph:  msg_kfree ffff88002bdf8d90
[ 1005.165892] libceph:  msgpool_release osd_op_reply ffff88002bdf8c98
[ 1005.166369] libceph:  ceph_msg_put last one on ffff88002bdf8c98
[ 1005.166860] libceph:  msg_kfree ffff88002bdf8c98
[ 1005.167284] libceph:  msgpool_release osd_op_reply ffff88002bdf8ba0
[ 1005.168417] libceph:  ceph_msg_put last one on ffff88002bdf8ba0
[ 1005.168975] libceph:  msg_kfree ffff88002bdf8ba0
[ 1005.169359] libceph:  msgpool_release osd_op_reply ffff88002bdf82e8
[ 1005.169866] libceph:  ceph_msg_put last one on ffff88002bdf82e8
[ 1005.170403] libceph:  msg_kfree ffff88002bdf82e8
[ 1005.170825] libceph:  msgpool_release osd_op_reply ffff88002bdf81f0
[ 1005.171344] libceph:  ceph_msg_put last one on ffff88002bdf81f0
[ 1005.171975] libceph:  msg_kfree ffff88002bdf81f0
[ 1005.172377] libceph:  msgpool_release osd_op_reply ffff88002bdf80f8
[ 1005.173031] libceph:  ceph_msg_put last one on ffff88002bdf80f8
[ 1005.173160] libceph:  msg_kfree ffff88002bdf80f8
[ 1005.173578] libceph:  msgpool_release osd_op_reply ffff88002bdf8000
[ 1005.174099] libceph:  ceph_msg_put last one on ffff88002bdf8000
[ 1005.175051] libceph:  msg_kfree ffff88002bdf8000
[ 1005.175470] libceph:  msgpool_release osd_op_reply ffff88002c1f3c98
[ 1005.175969] libceph:  ceph_msg_put last one on ffff88002c1f3c98
[ 1005.176349] libceph:  msg_kfree ffff88002c1f3c98
[ 1005.176812] libceph:  msgpool_release osd_op_reply ffff88002c1f3ba0
[ 1005.177393] libceph:  ceph_msg_put last one on ffff88002c1f3ba0
[ 1005.177879] libceph:  msg_kfree ffff88002c1f3ba0
[ 1005.178293] libceph:  stop
[ 1005.178775] libceph:  __close_session closing mon0
[ 1005.179204] libceph:  ceph_msg_revoke_incoming msg ffff88002c1f31f0 null con
[ 1005.180391] libceph:  ceph_msg_revoke_incoming msg ffff88002c1f3000 null con
[ 1005.180967] libceph:  con_close ffff88002be912c8 peer 172.17.8.100:6789
[ 1005.181464] libceph:  reset_connection ffff88002be912c8
[ 1005.181914] libceph:  con_close_socket on ffff88002be912c8 sock ffff880028c3ac80
[ 1005.185305] libceph:  ceph_sock_state_change ffff88002be912c8 state = 1 sk_state = 4
[ 1005.186046] libceph:  ceph_sock_state_change ffff88002be912c8 state = 1 sk_state = 5
[ 1005.186602] libceph:  ceph_sock_state_change ffff88002be912c8 state = 1 sk_state = 7
[ 1005.187358] libceph:  ceph_sock_state_change TCP_CLOSE
[ 1005.187976] libceph:  ceph_sock_state_change TCP_CLOSE_WAIT
[ 1005.188557] libceph:  con_sock_state_closing con ffff88002be912c8 sock 3 -> 4
[ 1005.189108] libceph:  queue_con_delay ffff88002be912c8 0
[ 1005.189333] libceph:  ceph_sock_state_change ffff88002be912c8 state = 1 sk_state = 7
[ 1005.190119] libceph:  ceph_sock_state_change TCP_CLOSE
[ 1005.190337] libceph:  ceph_sock_state_change TCP_CLOSE_WAIT
[ 1005.190805] libceph:  con_sock_state_closing con ffff88002be912c8 sock 4 -> 4
[ 1005.192205] libceph:  queue_con_delay ffff88002be912c8 - already queued
[ 1005.193423] libceph:  con_sock_state_closed con ffff88002be912c8 sock 4 -> 1
[ 1005.194016] libceph:  try_read start on ffff88002be912c8 state 1
[ 1005.194383] libceph:  try_write start ffff88002be912c8 state 1
[ 1005.194876] libceph:  try_write out_kvec_bytes 0
[ 1005.195384] libceph:  try_write nothing else to write.
[ 1005.195840] libceph:  try_write done on ffff88002be912c8 ret 0
[ 1005.196504] libceph:  con_work: con ffff88002be912c8 CLOSED
[ 1005.197166] libceph:  auth_reset ffff880076eb7b80
[ 1005.197354] libceph:  auth_destroy ffff880076eb7b80
[ 1005.197841] libceph:  ceph_msg_put last one on ffff88002c1f32e8
[ 1005.198382] libceph:  msg_kfree ffff88002c1f32e8
[ 1005.198810] libceph:  ceph_msg_put last one on ffff88002c1f31f0
[ 1005.199336] libceph:  msg_kfree ffff88002c1f31f0
[ 1005.200401] libceph:  ceph_msg_put last one on ffff88002c1f30f8
[ 1005.200991] libceph:  msg_kfree ffff88002c1f30f8
[ 1005.201490] libceph:  ceph_msg_put last one on ffff88002c1f3000
[ 1005.202009] libceph:  msg_kfree ffff88002c1f3000
[ 1005.202112] libceph:  ceph_debugfs_client_cleanup ffff88002be91000
[ 1005.202854] libceph:  destroy_options ffff88002c3bb000
[ 1005.203372] libceph:  destroy_client ffff88002be91000 done
[ 1005.203881] rbd:  rbd_client_create: error -22
[ 1005.204539] rbd:  Error adding device 172.17.8.100:6789 name=admin rbd logs
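
In a trace this long the failure is easy to miss. A small filter like the one below (a sketch; the function name and the pattern are my own choice, not part of any Ceph tooling) pulls out the lines that mention an error or a missing secret. On the trace above it would keep the "no secret set (for auth_x protocol)" and "error -22 on auth protocol 2 init" lines at 1005.1429 and 1005.1433, plus the two rbd error lines at the end, which suggests the -22 on the host originates in the auth setup.

```shell
#!/bin/sh
# Read kernel log lines on stdin and keep only libceph/rbd lines
# that mention an error or a missing secret.
ceph_errors() {
    grep -E '(libceph|rbd): .*([Ee]rror|no secret)'
}
```

Typical use would be `dmesg | ceph_errors`.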

#5 Updated by Chris Armstrong about 7 years ago

Note that we only see the debug output when we're trying to write to the RBD bus directly on the host - from within the container, the only generated debug line is something like:

[ 2333.926518] rbd:  Error adding device 172.17.8.100:6789,172.17.8.101:6789,172.17.8.102:6789 name=admin,key=client.admin deis logs

#6 Updated by Ilya Dryomov about 7 years ago

  • Status changed from New to 12
  • Assignee set to Ilya Dryomov

#7 Updated by Ilya Dryomov about 7 years ago

The error

[ 2333.926518] rbd:  Error adding device 172.17.8.100:6789,172.17.8.101:6789,172.17.8.102:6789 name=admin,key=client.admin deis logs

when doing rbd map from inside the container occurs because containers run in network namespaces other than init_net, and libceph can't cope with that:

commit eea553c21fbfa486978c82525ee8256239d4f921
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Thu Jan 31 02:09:50 2013 -0800

    ceph: Only allow mounts in the initial network namespace

    Today ceph opens tcp sockets from a delayed work callback.  Delayed
    work happens from kernel threads which are always in the initial
    network namespace.   Therefore fail early if someone attempts
    to mount a ceph filesystem from something other than the initial
    network namespace.

Fixing this would require a significant amount of restructuring at the very least.
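
To check whether a given process is affected, its network namespace can be compared with that of the host's init process. A minimal sketch, with a hypothetical function name, meant to be run on the host as root (inside a container with its own PID namespace, /proc/1 is the container's own init, so the comparison would not be meaningful there):

```shell
#!/bin/sh
# Succeed if the two namespace links point at the same namespace.
# Each argument is a path such as /proc/<pid>/ns/net.
same_netns() {
    a=$(readlink "$1") || return 2
    b=$(readlink "$2") || return 2
    [ "$a" = "$b" ]
}
```

For example, `same_netns /proc/1/ns/net "/proc/$pid/ns/net"` succeeds only when the process with that PID shares the initial network namespace, which is the condition the commit above enforces.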

#8 Updated by Chris Armstrong about 7 years ago

Thanks for the update, Ilya! You actually gave me a hint as to a workaround - run the container with `--net host` so that the network context of the container is the same as the host's. Then, I had to bind-mount /sys so that it's not read-only within the container. I'm now running our container like this:

docker run -i -v /sys:/sys --net host 172.21.12.100:5000/deis/store-base:git-3d4ca8f /bin/bash

I just confirmed that I can indeed `rbd map` now!!

#9 Updated by Ilya Dryomov about 7 years ago

  • Status changed from 12 to Closed

Opened #9753.
