Bug #25141

CephVolumeClient: delay required after adding data pool to MDSMap

Added by Patrick Donnelly 5 months ago. Updated 3 months ago.

Status: Resolved
Priority: High
Category: Manila
Target version:
Start date: 07/27/2018
Due date:
% Done: 0%
Source: Q/A
Tags:
Backport: mimic,luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): VolumeClient
Labels (FS): task(easy)
Pull request ID:

Description

https://github.com/ceph/ceph/blob/726ca169ee23df2f5521ca0c9677614c6e3a2145/src/pybind/ceph_volume_client.py#L632-L636

After adding a data pool, the CephVolumeClient needs to wait a nominal amount of time for the MDS to receive the updated MDSMap. Until the new map arrives, the MDS rejects the ceph.dir.layout.pool setxattr with EINVAL ("invalid data pool"), as in the log below: the request fails at 23:04:32.497, and the MDSMap update (e 32) is only processed immediately afterward.

Example failure: http://pulpito.ceph.com/pdonnell-2018-07-27_22:41:15-fs-wip-pdonnell-testing-20180727.210741-distro-basic-smithi/2826013/

2018-07-27 23:04:32.497 7f67e0f5f700  7 mds.0.server dispatch_client_request client_request(client.24123:6 setxattr #0x10000000002 ceph.dir.layout.pool 2018-07-27 23:04:32.493787 caller_uid=0, caller_gid=0{}) v4
2018-07-27 23:04:32.497 7f67e0f5f700 10 mds.0.server rdlock_path_pin_ref request(client.24123:6 cr=0x55595bf0cc00) #0x10000000002
2018-07-27 23:04:32.497 7f67e0f5f700  7 mds.0.cache traverse: opening base ino 0x10000000002 snap head
2018-07-27 23:04:32.497 7f67e0f5f700 10  mds.0.cache.snaprealm(0x10000000002 seq 1 0x55595bf69180) have_past_parents_open [1,head]
2018-07-27 23:04:32.497 7f67e0f5f700 10 mds.0.cache path_traverse finish on snapid head
2018-07-27 23:04:32.497 7f67e0f5f700 10 mds.0.server ref is [inode 0x10000000002 [...2,head] /volumes/grpid/volid/ auth v4 snaprealm=0x55595bf69180 dirtyparent f() n(v0 rc2018-07-27 23:04:31.762034 1=0+1) (iauth excl) (inest lock) (ifile excl) (ixattr excl) (iversion lock) caps={24123=pAsxLsXsxFsx/-@2},l=24123 | request=0 lock=0 dirfrag=1 caps=1 dirtyparent=1 dirty=1 authpin=0 0x55595bde5800]
2018-07-27 23:04:32.497 7f67e0f5f700 10 mds.0.cache.ino(0x10000000002) auth_pin by 0x55595beced00 on [inode 0x10000000002 [...2,head] /volumes/grpid/volid/ auth v4 ap=1+0 snaprealm=0x55595bf69180 dirtyparent f() n(v0 rc2018-07-27 23:04:31.762034 1=0+1) (iauth excl) (inest lock) (ifile excl) (ixattr excl) (iversion lock) caps={24123=pAsxLsXsxFsx/-@2},l=24123 | request=0 lock=0 dirfrag=1 caps=1 dirtyparent=1 dirty=1 authpin=1 0x55595bde5800] now 1+0
2018-07-27 23:04:32.497 7f67e0f5f700 15 mds.0.cache.dir(0x10000000001) adjust_nested_auth_pins 1/1 on [dir 0x10000000001 /volumes/grpid/ [2,head] auth v=5 cv=0/0 ap=0+1+1 state=1610874881|complete f(v0 m2018-07-27 23:04:31.760497 1=0+1) n(v0 rc2018-07-27 23:04:31.762034 1=0+1) hs=1+0,ss=0+0 dirty=1 | child=1 dirty=1 authpin=0 0x55595becd400] by 0x55595bde5800 count now 0 + 1
2018-07-27 23:04:32.497 7f67e0f5f700 15 mds.0.cache.dir(0x10000000000) adjust_nested_auth_pins 1/0 on [dir 0x10000000000 /volumes/ [2,head] auth v=7 cv=0/0 ap=0+0+1 state=1610874881|complete f(v0 m2018-07-27 23:04:31.758957 1=0+1) n(v0 rc2018-07-27 23:04:31.762034 2=0+2)/n(v0 rc2018-07-27 23:04:31.758957 1=0+1) hs=1+0,ss=0+0 dirty=1 | child=1 dirty=1 authpin=0 0x55595becca00] by 0x55595bde5800 count now 0 + 1
2018-07-27 23:04:32.497 7f67e0f5f700 15 mds.0.cache.dir(0x1) adjust_nested_auth_pins 1/0 on [dir 0x1 / [2,head] auth v=5 cv=1/1 REP dir_auth=0 ap=0+0+1 state=1610874881|complete f(v0 m2018-07-27 23:04:31.755604 1=0+1) n(v0 rc2018-07-27 23:04:31.758957 2=0+2)/n() hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 dirty=1 authpin=0 0x55595bc3f700] by 0x55595bde5800 count now 0 + 1
2018-07-27 23:04:32.497 7f67e0f5f700 10 mds.0.server handle_set_vxattr ceph.dir.layout.pool val 14 bytes on [inode 0x10000000002 [...2,head] /volumes/grpid/volid/ auth v4 ap=1+0 snaprealm=0x55595bf69180 dirtyparent f() n(v0 rc2018-07-27 23:04:31.762034 1=0+1) (iauth excl) (inest lock) (ifile excl) (ixattr excl) (iversion lock) caps={24123=pAsxLsXsxFsx/-@2},l=24123 | request=1 lock=0 dirfrag=1 caps=1 dirtyparent=1 dirty=1 authpin=1 0x55595bde5800]
2018-07-27 23:04:32.497 7f67e0f5f700 20 Session check_access path /volumes/grpid/volid
2018-07-27 23:04:32.497 7f67e0f5f700 10 MDSAuthCap is_capable inode(path /volumes/grpid/volid owner 0:0 mode 040755) by caller 0:0 mask 64 new 53:0 cap: MDSAuthCaps[allow *]
2018-07-27 23:04:32.497 7f67e0f5f700 20 mds.0.server parse_layout_vxattr name layout.pool value 'fsvolume_volid'
2018-07-27 23:04:32.497 7f67e0f5f700 10 mds.0.server  invalid data pool 10
2018-07-27 23:04:32.497 7f67e0f5f700  7 mds.0.server reply_client_request -22 ((22) Invalid argument) client_request(client.24123:6 setxattr #0x10000000002 ceph.dir.layout.pool 2018-07-27 23:04:32.493787 caller_uid=0, caller_gid=0{}) v4
2018-07-27 23:04:32.497 7f67e0f5f700 10 mds.0.server apply_allocated_inos 0x0 / [] / 0x0
2018-07-27 23:04:32.497 7f67e0f5f700 20 mds.0.server lat 0.000297
2018-07-27 23:04:32.497 7f67e0f5f700  1 -- 172.21.15.169:6816/1793859549 --> 172.21.15.104:0/3634343627 -- client_reply(???:6 = -22 (22) Invalid argument safe) v1 -- ?+0 0x55595bf40b00 con 0x55595be59000
2018-07-27 23:04:32.497 7f67e0f5f700  7 mds.0.cache request_finish request(client.24123:6 cr=0x55595bf0cc00)
2018-07-27 23:04:32.497 7f67e0f5f700 15 mds.0.cache request_cleanup request(client.24123:6 cr=0x55595bf0cc00)
2018-07-27 23:04:32.497 7f67e0f5f700 10 mds.0.cache.ino(0x10000000002) auth_unpin by 0x55595beced00 on [inode 0x10000000002 [...2,head] /volumes/grpid/volid/ auth v4 snaprealm=0x55595bf69180 dirtyparent f() n(v0 rc2018-07-27 23:04:31.762034 1=0+1) (iauth excl) (inest lock) (ifile excl) (ixattr excl) (iversion lock) caps={24123=pAsxLsXsxFsx/-@2},l=24123 | request=1 lock=0 dirfrag=1 caps=1 dirtyparent=1 dirty=1 authpin=0 0x55595bde5800] now 0+0
2018-07-27 23:04:32.497 7f67e0f5f700 15 mds.0.cache.dir(0x10000000001) adjust_nested_auth_pins -1/-1 on [dir 0x10000000001 /volumes/grpid/ [2,head] auth v=5 cv=0/0 state=1610874881|complete f(v0 m2018-07-27 23:04:31.760497 1=0+1) n(v0 rc2018-07-27 23:04:31.762034 1=0+1) hs=1+0,ss=0+0 dirty=1 | child=1 dirty=1 authpin=0 0x55595becd400] by 0x55595beced00 count now 0 + 0
2018-07-27 23:04:32.497 7f67e0f5f700 15 mds.0.cache.dir(0x10000000000) adjust_nested_auth_pins -1/0 on [dir 0x10000000000 /volumes/ [2,head] auth v=7 cv=0/0 state=1610874881|complete f(v0 m2018-07-27 23:04:31.758957 1=0+1) n(v0 rc2018-07-27 23:04:31.762034 2=0+2)/n(v0 rc2018-07-27 23:04:31.758957 1=0+1) hs=1+0,ss=0+0 dirty=1 | child=1 dirty=1 authpin=0 0x55595becca00] by 0x55595beced00 count now 0 + 0
2018-07-27 23:04:32.497 7f67e0f5f700 15 mds.0.cache.dir(0x1) adjust_nested_auth_pins -1/0 on [dir 0x1 / [2,head] auth v=5 cv=1/1 REP dir_auth=0 state=1610874881|complete f(v0 m2018-07-27 23:04:31.755604 1=0+1) n(v0 rc2018-07-27 23:04:31.758957 2=0+2)/n() hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 dirty=1 authpin=0 0x55595bc3f700] by 0x55595beced00 count now 0 + 0
2018-07-27 23:04:32.497 7f67e0f5f700  1 -- 172.21.15.169:6816/1793859549 <== mon.1 172.21.15.169:6789/0 7 ==== mdsmap(e 32) v1 ==== 655+0+0 (1163347864 0 0) 0x55595bf6b8c0 con 0x55595be59500
2018-07-27 23:04:32.497 7f67e0f5f700  1 mds.a-s Updating MDS map to version 32 from mon.1
2018-07-27 23:04:32.497 7f67e0f5f700 10 mds.a-s      my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
2018-07-27 23:04:32.497 7f67e0f5f700 10 mds.a-s  mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
2018-07-27 23:04:32.497 7f67e0f5f700 10 mds.a-s map says I am 172.21.15.169:6816/1793859549 mds.0.30 state up:active
2018-07-27 23:04:32.497 7f67e0f5f700 10 mds.a-s handle_mds_map: handling map as rank 0
2018-07-27 23:04:32.497 7f67e0f5f700  4 mds.0.30 set_osd_epoch_barrier: epoch=52
2018-07-27 23:04:32.503 7f67e0f5f700  1 -- 172.21.15.169:6816/1793859549 <== mon.1 172.21.15.169:6789/0 8 ==== osd_map(53..53 src has 1..53) v4 ==== 595+0+0 (2177951935 0 0) 0x55595bf7e280 con 0x55595be59500
2018-07-27 23:04:32.503 7f67e0f5f700 10 mds.0.snap check_osd_map need_to_purge={}
2018-07-27 23:04:32.503 7f67e0f5f700  7 mds.0.server operator(): full = 0 epoch = 53
2018-07-27 23:04:32.503 7f67e0f5f700  4 mds.0.30 handle_osd_map epoch 53, 0 new blacklist entries

From /ceph/teuthology-archive/pdonnell-2018-07-27_22:41:15-fs-wip-pdonnell-testing-20180727.210741-distro-basic-smithi/2826013/remote/smithi169/log/ceph-mds.a-s.log.gz
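The race visible above (setxattr rejected, then the MDSMap update to e 32 processed moments later) suggests a client-side workaround: retry the layout setxattr until the MDS accepts it. A minimal sketch, not the committed fix; `set_pool_layout_with_retry` is a hypothetical helper, and the `fs` object is assumed to expose a cephfs.LibCephFS-style `setxattr()`:

```python
import errno
import time


def set_pool_layout_with_retry(fs, dir_path, pool_name, attempts=10, delay=0.5):
    """Set ceph.dir.layout.pool, retrying while the MDS map is stale.

    Hypothetical workaround sketch (not the committed fix): the MDS
    returns EINVAL ("invalid data pool") until it has processed the
    MDSMap containing the new pool, so retry only EINVAL with a short
    sleep between attempts.
    """
    for attempt in range(attempts):
        try:
            fs.setxattr(dir_path, "ceph.dir.layout.pool",
                        pool_name.encode("utf-8"), 0)
            return
        except OSError as e:
            # Only EINVAL is retried; any other error is re-raised,
            # as is EINVAL on the final attempt.
            if e.errno != errno.EINVAL or attempt == attempts - 1:
                raise
            time.sleep(delay)
```

A fixed sleep alone would also mask the race, but a bounded retry keeps the common case fast while still failing loudly if the pool genuinely never becomes valid.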


Related issues

Related to fs - Feature #23695: VolumeClient: allow ceph_volume_client to create 'volumes' without namespace isolation Resolved 04/12/2018
Copied to fs - Backport #25205: luminous: CephVolumeClient: delay required after adding data pool to MDSMap Resolved
Copied to fs - Backport #25206: mimic: CephVolumeClient: delay required after adding data pool to MDSMap Resolved

History

#1 Updated by Patrick Donnelly 5 months ago

  • Description updated (diff)
  • Status changed from New to In Progress

#2 Updated by Patrick Donnelly 5 months ago

  • Related to Feature #23695: VolumeClient: allow ceph_volume_client to create 'volumes' without namespace isolation added

#3 Updated by Patrick Donnelly 5 months ago

  • Status changed from In Progress to Need Review

#4 Updated by Patrick Donnelly 5 months ago

  • Status changed from Need Review to Pending Backport

#5 Updated by Nathan Cutler 4 months ago

  • Copied to Backport #25205: luminous: CephVolumeClient: delay required after adding data pool to MDSMap added

#6 Updated by Nathan Cutler 4 months ago

  • Copied to Backport #25206: mimic: CephVolumeClient: delay required after adding data pool to MDSMap added

#7 Updated by Nathan Cutler 3 months ago

  • Status changed from Pending Backport to Resolved
