Bug #57059
ceph mds dump tree - root inode is not in cache
Description
Observed on Octopus 15.2.16; it probably affects any newer version as well.
It is not possible to dump stray buckets in MDS cache; ceph-user thread: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/G63US2VE3AQMSFCNO5TGO7NTBO57HDUC/
Part 1
In file https://github.com/ceph/ceph/blob/main/src/mds/MDSRank.cc:
3111 void MDSRank::command_dump_tree(const cmdmap_t &cmdmap, std::ostream &ss, Formatter *f)
3112 {
3113   std::string root;
3114   int64_t depth;
3115   cmd_getval(cmdmap, "root", root);
3116   if (root.empty()) {
3117     root = "/";
3118   }
3119   if (!cmd_getval(cmdmap, "depth", depth))
3120     depth = -1;
3121   std::lock_guard l(mds_lock);
3122   CInode *in = mdcache->cache_traverse(filepath(root.c_str()));
3123   if (!in) {
3124     ss << "root inode is not in cache";
3125     return;
3126   }
3127   f->open_array_section("inodes");
3128   mdcache->dump_tree(in, 0, depth, f);
3129   f->close_section();
3130 }
The error message on line 3124 is both misleading and unhelpful. It should be changed to something like:
ss << "inode for path '" << filepath(root.c_str()) << "' is not in cache";
to give an indication of what the command actually tries to find, which often is not the root inode.
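As a rough illustration of the message proposed above, the following standalone sketch formats the same text outside of Ceph. The helper name not_in_cache_message is hypothetical and takes a plain std::string in place of the filepath object, so it only shows the wording, not the actual patch.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, not part of Ceph: builds the proposed error text.
// In MDSRank::command_dump_tree the path would be derived from `root`;
// here a plain std::string stands in for it (an assumption for illustration).
std::string not_in_cache_message(const std::string& path) {
  std::ostringstream ss;
  ss << "inode for path '" << path << "' is not in cache";
  return ss.str();
}
```

Called with "~mds0/stray0", this yields a message that names the path that was looked up instead of claiming the root inode is missing.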
Part 2
Trying to dump a tree under a stray bucket fails for an unknown reason, while dumping any tree under "/" succeeds.
This command works:
[root@rit-tceph ~]# ceph tell mds.0 dump tree '/' | jq ".[] | .dirfrags |.[] | .path"
2022-08-07T17:25:34.430+0200 7fbcfbfff700 0 client.439291 ms_handle_reset on v2:10.41.24.14:6812/3943985176
2022-08-07T17:25:34.473+0200 7fbd017fa700 0 client.456018 ms_handle_reset on v2:10.41.24.14:6812/3943985176
"/data/blobs"
"/data"
""
However, this does not:
[root@rit-tceph ~]# ceph tell mds.0 dump tree '~mds0/stray0' | jq ".[] | .dirfrags |.[] | .path"
2022-08-07T17:27:16.623+0200 7fb294ff9700 0 client.439345 ms_handle_reset on v2:10.41.24.14:6812/3943985176
2022-08-07T17:27:16.665+0200 7fb295ffb700 0 client.456072 ms_handle_reset on v2:10.41.24.14:6812/3943985176
root inode is not in cache
The dir "~mds0/stray0" is in cache though (see the dump in https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/PDJWTUAL2GRM7DYVBQT6BQLOJGOFIE4O/).
For the problem with dumping the stray buckets, the modification requested in Part 1 would help a lot with debugging. It is possible that the "~" symbol is interpreted by the command parser, or it's something as simple as a UTF-whatever conversion that goes wrong. Knowing the exact contents of filepath(root.c_str()) at the time of failure would almost certainly give a good lead.
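One way to check the encoding hypothesis would be to log the raw bytes of the path string next to the error. The sketch below is a standalone assumption, not existing Ceph code: hex_bytes is a hypothetical helper that renders each byte as two hex digits, so an unexpected conversion of "~" (0x7e) would be visible immediately.

```cpp
#include <cstdio>
#include <string>

// Hypothetical debugging helper, not part of Ceph: renders every byte of a
// string as two lowercase hex digits, separated by single spaces.
std::string hex_bytes(const std::string& s) {
  std::string out;
  char buf[4];
  for (unsigned char c : s) {
    std::snprintf(buf, sizeof(buf), "%02x", c);
    if (!out.empty()) out += ' ';
    out += buf;
  }
  return out;
}
```

If the string reaching the MDS is intact, the output for "~mds0/stray0" starts with 7e 6d 64 73 30; anything else in the first position would point at the parser or a conversion step.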
Part 3
Whenever executing a "ceph tell mds. ..." command, these really annoying messages show up:
2022-08-07T17:27:16.623+0200 7fb294ff9700 0 client.439345 ms_handle_reset on v2:10.41.24.14:6812/3943985176
2022-08-07T17:27:16.665+0200 7fb295ffb700 0 client.456072 ms_handle_reset on v2:10.41.24.14:6812/3943985176
Do they indicate a problem? If not, would it be possible to stop printing them to the root console?
Updated by Laura Flores over 1 year ago
- Tags set to low-hanging-fruit
Updated by Laura Flores 10 months ago
- Tags changed from low-hanging-fruit to low-hanging-fruit, open-source-day
Updated by Laura Flores 8 months ago
- Tags changed from low-hanging-fruit, open-source-day to low-hanging-fruit
Updated by Laura Flores 7 months ago
Part 1 might be a good piece for beginners to work on.
Claiming for Grace Hopper Open Source Day.
Updated by Laura Flores 7 months ago
- Status changed from New to Fix Under Review