Bug #57796
open
after rebalance of pool via pgupmap balancer, continuous issues in monitor log
Description
The pgupmap balancer was not balancing well, and after setting mgr/balancer/upmap_max_deviation to 1 (ceph config-key ...), the balancer kicked in and moved things around, resulting in a nicely balanced set of osds and pgs. Awesome.
However, it appears that since the rebalance, the monitor log (/var/log/ceph/ceph-mon.servername.log) is filling up: every three minutes it gets one line for every OSD that was affected by the rebalance. Those lines are of the following form:
2022-10-07T17:10:39.619+0000 7f7c2786d700 1 verify_upmap unable to get parent of osd.497, skipping for now
So, if the rebalance affected around 100 OSDs, there are around 100 lines of this form in my monitor log every 3 minutes. The pool in question is an EC pool.
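To gauge how many OSDs are actually involved, the repeated lines can be reduced to a set of distinct OSD ids. The following is a minimal C++ sketch, not part of any Ceph tooling: the function names parse_verify_upmap_osd and affected_osds are hypothetical, and the pattern is based only on the single sample line shown above.

```cpp
#include <cassert>
#include <regex>
#include <set>
#include <string>
#include <vector>

// Extract the OSD id from a line of the form
//   "... verify_upmap unable to get parent of osd.N, skipping for now".
// Returns -1 when the line does not match (hypothetical helper).
int parse_verify_upmap_osd(const std::string& line) {
  static const std::regex re(
      R"(verify_upmap unable to get parent of osd\.(\d+))");
  std::smatch m;
  if (std::regex_search(line, m, re)) {
    return std::stoi(m[1].str());
  }
  return -1;
}

// Collect the distinct OSD ids mentioned across a batch of log lines.
std::set<int> affected_osds(const std::vector<std::string>& lines) {
  std::set<int> osds;
  for (const auto& line : lines) {
    int osd = parse_verify_upmap_osd(line);
    if (osd >= 0) {
      osds.insert(osd);
    }
  }
  return osds;
}
```

Feeding the lines of one three-minute burst into affected_osds gives the number of distinct OSDs the monitor is complaining about, which can then be compared with the set of OSDs in the EC pool.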
I know the rebalance creates pg upmap items. But why does this warning/error happen, and is it a problem?
The pool with these osds (only 1) uses a custom crush root of the form:
root mycustomroot
    rack rack1
        pod pod1
            host host1
            host host2
        pod pod2
            host host3
            host host4
    rack rack2
        pod pod3
            host host5
            ...
In typing this up, I noticed that the hosts are also part of the 'default' crush root that no pool uses. Perhaps that is the issue...? Please advise.
Updated by Chris Durham over 1 year ago
preformatting the crush info so it shows up properly ...
root mycustomroot
    rack rack1
        pod pod1
            host host1
            host host2
        pod pod2
            host host3
            host host4
    rack rack2
        pod pod3
            host host5
            ...
Updated by Chris Durham over 1 year ago
Note that the balancer balanced a replicated pool, using its own custom crush root too. The hosts in that pool (not in the ec pool affected) are also in the default crush root, but none of the verify_upmap log entries complain about osds in that pool.
Updated by Chris Durham over 1 year ago
I removed the hosts holding the osds reported by verify_upmap from the default root that no one uses, and the log entries continue.
Updated by Radoslaw Zarzynski over 1 year ago
- Status changed from New to Need More Info
Thanks for the report! The log message comes from here:
int CrushWrapper::verify_upmap(CephContext *cct,
                               int rule_id,
                               int pool_size,
                               const vector<int>& up)
{
  // ...
      {
        int numrep = curstep->arg1;
        int type = curstep->arg2;
        if (numrep <= 0)
          numrep += pool_size;
        type_stack.emplace(type, numrep);
        if (type == 0) // osd
          break;
        map<int, set<int>> osds_by_parent; // parent_of_desired_type -> osds
        for (auto osd : up) {
          auto parent = get_parent_of_type(osd, type, rule_id);
          if (parent < 0) {
            osds_by_parent[parent].insert(osd);
          } else {
            ldout(cct, 1) << __func__ << " unable to get parent of osd." << osd
                          << ", skipping for now"
                          << dendl;
          }
        }
It looks like verify_upmap was looking for the parents of those OSDs (which should always be CRUSH buckets, i.e. have negative IDs) but got something with a non-negative ID (which is weird).
Could you please provide a dump of the CRUSH map as well as the output of ceph osd tree?
Hints:
* https://docs.ceph.com/en/pacific/man/8/crushtool/
* ceph osd getcrushmap
Updated by Chris Durham over 1 year ago
Radoslaw,
Yes, I saw that piece of code too, but I think I figured it out just a short time ago: I had the crush hierarchy backwards. My crush rule picks racks(4)->pods(2)->host(1)(leaf). It is a 6+2 EC pool, so I get 8 chunks. But in the hierarchy, pods are HIGHER than racks. So I extracted the osdmap and ran: osdmaptool osdmap --upmap-cleanup. Doing so gives me the exact same errors as in the ceph-mon log for verify_upmap.
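One way to see how nesting pods inside racks could defeat the parent lookup is the toy model below. It is a sketch under stated assumptions, not Ceph's implementation: it reads "higher" as meaning that the pod type has a larger type ID than the rack type, and it assumes that a top-down search for a bucket of a given type gives up as soon as it reaches a bucket whose type ID is lower than the one requested. The type IDs, bucket IDs, and the names Bucket, Hierarchy, find_parent_of_type, and make_example are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical type IDs for illustration only; a real cluster's IDs come from
// the "types" section of its decompiled CRUSH map.
constexpr int kHost = 1;
constexpr int kRack = 3;
constexpr int kPod = 6;
constexpr int kRoot = 11;

struct Bucket {
  int type;                   // bucket type ID
  std::vector<int> children;  // negative = bucket ID, non-negative = OSD id
};

using Hierarchy = std::map<int, Bucket>;  // bucket ID (negative) -> bucket

// True if `item` (bucket or OSD) lies in the subtree rooted at `id`.
bool subtree_contains(const Hierarchy& h, int id, int item) {
  if (id == item) return true;
  if (id >= 0) return false;  // an OSD has no children
  auto it = h.find(id);
  if (it == h.end()) return false;
  for (int child : it->second.children) {
    if (subtree_contains(h, child, item)) return true;
  }
  return false;
}

// Descend from `id` looking for a bucket of `type` that contains `osd`.
// Assumption: the search gives up at any bucket whose type ID is lower than
// the requested one. Returns the (negative) bucket ID, or 0 if none is found.
int find_parent_of_type(const Hierarchy& h, int id, int type, int osd) {
  if (id >= 0) return 0;  // reached an OSD without finding a bucket
  auto it = h.find(id);
  if (it == h.end()) return 0;
  const Bucket& b = it->second;
  if (b.type < type) return 0;  // give up: nothing of `type` expected below
  if (b.type == type) return subtree_contains(h, id, osd) ? id : 0;
  for (int child : b.children) {
    int found = find_parent_of_type(h, child, type, osd);
    if (found < 0) return found;
  }
  return 0;
}

// root -> rack -> pod -> host -> osd.497, mirroring the tree in the report.
Hierarchy make_example() {
  return {
      {-1, {kRoot, {-2}}},
      {-2, {kRack, {-3}}},
      {-3, {kPod, {-4}}},
      {-4, {kHost, {497}}},
  };
}
```

In this model, asking for the rack or the host of osd.497 returns a negative bucket ID, while asking for its pod returns 0 because the search stops at the rack. A non-negative result is what sends the quoted verify_upmap code into the "unable to get parent" branch.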
If I extract the crushmap from the osdmap, modify it to pick racks(4)->host(2)(leaf), put the crushmap back into the osdmap, and run osdmaptool osdmap --upmap-cleanup, the verify_upmap messages do not occur (but I get other upmap add/rm
My question is: if I actually deploy the crushmap without the pod choice (I can live without it), will I be OK, or will it cause more problems given the current state? I am surprised that crush let me choose such a rule to begin with. The PGs look fine as to their OSDs and such.
Thanks
See the attached message I sent to ceph-users, that has what you asked.
Updated by Radoslaw Zarzynski over 1 year ago
- Related to Bug #51729: Upmap verification fails for multi-level crush rule added
Updated by Radoslaw Zarzynski over 1 year ago
Link to the discussion on ceph-users: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/AZHAIGY3BIM4SGBUBKX5ZGYTXQWAJ7OO/#H3S7LGDVWSTKQ6ZQXJIQTQAWI2VZXL2S.