Bug #57796


after rebalance of pool via pg upmap balancer, continuous issues in monitor log

Added by Chris Durham over 1 year ago. Updated about 1 year ago.

Status:
Need More Info
Priority:
Normal
Assignee:
-
Category:
Monitor
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
pg upmap
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
pgmap
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The pg upmap balancer was not balancing well. After setting mgr/balancer/upmap_max_deviation to 1 (ceph config-key ...), the balancer kicked in and moved things around, resulting in a nicely balanced set of OSDs and PGs. Awesome.
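
For reference, the deviation setting was changed with something along these lines (ceph config-key was the older mechanism; recent releases take it via ceph config set, so the exact invocation below is an assumption):

# lower the allowed per-OSD PG deviation so the balancer acts (assumed syntax)
ceph config set mgr mgr/balancer/upmap_max_deviation 1
# confirm the balancer module is enabled and what plan it is executing
ceph balancer status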

However, it appears that after the rebalance, the monitor log (/var/log/ceph/ceph-mon.servername.log) fills up every three minutes with a line for every OSD that was affected by the rebalance. Those lines are of the following form:

2022-10-07T17:10:39.619+0000 7f7c2786d700 1 verify_upmap unable to get parent of osd.497, skipping for now

So, since the rebalance affected around 100 OSDs, there are around 100 lines of this form in my monitor log every 3 minutes. The pool in question is an EC pool.
I know the rebalance creates pg upmap items, but why does this warning/error happen, and is it a problem?
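
For what it's worth, the upmap entries the rebalance created can be listed straight from the osdmap, and the OSD named in the warning can be located in crush:

# list the pg_upmap_items the balancer created
ceph osd dump | grep pg_upmap_items
# show the crush location of the OSD from the warning above
ceph osd find 497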

The pool with these OSDs (there is only one such pool) uses a custom crush root of the form:

root mycustomroot
    rack rack1
        pod pod1
            host host1
            host host2
        pod pod2
            host host3
            host host4
    rack rack2
        pod pod3
            host host5
            ...
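
The layout above is paraphrased from memory; the real hierarchy (including the custom pod bucket type) can be dumped directly:

# show the full hierarchy, including both mycustomroot and 'default'
ceph osd tree
# same, with bucket types and crush weights
ceph osd crush tree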

In typing this up, I noticed that the hosts are also part of the 'default' crush root that no pool uses. Perhaps that is the issue...? Please advise.
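
If it helps triage, the rule the pool actually uses can be checked as follows (pool and rule names here are placeholders):

# confirm which crush rule the pool uses
ceph osd pool get mypool crush_rule
# dump that rule to see the take/choose steps that verify_upmap walks
ceph osd crush rule dump mycustomrule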


Files

upmap.txt (3.04 KB), Chris Durham, 10/10/2022 06:30 PM

Related issues (1 open, 0 closed)

Related to RADOS - Bug #51729: Upmap verification fails for multi-level crush rule (In Progress, Laura Flores)
