Bug #59564

open

Connection scores not populated properly on monitors post installation

Added by Kamoltat (Junior) Sirivadhna about 1 year ago. Updated 3 months ago.

Status: Pending Backport
Priority: Normal
Category: Stretch Clusters
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport: reef, quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Description of problem:
We are observing that the connection scores of the monitor with the highest rank are not populated properly after installation. Its report is missing peer information.
Moreover, we have observed that the peer_reports contain an entry with the unexpected rank of -1.

All the mons are part of the quorum and ceph status is healthy.

Tried the scenario on multiple clusters with 3 and 5 mons. In each case, the mon with the highest rank was missing entries for other peer monitors.

# ceph mon dump
epoch 3
fsid 6b008e40-761f-11ed-9d70-fa163ed6291c
last_changed 2022-12-07T11:15:58.063080+0000
created 2022-12-07T11:08:25.190391+0000
min_mon_release 17 (quincy)
election_strategy: 1
0: [v2:10.0.210.25:3300/0,v1:10.0.210.25:6789/0] mon.ceph-pdhiran-6nkq6t-node1-installer
1: [v2:10.0.209.151:3300/0,v1:10.0.209.151:6789/0] mon.ceph-pdhiran-6nkq6t-node6
2: [v2:10.0.208.155:3300/0,v1:10.0.208.155:6789/0] mon.ceph-pdhiran-6nkq6t-node2
dumped monmap epoch 3
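
To compare the connection scores against the monmap, it helps to know which ranks are expected. The following is a minimal sketch, assuming the plain-text layout of `ceph mon dump` shown above; the helper name `parse_mon_ranks` and the regular expression are illustrative and not part of Ceph.

```python
import re

# Matches monmap entries such as
# "0: [v2:10.0.210.25:3300/0,v1:10.0.210.25:6789/0] mon.<name>"
# (layout assumed from the dump in this report).
MON_LINE = re.compile(r"^(\d+):\s+\[[^\]]*\]\s+mon\.(\S+)$")


def parse_mon_ranks(dump_text):
    """Return {rank: mon_name} parsed from plain-text `ceph mon dump` output."""
    ranks = {}
    for line in dump_text.splitlines():
        match = MON_LINE.match(line.strip())
        if match:
            ranks[int(match.group(1))] = match.group(2)
    return ranks


# Sample copied from the mon dump in this report.
SAMPLE_MON_DUMP = """\
epoch 3
fsid 6b008e40-761f-11ed-9d70-fa163ed6291c
min_mon_release 17 (quincy)
election_strategy: 1
0: [v2:10.0.210.25:3300/0,v1:10.0.210.25:6789/0] mon.ceph-pdhiran-6nkq6t-node1-installer
1: [v2:10.0.209.151:3300/0,v1:10.0.209.151:6789/0] mon.ceph-pdhiran-6nkq6t-node6
2: [v2:10.0.208.155:3300/0,v1:10.0.208.155:6789/0] mon.ceph-pdhiran-6nkq6t-node2
dumped monmap epoch 3
"""

if __name__ == "__main__":
    for rank, name in sorted(parse_mon_ranks(SAMPLE_MON_DUMP).items()):
        print(rank, name)
```

For this cluster the expected ranks are 0, 1 and 2, so a report with any other rank, or a report that does not list both of the other ranks as peers, is suspect.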

# ceph daemon mon.ceph-pdhiran-6nkq6t-node1-installer  connection scores dump
{
    "rank": 0,
    "epoch": 14,
    "version": 223,
    "half_life": 43200,
    "persist_interval": 10,
    "reports": {
        "report": {
            "rank": -1,
            "epoch": 0,
            "version": 0,
            "peer_scores": {}
        },
        "report": {
            "rank": 0,
            "epoch": 14,
            "version": 223,
            "peer_scores": {
                "peer": {
                    "peer_rank": 1,
                    "peer_score": 0.99995375391737207,
                    "peer_alive": true
                },
                "peer": {
                    "peer_rank": 2,
                    "peer_score": 0.99995374350154376,
                    "peer_alive": true
                }
            }
        },
        "report": {
            "rank": 1,
            "epoch": 14,
            "version": 168,
            "peer_scores": {
                "peer": {
                    "peer_rank": 0,
                    "peer_score": 1,
                    "peer_alive": true
                },
                "peer": {
                    "peer_rank": 2,
                    "peer_score": 0.99736158141292575,
                    "peer_alive": false
                }
            }
        },
        "report": {
            "rank": 2,
            "epoch": 14,
            "version": 111,
            "peer_scores": {
                "peer": {
                    "peer_rank": 0,
                    "peer_score": 1,
                    "peer_alive": true
                }
            }
        }
    }
}
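
The two anomalies in the dump above (a report with rank -1, and the rank 2 report lacking peer 1) can be checked mechanically. The following is a minimal sketch, assuming the JSON layout shown above. The dump repeats the keys "report" and "peer", so a plain `json.loads` would keep only the last of each; the sketch uses `object_pairs_hook` to preserve all of them. The function names `load_reports` and `find_anomalies` are illustrative and not part of Ceph.

```python
import json


def load_reports(dump_text):
    """Return [(rank, {peer_rank: peer_alive})] for every report in the dump."""
    # object_pairs_hook=list turns every JSON object into a list of
    # (key, value) pairs, so repeated "report"/"peer" keys are not lost.
    top = dict(json.loads(dump_text, object_pairs_hook=list))
    reports = []
    for _key, report_pairs in top["reports"]:
        report = dict(report_pairs)
        peers = {}
        for _peer_key, peer_pairs in report["peer_scores"]:
            peer = dict(peer_pairs)
            peers[peer["peer_rank"]] = peer["peer_alive"]
        reports.append((report["rank"], peers))
    return reports


def find_anomalies(dump_text, expected_ranks):
    """List reports with unexpected ranks or with missing peer entries."""
    problems = []
    for rank, peers in load_reports(dump_text):
        if rank not in expected_ranks:
            problems.append(f"report with unexpected rank {rank}")
            continue
        missing = sorted(set(expected_ranks) - {rank} - set(peers))
        if missing:
            problems.append(f"rank {rank} report is missing peers {missing}")
    return problems


# Values copied from the dump in this report, written compactly.
SAMPLE_SCORES = """
{"rank": 0, "epoch": 14, "version": 223, "half_life": 43200, "persist_interval": 10,
 "reports": {
  "report": {"rank": -1, "epoch": 0, "version": 0, "peer_scores": {}},
  "report": {"rank": 0, "epoch": 14, "version": 223, "peer_scores": {
    "peer": {"peer_rank": 1, "peer_score": 0.99995375391737207, "peer_alive": true},
    "peer": {"peer_rank": 2, "peer_score": 0.99995374350154376, "peer_alive": true}}},
  "report": {"rank": 1, "epoch": 14, "version": 168, "peer_scores": {
    "peer": {"peer_rank": 0, "peer_score": 1, "peer_alive": true},
    "peer": {"peer_rank": 2, "peer_score": 0.99736158141292575, "peer_alive": false}}},
  "report": {"rank": 2, "epoch": 14, "version": 111, "peer_scores": {
    "peer": {"peer_rank": 0, "peer_score": 1, "peer_alive": true}}}
 }}
"""

if __name__ == "__main__":
    for problem in find_anomalies(SAMPLE_SCORES, {0, 1, 2}):
        print(problem)
```

Run against the dump above with expected ranks 0 to 2, this flags the rank -1 report and the missing peer 1 in the rank 2 report. The check is limited to those two conditions; it does not evaluate `peer_score` or `peer_alive`.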

Steps to Reproduce:
1. Bootstrap Ceph cluster
2. Deploy 5 Monitors
3. Observe that connection scores are not populated properly.


Related issues 2 (2 open, 0 closed)

Copied to RADOS - Backport #64003: reef: Connection scores not populated properly on monitors post installation | New | Kamoltat (Junior) Sirivadhna
Copied to RADOS - Backport #64004: quincy: Connection scores not populated properly on monitors post installation | New | Kamoltat (Junior) Sirivadhna

#1

Updated by Kamoltat (Junior) Sirivadhna 12 months ago

  • Description updated

#2

Updated by Kamoltat (Junior) Sirivadhna 12 months ago

  • Pull request ID set to 51320

#3

Updated by Kamoltat (Junior) Sirivadhna 12 months ago

  • Status changed from New to Fix Under Review

#4

Updated by Kamoltat (Junior) Sirivadhna 10 months ago

  • Pull request ID changed from 51320 to 52380

#5

Updated by Neha Ojha 4 months ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to reef, quincy

#6

Updated by Backport Bot 4 months ago

  • Copied to Backport #64003: reef: Connection scores not populated properly on monitors post installation added

#7

Updated by Backport Bot 4 months ago

  • Copied to Backport #64004: quincy: Connection scores not populated properly on monitors post installation added

#8

Updated by Backport Bot 4 months ago

  • Tags set to backport_processed