Bug #47694
downgrading via ceph orch upgrade start results in partial application and mixed state

Added by Jan Fajerski over 3 years ago. Updated about 3 years ago.

Status:
Won't Fix
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Following https://docs.ceph.com/en/latest/cephadm/upgrade/#using-customized-container-images I attempted to downgrade my cluster.

The process starts fine, but I end up in an inconsistent state: two mgr daemons are downgraded, the upgrade appears to have succeeded, and the cluster reports HEALTH_WARN.

Starting with a healthy cluster at version 15.2.5-220-gb758bfd693 (SUSE downstream container), I run ceph orch upgrade start --image <custom registry url>/containers/ses/7/containers/ses/7/ceph/ceph:15.2.0.108. This starts the process, and I can see the progress of the image pull in ceph -s.

After a while this finishes and leaves the cluster in the following state:

master:~ # ceph versions
{
    "mon": {
        "ceph version 15.2.5-220-gb758bfd693 (b758bfd69359a0ffa10bd5426d64e7636bb0a6c6) octopus (stable)": 3
    },
    "mgr": {
        "ceph version 15.2.0-108-g8cf4f02b08 (8cf4f02b0814fc5dc803ae5923cb310bb08de967) octopus (stable)": 2,
        "ceph version 15.2.5-220-gb758bfd693 (b758bfd69359a0ffa10bd5426d64e7636bb0a6c6) octopus (stable)": 1
    },
    "osd": {
        "ceph version 15.2.5-220-gb758bfd693 (b758bfd69359a0ffa10bd5426d64e7636bb0a6c6) octopus (stable)": 20
    },
    "mds": {
        "ceph version 15.2.5-220-gb758bfd693 (b758bfd69359a0ffa10bd5426d64e7636bb0a6c6) octopus (stable)": 2
    },
    "overall": {
        "ceph version 15.2.0-108-g8cf4f02b08 (8cf4f02b0814fc5dc803ae5923cb310bb08de967) octopus (stable)": 2,
        "ceph version 15.2.5-220-gb758bfd693 (b758bfd69359a0ffa10bd5426d64e7636bb0a6c6) octopus (stable)": 26
    }
}
master:~ # ceph -s
  cluster:
    id:     2f578f24-02e5-11eb-92b7-52540064363c
    health: HEALTH_WARN
            4 hosts fail cephadm check
            failed to probe daemons or devices
            28 stray daemons(s) not managed by cephadm

  services:
    mon: 3 daemons, quorum master,node1,node2 (age 2h)
    mgr: node2.ibtqev(active, since 4m), standbys: master.wdjpkv, node1.cgixgj
    mds: sesdev_fs:1 {0=sesdev_fs.node3.jwnsyq=up:active} 1 up:standby
    osd: 20 osds: 20 up (since 2h), 20 in (since 2h)

  task status:
    scrub status:
        mds.sesdev_fs.node3.jwnsyq: idle

  data:
    pools:   3 pools, 65 pgs
    objects: 22 objects, 2.8 KiB
    usage:   20 GiB used, 140 GiB / 160 GiB avail
    pgs:     65 active+clean

The current active mgr could not be failed:

master:~ # ceph mgr fail ibtqev
Daemon not found 'ibtqev', already failed
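The mixed state is visible directly in the `ceph versions` JSON above. As an illustration (this helper is hypothetical, not a cephadm tool, and the version strings below are abbreviated from the real output), a few lines of Python can flag which daemon types are split across versions:

```python
import json

def mixed_versions(ceph_versions_json: str) -> dict:
    """Return the daemon types running more than one Ceph version,
    given the JSON printed by `ceph versions`."""
    data = json.loads(ceph_versions_json)
    return {
        daemon: versions
        for daemon, versions in data.items()
        # "overall" aggregates all daemons, so it is always mixed here
        if daemon != "overall" and len(versions) > 1
    }

# Abbreviated sample mirroring the report above (version strings shortened):
sample = json.dumps({
    "mon": {"15.2.5-220": 3},
    "mgr": {"15.2.0-108": 2, "15.2.5-220": 1},
    "osd": {"15.2.5-220": 20},
    "mds": {"15.2.5-220": 2},
    "overall": {"15.2.0-108": 2, "15.2.5-220": 26},
})
print(sorted(mixed_versions(sample)))  # → ['mgr']
```

This matches the report: only the mgr daemons are split, with two already on 15.2.0 and one still on 15.2.5.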

I'm aware that the upgrade command probably should not be expected to handle a downgrade. Still, some validation should probably be done to avoid this situation, if only to protect users who mistype the image version.
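As a sketch of the kind of guard suggested here (hypothetical helper names and error text; this is not cephadm's actual code or CLI), the orchestrator could compare the target image's version against the running version before starting, and refuse a silent downgrade:

```python
import re

def parse_version(tag: str) -> tuple:
    """Extract (major, minor, patch) from a version string such as
    '15.2.5-220-gb758bfd693' or an image tag suffix like '15.2.0.108'."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", tag)
    if not m:
        raise ValueError(f"unrecognized version: {tag}")
    return tuple(int(x) for x in m.groups())

def check_not_downgrade(current: str, target: str) -> None:
    """Refuse to start an 'upgrade' whose target is older than the
    running version; a real implementation would offer an explicit
    override flag for intentional downgrades."""
    if parse_version(target) < parse_version(current):
        raise RuntimeError(
            f"target {target} is older than running {current}; "
            "refusing implicit downgrade"
        )
```

With the versions from this report, `check_not_downgrade("15.2.5-220-gb758bfd693", "15.2.0.108")` would raise, forcing the user to confirm the downgrade was intended rather than a typo.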


Related issues: 1 (0 open, 1 closed)

Copied to Orchestrator - Bug #47702: upgrading via ceph orch upgrade start results in partial application and mixed state (Can't reproduce)

Actions #1

Updated by Jan Fajerski over 3 years ago

  • Copied to Bug #47702: upgrading via ceph orch upgrade start results in partial application and mixed state added
Actions #2

Updated by Jan Fajerski over 3 years ago

After a while the status and versions are as expected. We should probably still put a validation against downgrades in place.

Actions #3

Updated by Sebastian Wagner about 3 years ago

  • Status changed from New to Won't Fix

We have to support downgrades to some degree. Closing, as it worked eventually.
