Bug #41535

open

Trying to upgrade from Ceph Mimic to Nautilus can fail

Added by Chris MacNaughton over 4 years ago. Updated over 4 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
upgrade/mimic-x
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When trying to upgrade from Mimic to Nautilus, the clients on the Mimic nodes fail to talk to the monitor cluster as soon as the first unit is upgraded:

  # ceph -s
  Traceback (most recent call last):
    File "/usr/bin/ceph", line 1241, in <module>
      retval = main()
    File "/usr/bin/ceph", line 1165, in main
      sigdict = parse_json_funcsigs(outbuf.decode('utf-8'), 'cli')
    File "/usr/lib/python3/dist-packages/ceph_argparse.py", line 788, in parse_json_funcsigs
      cmd['sig'] = parse_funcsig(cmd['sig'])
    File "/usr/lib/python3/dist-packages/ceph_argparse.py", line 728, in parse_funcsig
      raise JsonFormat(s)
  ceph_argparse.JsonFormat: unknown type CephBool

The monitor cluster is actually still functional, according to that same command run on the newly upgraded Nautilus unit:

  # ceph -s
    cluster:
      id: 2734983c-c88f-11e9-b6f9-fa163e8108c2
      health: HEALTH_OK

    services:
      mon: 3 daemons, quorum juju-a792f2-ceph-mimic-upgrade-0,juju-a792f2-ceph-mimic-upgrade-1,juju-a792f2-ceph-mimic-upgrade-2 (age 3m)
      mgr: juju-a792f2-ceph-mimic-upgrade-0(active), standbys: juju-a792f2-ceph-mimic-upgrade-1, juju-a792f2-ceph-mimic-upgrade-2
      osd: 3 osds: 3 up, 3 in
      rgw: 1 daemon active (juju-a792f2-ceph-mimic-upgrade-6)

    data:
      pools: 15 pools, 46 pgs
      objects: 187 objects, 1.1 KiB
      usage: 3.0 GiB used, 27 GiB / 30 GiB avail
      pgs: 46 active+clean


Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #39355: running ceph command on a partially upgraded cluster might fail (Closed, 04/17/2019)

Actions #1

Updated by Nathan Cutler over 4 years ago

  • Related to Bug #39355: running ceph command on a partially upgraded cluster might fail added
Actions #2

Updated by Greg Farnum over 4 years ago

Exactly which versions are you running here? CephBool is new in Nautilus and we had a bug with mixed-version mon clusters but it should have been resolved prior to release: https://github.com/ceph/ceph/pull/25470

I wonder if we mucked up the feature flags so that the proper detection failed somehow.
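
To illustrate why a single new argument type can break every command on an older client, here is a minimal sketch. It is not the real `ceph_argparse` code: the type table, the command descriptions, and the `verbose` argument are all hypothetical, and only the function names and the error text are taken from the traceback in the description. The assumption is that the client parses the full list of command signatures sent by the monitor before running anything, so one unknown type aborts even `ceph -s`.

```python
import json


class JsonFormat(Exception):
    """Stand-in for ceph_argparse.JsonFormat (simplified)."""


# Hypothetical type table of an older client: it has no entry for CephBool.
KNOWN_TYPES = {"CephInt", "CephString", "CephChoices"}


def parse_funcsig(sig):
    """Validate one command signature; reject argument types we do not know."""
    parsed = []
    for desc in sig:
        if isinstance(desc, str):
            parsed.append(desc)  # literal command word such as "status"
            continue
        argtype = desc.get("type")
        if argtype not in KNOWN_TYPES:
            raise JsonFormat("unknown type " + str(argtype))
        parsed.append(desc)
    return parsed


def parse_json_funcsigs(s):
    """Parse every command description in the reply, not only the one requested."""
    cmds = json.loads(s)
    for cmd in cmds.values():
        cmd["sig"] = parse_funcsig(cmd["sig"])
    return cmds


# Hypothetical reply from a newer monitor: one command uses the new type.
newer_mon_reply = json.dumps({
    "cmd000": {"sig": ["status"]},
    "cmd001": {"sig": ["osd", "ls", {"name": "verbose", "type": "CephBool"}]},
})

try:
    parse_json_funcsigs(newer_mon_reply)
    error = None
except JsonFormat as exc:
    error = str(exc)

print(error)  # unknown type CephBool
```

In this sketch the `status` entry itself is valid, but the parse still fails because the unknown type appears elsewhere in the same reply.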

Actions #3

Updated by Chris MacNaughton over 4 years ago

It looks like this has been resolved by: ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)

I'm not 100% sure which specific revision I was running against before, but it looks good now.

Actions #4

Updated by Dan van der Ster over 4 years ago

We have just seen this too when upgrading from v13.2.6 to v14.2.3.

If a mimic client runs `ceph status` against a nautilus mon, it dumps the `ceph_argparse.JsonFormat: unknown type CephBool` error.
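
As a rough aid for spotting this combination, the sketch below compares the major version of a client against that of a monitor, given version strings in the forms quoted in this thread. The helper names are hypothetical, it only parses strings and does not query a cluster, and whether a given pair of releases actually triggers the error is not something it can determine.

```python
import re


def parse_ceph_version(text):
    """Extract (major, minor, patch) from strings like 'v13.2.6' or
    'ceph version 14.2.2 (<hash>) nautilus (stable)'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.groups())


def client_predates_mon(client_version, mon_version):
    """True when the client's major release is older than the monitor's."""
    return parse_ceph_version(client_version)[0] < parse_ceph_version(mon_version)[0]


mixed = client_predates_mon("v13.2.6", "v14.2.3")
print(mixed)  # True
```

A `True` result only flags a mimic-era client talking to a nautilus-era monitor, which is the situation reported above.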
