Bug #15106

ceph.py does 'fs get ...' and breaks upgrade tests

Added by Sage Weil about 8 years ago. Updated about 8 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
John Spray
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

  description: rados/upgrade/{rados.yaml hammer-x-singleton/{0-cluster/start.yaml 1-hammer-install/hammer.yaml
    2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/{rbd-cls.yaml
    rbd-import-export.yaml readwrite.yaml snaps-few-objects.yaml} 6-next-mon/monb.yaml
    7-workload/{radosbench.yaml rbd_api.yaml} 8-next-mon/monc.yaml 9-workload/{ec-rados-plugin=jerasure-k=3-m=1.yaml
    rbd-python.yaml rgw-swift.yaml snaps-many-objects.yaml test_cache-pool-snaps.yaml}}}

2016-03-12T07:40:44.294 INFO:teuthology.orchestra.run.smithi011:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph fs get cephfs --format=json-pretty'
2016-03-12T07:40:44.663 INFO:teuthology.orchestra.run.smithi011.stderr:no valid command found; 10 closest matches:
2016-03-12T07:40:44.663 INFO:teuthology.orchestra.run.smithi011.stderr:fs new <fs_name> <metadata> <data>
2016-03-12T07:40:44.664 INFO:teuthology.orchestra.run.smithi011.stderr:fs reset <fs_name> {--yes-i-really-mean-it}
2016-03-12T07:40:44.664 INFO:teuthology.orchestra.run.smithi011.stderr:fs rm <fs_name> {--yes-i-really-mean-it}
2016-03-12T07:40:44.664 INFO:teuthology.orchestra.run.smithi011.stderr:fs ls
2016-03-12T07:40:44.664 INFO:teuthology.orchestra.run.smithi011.stderr:Error EINVAL: invalid command

/a/sage-2016-03-12_06:01:42-rados-wip-sage-testing---basic-smithi/55486

Related issues

Related to CephFS - Bug #15124: ceph.py 'fs get ...' doesn't handle old installed ceph version Resolved 03/14/2016
Related to CephFS - Fix #15134: multifs: test case exercising mds_thrash for multiple filesystems Resolved

Associated revisions

Revision 9e202b44 (diff)
Added by John Spray about 8 years ago

tasks/cephfs: support old mdsmap command during setup

While Filesystem at large requires the new commands, for
use from the `ceph` task we must support old style commands,
as the ceph task is used to instantiate old clusters during
upgrade testing.

Fixes: #15124, #15049, #15106

Signed-off-by: John Spray <>
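
The revision itself is not quoted here, but a minimal sketch of what such a fallback could look like is below. It is an illustration, not the actual patch, and it assumes that a rejected command surfaces in teuthology as CommandFailedError:

import json

from teuthology.orchestra.run import CommandFailedError


def get_mds_map(self):
    try:
        # Jewel and later: 'fs get <name>' returns the filesystem, including its mdsmap.
        fs = json.loads(self.mon_manager.raw_cluster_cmd(
            "fs", "get", self.name, "--format=json-pretty"))
        return fs['mdsmap']
    except CommandFailedError:
        # Hammer/Infernalis mons reject 'fs get' with EINVAL; fall back to the
        # old 'mds dump' command, which returns the MDS map directly.
        return json.loads(self.mon_manager.raw_cluster_cmd(
            "mds", "dump", "--format=json-pretty"))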

History

#1 Updated by Kefu Chai about 8 years ago

  • Status changed from New to Rejected
  • Assignee set to Kefu Chai

I ran into this problem too; see http://qa-proxy.ceph.com/teuthology/kchai-2016-03-11_19:18:43-rados-wip-kefu-testing---basic-smithi/54865/teuthology.log.

We were waiting for a healthy cluster in the "ceph" task:

    if ctx.cluster.only(teuthology.is_type('mds')).remotes:
        # Some MDSs exist, wait for them to be healthy
        ceph_fs = Filesystem(ctx)
        ceph_fs.wait_for_daemons(timeout=300)

and in filesystem.py, are_daemons_healthy() checks the MDS map via get_mds_map():

def get_mds_map(self):
    fs = json.loads(self.mon_manager.raw_cluster_cmd("fs", "get", self.name, "--format=json-pretty"))
    return fs['mdsmap']

This is why we see:

2016-03-11T22:09:09.941 INFO:teuthology.orchestra.run.smithi067:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph fs get cephfs --format=json-pretty'
2016-03-11T22:09:10.055 INFO:teuthology.orchestra.run.smithi067.stderr:no valid command found; 10 closest matches:
2016-03-11T22:09:10.055 INFO:teuthology.orchestra.run.smithi067.stderr:fsid
2016-03-11T22:09:10.056 INFO:teuthology.orchestra.run.smithi067.stderr:Error EINVAL: invalid command

This is because the filesystem.py change in ceph-qa-suite is part of the multiple-filesystem support, while the multi-fs support on the Ceph side (https://github.com/ceph/ceph/pull/6953) was not merged until a couple of days ago. So our testing branches were not new enough to include John's changes, but the runs were new enough to include the ceph-qa-suite change (https://github.com/ceph/ceph-qa-suite/pull/824) when they were submitted to teuthology.

So I am closing this issue as "Rejected".

#2 Updated by Nathan Cutler about 8 years ago

  • Status changed from Rejected to 12

@Kefu: reopening, because this issue is still affecting the upgrade tests. For example, this job:

  description: upgrade:infernalis-x/parallel/{4-jewel.yaml 0-cluster/start.yaml 1-infernalis-install/infernalis.yaml
    2-workload/{ec-rados-default.yaml rados_api.yaml rados_loadgenbig.yaml test_rbd_api.yaml
    test_rbd_python.yaml} 3-upgrade-sequence/upgrade-all.yaml 5-final-workload/{rados-snaps-few-objects.yaml
    rados_loadgenmix.yaml rados_mon_thrash.yaml rbd_cls.yaml rbd_import_export.yaml
    rgw_swift.yaml} distros/centos_7.2.yaml}

It installs infernalis and then runs the "ceph" task as you describe in http://tracker.ceph.com/issues/15106#note-1. It then fails with the error message shown in the description, presumably because https://github.com/ceph/ceph/pull/6953 has not been backported to infernalis.

#3 Updated by Nathan Cutler about 8 years ago

The command line for triggering this suite included --suite upgrade/infernalis-x --suite-branch master. Obviously, the error arises because the "master" branch of ceph-qa-suite assumes that 'fs get ...' will be available, even though it is missing in infernalis.

#4 Updated by Nathan Cutler about 8 years ago

I see in the test definition:

  - install:
      branch: infernalis
  - print: '**** done installing infernalis'
  - ceph: null
  - print: '**** done ceph'

Can we add "branch: infernalis" to the ceph task, like this?

  - install:
      branch: infernalis
  - print: '**** done installing infernalis'
  - ceph:
      branch: infernalis
  - print: '**** done ceph'

#5 Updated by Nathan Cutler about 8 years ago

  • Project changed from Ceph to CephFS

#6 Updated by Nathan Cutler about 8 years ago

  • Related to Bug #15124: ceph.py 'fs get ...' doesn't handle old installed ceph version added

#7 Updated by Nathan Cutler about 8 years ago

Never mind: jcsp and loicd answered my question out of band.

#8 Updated by Nathan Cutler about 8 years ago

  • Related to Fix #15134: multifs: test case exercising mds_thrash for multiple filesystems added

#9 Updated by Nathan Cutler about 8 years ago

There are three instances of "fs get" in filesystem.py:

smithfarm@wilbur:~/src/ceph/smithfarm/ceph-qa-suite> grep -n '"fs", "get"' tasks/cephfs/filesystem.py 
133:                    "fs", "get", fs['name'], "--format=json-pretty"))['mdsmap']
327:        fs = json.loads(self.mon_manager.raw_cluster_cmd("fs", "get", self.name, "--format=json-pretty"))
345:        fs = json.loads(self.mon_manager.raw_cluster_cmd("fs", "get", self.name, "--format=json-pretty"))

The fallback (seen in the infernalis version of this function) is "mds dump":

def get_mds_map(self):
    """
    Return the MDS map, as a JSON-esque dict from 'mds dump'
    """
    return json.loads(self.mon_manager.raw_cluster_cmd('mds', 'dump', '--format=json-pretty'))

I still don't know how to test whether "fs get" is available, though.
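
One way to answer that (my own sketch, not something taken from the fix) would be to probe the mons once during setup, assuming the failure surfaces as CommandFailedError and that the ceph CLI propagates EINVAL as its exit status:

import errno

from teuthology.orchestra.run import CommandFailedError


def fs_get_supported(mon_manager, fs_name):
    # Return True if the mons understand 'fs get <name>' (Jewel and later).
    try:
        mon_manager.raw_cluster_cmd("fs", "get", fs_name, "--format=json-pretty")
    except CommandFailedError as e:
        # "Error EINVAL: invalid command" means the mon does not know 'fs get';
        # any other failure means the command exists but something else went wrong.
        if e.exitstatus == errno.EINVAL:
            return False
        raise
    return True

The three call sites found by the grep above could then choose between "fs get" and the old "mds dump" based on this probe.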

#10 Updated by Kefu Chai about 8 years ago

  • Status changed from 12 to Resolved
  • Assignee changed from Kefu Chai to John Spray
