Bug #38941: Error when enabling mgr module 'restful' - mgr - Ceph

Error when enabling mgr module 'restful'

Added by Guillaume Abrioux about 5 years ago. Updated almost 5 years ago.

Status: Closed
Priority: Normal
Target version: Ceph - v14.0.0
Source: Development
Severity: 3 - minor

Description

It looks like some teuthology jobs are failing when deploying Ceph with ceph-ansible, with the following error:

2019-03-25T04:44:05.524 INFO:teuthology.orchestra.run.ovh027.stdout:failed: [ovh028.front.sepia.ceph.com -> ovh010.front.sepia.ceph.com] (item=restful) => {
2019-03-25T04:44:05.524 INFO:teuthology.orchestra.run.ovh027.stdout: "changed": true,
2019-03-25T04:44:05.524 INFO:teuthology.orchestra.run.ovh027.stdout: "cmd": [
2019-03-25T04:44:05.524 INFO:teuthology.orchestra.run.ovh027.stdout: "ceph",
2019-03-25T04:44:05.524 INFO:teuthology.orchestra.run.ovh027.stdout: "--cluster",
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "ceph",
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "mgr",
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "module",
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "enable",
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "restful"
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: ],
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "delta": "0:00:00.291293",
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "end": "2019-03-25 04:44:05.477688",
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout: "item": "restful",
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout: "rc": 2,
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout: "start": "2019-03-25 04:44:05.186395"
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout:}
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout:
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout:STDERR:
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout:
2019-03-25T04:44:05.527 INFO:teuthology.orchestra.run.ovh027.stdout:Error ENOENT: all mgr daemons do not support module 'restful', pass --force to force enablement
2019-03-25T04:44:05.527 INFO:teuthology.orchestra.run.ovh027.stdout:
2019-03-25T04:44:05.527 INFO:teuthology.orchestra.run.ovh027.stdout:
2019-03-25T04:44:05.527 INFO:teuthology.orchestra.run.ovh027.stdout:MSG:
2019-03-25T04:44:05.527 INFO:teuthology.orchestra.run.ovh027.stdout:
2019-03-25T04:44:05.527 INFO:teuthology.orchestra.run.ovh027.stdout:non-zero return code

I couldn't reproduce this issue outside of the teuthology context:

[root@mon0 ~]# ceph mgr module disable restful
[root@mon0 ~]# ceph mgr module enable restful
[root@mon0 ~]# ceph status
  cluster:
    id:     08a81dcb-bd86-406b-93e5-3fafa57ee3b8
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum mon0,mon1,mon2 (age 64m)
    mgr: mgr0(active, since 5s), standbys: mgr1

So I'm thinking of a possible race condition in 'pending_map' here: https://github.com/ceph/ceph/commit/d953198aff3e5aaa6e2eadcf8a53c9e0279a30de#diff-0bd5017ae61455d420a77a69ed75b4d4R626

Is there a case where the cluster would report that the 'restful' module is not available on all mgr daemons even though they are all running the same version?
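
One way to investigate the question above would be to capture `ceph mgr dump` at the moment of failure and see which daemons do not list the module. The following is only a rough sketch: it assumes the dump is JSON with an `active_name` field, a top-level `available_modules` list for the active mgr, and a `standbys` list whose entries each carry `name` and `available_modules`. That layout, and the helper names `module_names` and `daemons_missing_module`, are assumptions for illustration and should be checked against the real output of the version under test.

```python
def module_names(entries):
    """Collect module names from a list of entries.

    Entries are accepted either as plain strings or as dicts with a
    "name" key, since the exact layout is an assumption here.
    """
    names = set()
    for entry in entries or []:
        names.add(entry["name"] if isinstance(entry, dict) else entry)
    return names


def daemons_missing_module(mgr_map, module):
    """Return the names of mgr daemons that do not list `module`.

    `mgr_map` is the parsed JSON of `ceph mgr dump` (assumed layout).
    """
    missing = []
    # Active mgr: modules are assumed to be listed at the top level.
    if module not in module_names(mgr_map.get("available_modules")):
        missing.append(mgr_map.get("active_name") or "<active>")
    # Standby mgrs: each is assumed to carry its own module list.
    for standby in mgr_map.get("standbys", []):
        if module not in module_names(standby.get("available_modules")):
            missing.append(standby.get("name", "<standby>"))
    return missing
```

If the list comes back non-empty only for a map captured right after the mgr daemons start, that would be consistent with a timing issue rather than a version mismatch.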

All mons and mgrs are running the same ceph version when this error occurs:

2019-03-25T04:42:44.002 INFO:teuthology.orchestra.run.ovh027.stdout:ok: [ovh042.front.sepia.ceph.com] => {
2019-03-25T04:42:44.002 INFO:teuthology.orchestra.run.ovh027.stdout: "ansible_facts": {
2019-03-25T04:42:44.002 INFO:teuthology.orchestra.run.ovh027.stdout: "ceph_version": "14.2.0-458-g673f1e8"
2019-03-25T04:42:44.002 INFO:teuthology.orchestra.run.ovh027.stdout: },
2019-03-25T04:42:44.002 INFO:teuthology.orchestra.run.ovh027.stdout: "changed": false
2019-03-25T04:42:44.002 INFO:teuthology.orchestra.run.ovh027.stdout:}
2019-03-25T04:42:44.052 INFO:teuthology.orchestra.run.ovh027.stdout:ok: [ovh028.front.sepia.ceph.com] => {
2019-03-25T04:42:44.052 INFO:teuthology.orchestra.run.ovh027.stdout: "ansible_facts": {
2019-03-25T04:42:44.052 INFO:teuthology.orchestra.run.ovh027.stdout: "ceph_version": "14.2.0-458-g673f1e8"
2019-03-25T04:42:44.052 INFO:teuthology.orchestra.run.ovh027.stdout: },
2019-03-25T04:42:44.053 INFO:teuthology.orchestra.run.ovh027.stdout: "changed": false
2019-03-25T04:42:44.053 INFO:teuthology.orchestra.run.ovh027.stdout:}

Updated by Tim Serong about 5 years ago

I'm guessing `ceph mgr module enable restful` is happening very quickly after ceph mgr starts. We dealt with this in DeepSea by waiting on `test "$(ceph mgr dump | jq .available)" = "true"` in a loop (see https://github.com/SUSE/DeepSea/pull/1563).
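
The wait loop described above could be sketched in Python roughly as follows. This is not the DeepSea implementation, just an illustration of the same idea: poll `ceph mgr dump` until its `available` field is true, then enable the module. The function names, the timeout, and the polling interval are hypothetical, and the command runner is injectable so the loop can be exercised without a cluster.

```python
import json
import subprocess
import time


def mgr_is_available(dump_text):
    """Return True if the JSON text from `ceph mgr dump` has available == true."""
    try:
        return json.loads(dump_text).get("available") is True
    except (ValueError, AttributeError):
        # Empty or non-JSON output, or JSON that is not an object.
        return False


def ceph_mgr_dump(cluster="ceph"):
    """Run `ceph mgr dump` and return its stdout as text."""
    result = subprocess.run(
        ["ceph", "--cluster", cluster, "mgr", "dump", "--format", "json"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def wait_for_mgr(dump=ceph_mgr_dump, timeout=60.0, interval=2.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the mgr reports itself available; False on timeout."""
    deadline = clock() + timeout
    while True:
        if mgr_is_available(dump()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A deployment step would then call `wait_for_mgr()` and only run `ceph mgr module enable restful` once it returns True.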
