Bug #38941

Error when enabling mgr module 'restful'

Added by Guillaume Abrioux 3 months ago. Updated about 2 months ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version:
Start date: 03/26/2019
Due date:
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

It looks like some teuthology jobs are failing when deploying Ceph with ceph-ansible, with the following error:

2019-03-25T04:44:05.524 INFO:teuthology.orchestra.run.ovh027.stdout:failed: [ovh028.front.sepia.ceph.com -> ovh010.front.sepia.ceph.com] (item=restful) => {
2019-03-25T04:44:05.524 INFO:teuthology.orchestra.run.ovh027.stdout: "changed": true,
2019-03-25T04:44:05.524 INFO:teuthology.orchestra.run.ovh027.stdout: "cmd": [
2019-03-25T04:44:05.524 INFO:teuthology.orchestra.run.ovh027.stdout: "ceph",
2019-03-25T04:44:05.524 INFO:teuthology.orchestra.run.ovh027.stdout: "--cluster",
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "ceph",
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "mgr",
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "module",
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "enable",
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "restful"
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: ],
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "delta": "0:00:00.291293",
2019-03-25T04:44:05.525 INFO:teuthology.orchestra.run.ovh027.stdout: "end": "2019-03-25 04:44:05.477688",
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout: "item": "restful",
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout: "rc": 2,
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout: "start": "2019-03-25 04:44:05.186395"
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout:}
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout:
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout:STDERR:
2019-03-25T04:44:05.526 INFO:teuthology.orchestra.run.ovh027.stdout:
2019-03-25T04:44:05.527 INFO:teuthology.orchestra.run.ovh027.stdout:Error ENOENT: all mgr daemons do not support module 'restful', pass --force to force enablement
2019-03-25T04:44:05.527 INFO:teuthology.orchestra.run.ovh027.stdout:
2019-03-25T04:44:05.527 INFO:teuthology.orchestra.run.ovh027.stdout:
2019-03-25T04:44:05.527 INFO:teuthology.orchestra.run.ovh027.stdout:MSG:
2019-03-25T04:44:05.527 INFO:teuthology.orchestra.run.ovh027.stdout:
2019-03-25T04:44:05.527 INFO:teuthology.orchestra.run.ovh027.stdout:non-zero return code

I couldn't reproduce this issue outside of a teuthology context:

[root@mon0 ~]# ceph mgr module disable restful
[root@mon0 ~]# ceph mgr module enable restful
[root@mon0 ~]# ceph status
cluster:
id: 08a81dcb-bd86-406b-93e5-3fafa57ee3b8
health: HEALTH_OK

services:
mon: 3 daemons, quorum mon0,mon1,mon2 (age 64m)
mgr: mgr0(active, since 5s), standbys: mgr1

So I suspect a possible race condition in 'pending_map' here: https://github.com/ceph/ceph/commit/d953198aff3e5aaa6e2eadcf8a53c9e0279a30de#diff-0bd5017ae61455d420a77a69ed75b4d4R626

Is there a case where the cluster would report that the 'restful' module is not available on all mgr daemons even though they are all running the same version?

All mons and mgrs are running the same ceph version when this error occurs:

2019-03-25T04:42:44.002 INFO:teuthology.orchestra.run.ovh027.stdout:ok: [ovh042.front.sepia.ceph.com] => {
2019-03-25T04:42:44.002 INFO:teuthology.orchestra.run.ovh027.stdout: "ansible_facts": {
2019-03-25T04:42:44.002 INFO:teuthology.orchestra.run.ovh027.stdout: "ceph_version": "14.2.0-458-g673f1e8"
2019-03-25T04:42:44.002 INFO:teuthology.orchestra.run.ovh027.stdout: },
2019-03-25T04:42:44.002 INFO:teuthology.orchestra.run.ovh027.stdout: "changed": false
2019-03-25T04:42:44.002 INFO:teuthology.orchestra.run.ovh027.stdout:}
2019-03-25T04:42:44.052 INFO:teuthology.orchestra.run.ovh027.stdout:ok: [ovh028.front.sepia.ceph.com] => {
2019-03-25T04:42:44.052 INFO:teuthology.orchestra.run.ovh027.stdout: "ansible_facts": {
2019-03-25T04:42:44.052 INFO:teuthology.orchestra.run.ovh027.stdout: "ceph_version": "14.2.0-458-g673f1e8"
2019-03-25T04:42:44.052 INFO:teuthology.orchestra.run.ovh027.stdout: },
2019-03-25T04:42:44.053 INFO:teuthology.orchestra.run.ovh027.stdout: "changed": false
2019-03-25T04:42:44.053 INFO:teuthology.orchestra.run.ovh027.stdout:}
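
One way to see which daemon the error message refers to is to inspect the module lists in the `ceph mgr dump` output at the moment the enable command fails. The following is only a sketch: the field names `active_name`, `available_modules`, `standbys` and `name` are assumptions about the JSON layout of `ceph mgr dump` and may differ between releases, and `daemons_missing_module` is a hypothetical helper, not part of Ceph or ceph-ansible.

```python
import json


def _module_names(modules):
    """Accept module entries given as plain strings or as {"name": ...} objects."""
    return {m["name"] if isinstance(m, dict) else m for m in modules}


def daemons_missing_module(mgr_dump, module):
    """Return the mgr daemons whose reported module list lacks `module`.

    `mgr_dump` is the parsed JSON of `ceph mgr dump`. The keys used here
    are assumptions about that output, not a documented contract.
    """
    missing = []
    active = mgr_dump.get("active_name") or "<no active mgr>"
    if module not in _module_names(mgr_dump.get("available_modules", [])):
        missing.append(active)
    for standby in mgr_dump.get("standbys", []):
        if module not in _module_names(standby.get("available_modules", [])):
            missing.append(standby.get("name", "<unnamed standby>"))
    return missing
```

Feeding it `json.loads()` of a dump captured right after the failure would show whether a freshly started mgr had not yet reported its modules, which would fit the race suspected above even when all daemons run the same version.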

History

#1 Updated by Tim Serong 3 months ago

I'm guessing `ceph mgr module enable restful` is happening very quickly after ceph mgr starts. We dealt with this in DeepSea by waiting on `test "$(ceph mgr dump | jq .available)" = "true"` in a loop (see https://github.com/SUSE/DeepSea/pull/1563)
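
The wait loop described above could look roughly like the sketch below. It is not the DeepSea or ceph-ansible implementation. It only polls the `available` field of `ceph mgr dump`, the same field the `jq` test reads. The function names, the retry count and the delay are illustrative assumptions.

```python
import json
import subprocess
import time


def mgr_available(run=subprocess.check_output):
    """Return True if `ceph mgr dump` reports "available": true."""
    raw = run(["ceph", "mgr", "dump"])
    return json.loads(raw).get("available") is True


def wait_for_mgr(retries=30, delay=2.0, check=mgr_available, sleep=time.sleep):
    """Poll until the mgr is available; return False if retries run out.

    `retries` and `delay` are arbitrary defaults chosen for illustration.
    """
    for attempt in range(retries):
        if check():
            return True
        if attempt < retries - 1:
            sleep(delay)  # no need to sleep after the final attempt
    return False
```

A deployment script would call `wait_for_mgr()` and run `ceph mgr module enable restful` only once it returns True. The `run`, `check` and `sleep` parameters exist so the loop can be exercised without a running cluster.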

#2 Updated by Brad Hubbard 3 months ago

  • Project changed from Ceph to mgr

#3 Updated by Brad Hubbard 3 months ago

  • Source set to Development

#4 Updated by Brad Hubbard 3 months ago

Related to #38853 ?

#5 Updated by Guillaume Abrioux 3 months ago

I've implemented the fix suggested by Tim Serong in c1 to deal with this issue at the ceph-ansible level.

I guess we can close this since it looks more like an orchestration issue.

#6 Updated by Sebastian Wagner about 2 months ago

  • Status changed from New to Closed

Closed accordingly.
