Bug #38216

closed

"HEALTH_WARN 3 monitors have not enabled msgr2" in rados

Added by Yuri Weinstein about 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rados
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/yuriw-2019-02-04_20:42:30-upgrade:mimic-x-master-distro-basic-smithi/
Jobs: all
Logs: /a/yuriw-2019-02-04_20:42:30-upgrade:mimic-x-master-distro-basic-smithi/3550371/teuthology.log

2019-02-05T12:35:08.110 INFO:teuthology.misc.health.smithi121.stderr:2019-02-05 12:35:08.124 7f00ff1bb700 -1 WARNING: all dangerous and experimental features are enabled.
2019-02-05T12:35:08.197 INFO:teuthology.misc.health.smithi121.stderr:2019-02-05 12:35:08.210 7f00ff1bb700 -1 WARNING: all dangerous and experimental features are enabled.
2019-02-05T12:35:08.197 INFO:teuthology.misc.health.smithi121.stderr:2019-02-05 12:35:08.210 7f00ff1bb700 -1 asok(0x7f00f8000bf0) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-client.admin.48071.asok': (13) Permission denied
2019-02-05T12:35:08.542 INFO:teuthology.misc.health.smithi121.stdout:HEALTH_WARN 3 monitors have not enabled msgr2
2019-02-05T12:35:08.557 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 3 monitors have not enabled msgr2
2019-02-05T12:35:09.109 INFO:tasks.rados.rados.0.smithi130.stdout:3724: done (0 left)
2019-02-05T12:35:09.110 INFO:tasks.rados.rados.0.smithi130.stdout:append oid 8 current snap is 367
2019-02-05T12:35:09.110 INFO:tasks.rados.rados.0.smithi130.stdout:3725:  seq_num 1148 ranges {3383296=32768}
2019-02-05T12:35:09.157 INFO:tasks.rados.rados.0.smithi130.stdout:3725:  writing smithi130768663-8 from 3383296 to 3416064 tid 1
2019-02-05T12:35:09.157 INFO:tasks.rados.rados.0.smithi130.stdout:3726: rollback oid 37 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo current snap is 367
2019-02-05T12:35:09.158 INFO:tasks.rados.rados.0.smithi130.stdout:rollback oid 37 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo to 362
2019-02-05T12:35:09.158 INFO:tasks.rados.rados.0.smithi130.stdout:3727: snap_create
2019-02-05T12:35:09.167 INFO:tasks.rados.rados.0.smithi130.stdout:3726:  finishing rollback tid 1 to smithi130768663-37 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
2019-02-05T12:35:09.167 INFO:tasks.rados.rados.0.smithi130.stdout:update_object_version oid 37 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo v 0 (ObjNum 22 snap 0 seq_num 22) dirty dne
2019-02-05T12:35:09.174 INFO:tasks.rados.rados.0.smithi130.stdout:3725:  finishing write tid 1 to smithi130768663-8
2019-02-05T12:35:09.175 INFO:tasks.rados.rados.0.smithi130.stdout:3725:  finishing write tid 3 to smithi130768663-8
2019-02-05T12:35:09.177 INFO:tasks.rados.rados.0.smithi130.stdout:3725:  finishing write tid 4 to smithi130768663-8
2019-02-05T12:35:09.178 INFO:tasks.rados.rados.0.smithi130.stdout:update_object_version oid 8 v 2643 (ObjNum 1148 snap 367 seq_num 1148) dirty exists
2019-02-05T12:35:09.178 INFO:tasks.rados.rados.0.smithi130.stdout:3725:  left oid 8 (ObjNum 1148 snap 367 seq_num 1148)
2019-02-05T12:35:09.558 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 86, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 65, in run_one_task
    return task(**kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/task/parallel.py", line 56, in task
    p.spawn(_run_spawned, ctx, confg, taskname)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/parallel.py", line 85, in __exit__
...
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/misc.py", line 936, in wait_until_healthy
    while proceed():
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/contextutil.py", line 132, in __call__
    raise MaxWhileTries(error_msg)
MaxWhileTries: 'wait_until_healthy' reached maximum tries (150) after waiting for 900 seconds
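
For readers unfamiliar with the failure mode, the traceback above is a bounded polling loop giving up: health never left HEALTH_WARN, so the loop used all 150 tries over 900 seconds. The following is a minimal sketch of that pattern, not the teuthology implementation. The `get_health` callable, the `sleeper` parameter, and the 6-second interval (inferred from 900 / 150) are assumptions made for illustration.

```python
import time


class MaxWhileTries(Exception):
    """Raised when the polling loop runs out of attempts."""


def wait_until_healthy(get_health, tries=150, sleep=6, sleeper=time.sleep):
    """Poll get_health() until it reports HEALTH_OK or tries are exhausted.

    get_health is a hypothetical callable returning the health summary
    string, e.g. "HEALTH_WARN 3 monitors have not enabled msgr2".
    Returns the number of attempts it took to see HEALTH_OK.
    """
    for attempt in range(tries):
        status = get_health()
        if status.startswith("HEALTH_OK"):
            return attempt + 1
        # Any other state (HEALTH_WARN, HEALTH_ERR) counts as "not yet healthy".
        sleeper(sleep)
    raise MaxWhileTries(
        "'wait_until_healthy' reached maximum tries (%d) after waiting for %d seconds"
        % (tries, tries * sleep)
    )
```

Because the msgr2 warning does not clear on its own, a loop of this shape will always time out, whatever the retry budget.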
Actions #1

Updated by Yuri Weinstein about 5 years ago

This also looks like the cause of the failed tests for https://github.com/ceph/ceph/pull/26302

Actions #2

Updated by Yuri Weinstein about 5 years ago

  • Assignee set to Sage Weil

@Sage Weil, what do you suggest?

Actions #3

Updated by Nathan Cutler about 5 years ago

Apparently we'll need to add a command enabling msgr2 on the MONs, to be run after the upgrade completes. But I don't know what this command is.
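
For reference, the monitor command that enables msgr2 in Nautilus is, as far as I know, `ceph mon enable-msgr2`, run once all monitors have been upgraded. The thread does not say whether the upgrade suite ended up running exactly this, so treat the following as a sketch under that assumption. The helper names and the injectable `run` parameter are hypothetical and are not part of teuthology or ceph-qa-suite.

```python
import subprocess


def enable_msgr2_cmd(cluster="ceph"):
    """Build the argv for enabling msgr2 on the monitors.

    Assumes the Nautilus CLI command `ceph mon enable-msgr2`.
    """
    return ["ceph", "--cluster", cluster, "mon", "enable-msgr2"]


def enable_msgr2(run=subprocess.check_call, cluster="ceph"):
    """Run the command via `run` (injectable so it can be exercised without a cluster)."""
    cmd = enable_msgr2_cmd(cluster)
    run(cmd)
    return cmd
```

A test harness could call `enable_msgr2()` as a post-upgrade step before waiting for HEALTH_OK. Passing a stub for `run` lets the command line be checked without a live cluster.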

Actions #4

Updated by Sage Weil about 5 years ago

  • Status changed from New to In Progress
  • Priority changed from Normal to Urgent

I have a big wip-v2-upgrade branch in progress that resolves all of the (many) upgrade problems.

Actions #5

Updated by Sage Weil about 5 years ago

  • Project changed from mgr to RADOS
Actions #7

Updated by Sage Weil about 5 years ago

  • Status changed from In Progress to Resolved
Actions #8

Updated by Greg Farnum about 5 years ago

  • Project changed from RADOS to Messengers