Bug #43882

closed

osd to mon connection lost, osd stuck down

Added by Sage Weil about 4 years ago. Updated almost 4 years ago.

Status: Can't reproduce
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is a similar symptom to #43825, but it does not appear to be related to split/merge.

The OSD is marked down but is actually up, and it is a bit (but not far) behind on OSDMaps. Its monc is not talking to the mon.

ceph daemon commands hang.

The OSD was running under valgrind.
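A rough way to check for the same symptom on a live cluster is sketched below. It is only an illustration, not part of the original report: the helper names (`run_json`, `classify`, `check_osd`) and the 10-second timeout are assumptions, and it relies on `ceph daemon osd.N status` reporting `newest_map` and on `ceph osd dump --format json` reporting `epoch` and per-OSD `up` flags. The timeout matters because the admin socket commands hang in this state.

```python
import json
import subprocess


def run_json(cmd, timeout=10.0):
    """Run a command and parse stdout as JSON; return None on hang or failure."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, check=True
        )
        return json.loads(proc.stdout)
    except (
        subprocess.TimeoutExpired,
        subprocess.CalledProcessError,
        FileNotFoundError,
        json.JSONDecodeError,
    ):
        return None


def classify(osd_status, cluster_epoch, marked_up):
    """Summarize an OSD's state from its admin socket status and the cluster map.

    osd_status is the parsed output of `ceph daemon osd.N status`, or None
    if the admin socket did not answer (hypothetical helper, for illustration).
    """
    if osd_status is None:
        return "admin socket unresponsive"
    lag = cluster_epoch - osd_status["newest_map"]
    if not marked_up:
        return f"marked down but daemon responds ({lag} epochs behind)"
    return f"up ({lag} epochs behind)"


def check_osd(osd_id, timeout=10.0):
    """Query the mon and the OSD admin socket, then classify the OSD."""
    dump = run_json(["ceph", "osd", "dump", "--format", "json"], timeout)
    if dump is None:
        return "could not reach the mon"
    marked_up = any(o["osd"] == osd_id and o["up"] == 1 for o in dump["osds"])
    status = run_json(["ceph", "daemon", f"osd.{osd_id}", "status"], timeout)
    return classify(status, dump["epoch"], marked_up)
```

Calling `check_osd(1)` on the host running osd.1 would be the intended use. In the situation described here, the expected result is "admin socket unresponsive", since the daemon commands never return.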

/a/sage-2020-01-28_23:42:06-rados-wip-sage-testing-2020-01-28-1413-distro-basic-smithi/4715655
description: rados/verify/{centos_latest.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml}
d-thrash/default/{default.yaml thrashosds-health.yaml} msgr-failures/few.yaml msgr/async-v1only.yaml
objectstore/bluestore-bitmap.yaml rados.yaml tasks/rados_api_tests.yaml validater/valgrind.yaml}

ceph-osd.1.log

#1

Updated by Sage Weil about 4 years ago

  • Status changed from New to Need More Info
#2

Updated by Sage Weil about 4 years ago

  • Subject changed from "osd to mon connection lost, osd stuck down" to "monclient connection to mon failed"
  • Status changed from Need More Info to In Progress
  • Assignee set to Sage Weil
#3

Updated by Sage Weil about 4 years ago

  • Subject changed from "monclient connection to mon failed" to "osd to mon connection lost, osd stuck down"
  • Status changed from In Progress to Need More Info
  • Assignee deleted (Sage Weil)

I thought I had reproduced this, but it was a bug in another PR I was testing.

#4

Updated by Neha Ojha almost 4 years ago

  • Status changed from Need More Info to Can't reproduce