Bug #5082 (closed): OSD wrongly marked as down

Added by Ivan Kudryavtsev almost 11 years ago. Updated almost 11 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
other
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

While manipulating the CRUSH map with

ceph osd crush set 17 osd.17 0.8 pool=default host=ceph-osd-2-1

I see messages like this:

2013-05-16 12:13:59.089557 mon.0 [INF] osdmap e26129: 37 osds: 32 up, 36 in
2013-05-16 12:15:30.739882 osd.31 [WRN] map e26100 wrongly marked me down
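For reference, the monitors' view can be cross-checked with the standard ceph CLI (a minimal sketch; the grep is only there to filter the output):

ceph osd stat                # summary line: how many OSDs are up/in
ceph osd tree | grep down    # which OSD ids the monitors currently consider down
ceph osd dump | grep down    # the same ids, with their addresses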

Actually, I have 36 working OSDs, but 4 of them are marked as down even though the daemons are alive. I'm using a replication factor of 3, and as you can see, even with that replication factor I can end up in a situation where all of the OSDs holding a PG are marked as offline. Is this expected behavior, and how can it be avoided?

After a few seconds they come back online and I have 36 of 37 OSDs up, which is OK.
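
A possible mitigation for planned CRUSH changes, sketched here but not verified against this version, is to set the cluster-wide nodown flag around the change (with the caveat that genuinely failed OSDs will also not be marked down while the flag is set):

ceph osd set nodown          # monitors will not mark OSDs down while the flag is set
ceph osd crush set 17 osd.17 0.8 pool=default host=ceph-osd-2-1
ceph osd unset nodown        # restore normal failure detection afterwards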

