Bug #5365 (closed): Massive OSD flaps

Added by Ivan Kudryavtsev almost 11 years ago. Updated almost 11 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi all.

Today I added one more node to my Ceph cluster and it became unstable; it cannot run with the 'nodown' flag cleared because OSDs are flapping constantly.

What I did:
I added a new host to my CRUSH map:
ceph-osd-1-2
and created two OSDs:
osd.37 (XFS on 3 TB SATA + SSD for journal)
osd.38 (XFS on 3 TB SATA + SSD for journal)
and added them to the host with weight 0.
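
In terms of commands, the change amounts to roughly the following (a sketch using the generic crush subcommands, not an exact transcript of my session; the bucket and host names are the ones from my tree below):

ceph osd crush add-bucket ceph-osd-1-2 host
ceph osd crush move ceph-osd-1-2 rack=zc-1-rack-3
ceph osd crush add osd.37 0 host=ceph-osd-1-2
ceph osd crush add osd.38 0 host=ceph-osd-1-2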

Next I started populating them with data little by little, raising the weight by +0.25 at a time. After I did that, Ceph started kicking out a lot of OSDs en masse, so I set 'ceph osd set nodown', because otherwise the system was completely unusable: it kicked out up to 70% of my 38 OSDs.
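
The reweighting and the flag looked roughly like this (again a sketch, not a transcript):

ceph osd crush reweight osd.37 0.25
ceph osd crush reweight osd.38 0.25
# once the mass of down reports started:
ceph osd set nodown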

Now I see:

root@ceph-mon-mds-0:/var/log/ceph# ceph -s
health HEALTH_WARN nodown flag(s) set
monmap e1: 3 mons at {0=10.252.0.3:6789/0,1=10.252.0.4:6789/0,2=10.252.0.2:6789/0}, election epoch 5436, quorum 0,1,2 0,1,2
osdmap e33965: 39 osds: 38 up, 38 in
pgmap v10504240: 1344 pgs: 1344 active+clean; 3589 GB data, 10808 GB used, 25911 GB / 36720 GB avail; 1623KB/s wr, 149op/s
mdsmap e486: 1/1/1 up {0=2=up:active}, 2 up:standby

but if I clear nodown it starts kicking OSDs out as before, so I am unable to operate normally.
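
The behaviour is easy to see just by toggling the flag and watching the cluster (sketch):

ceph osd unset nodown
ceph -w              # OSDs immediately start being marked down
ceph osd set nodown  # back to the workaround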

My ceph osd tree is:

# id    weight  type name       up/down reweight
-1      22.9    pool default
-11     14.65           datacenter zc-1
-10     14.65                   room zc-1-room-1
-8      7.05                            rack zc-1-rack-1
-4      7.05                                    host ceph-osd-2-1
1       0.4                                             osd.1   up      1
5       0.8                                             osd.5   up      1
6       0.25                                            osd.6   up      1
9       0.6                                             osd.9   up      1
12      0.6                                             osd.12  up      1
15      0.5                                             osd.15  up      1
16      0.25                                            osd.16  up      1
17      0.4                                             osd.17  up      1
18      0.25                                            osd.18  up      1
30      1                                               osd.30  up      1
31      1                                               osd.31  up      1
32      1                                               osd.32  up      1
-9      7.1                             rack zc-1-rack-5
-5      7.1                                     host ceph-osd-1-1
2       0.5                                             osd.2   up      1
7       0.4                                             osd.7   up      1
8       0.25                                            osd.8   up      1
11      0.5                                             osd.11  up      1
13      0.5                                             osd.13  up      1
19      0.6                                             osd.19  up      1
20      0.25                                            osd.20  up      1
21      0.5                                             osd.21  up      1
22      0.6                                             osd.22  up      1
27      1                                               osd.27  up      1
28      1                                               osd.28  up      1
29      1                                               osd.29  up      1
-13     0.5                             rack zc-1-rack-3
-12     0.5                                     host ceph-osd-1-2
37      0.25                                            osd.37  up      1
38      0.25                                            osd.38  up      1
-7      8.25            datacenter ms-1
-6      8.25                    room ms-1-room-1
-3      8.25                            rack ms-1-rack-1
-2      8.25                                    host ceph-osd-3-1
0       0.25                                            osd.0   up      1
3       0.25                                            osd.3   up      1
4       0.25                                            osd.4   up      1
10      0.25                                            osd.10  up      1
14      0.25                                            osd.14  up      1
23      1                                               osd.23  up      1
24      0.5                                             osd.24  up      1
25      1                                               osd.25  up      1
26      0.5                                             osd.26  up      1
33      1                                               osd.33  up      1
34      1                                               osd.34  up      1
35      1                                               osd.35  up      1
36      1                                               osd.36  down    0

osd.36 is down, and that is expected.

I've tried running 'ceph osd set noout' before 'ceph osd unset nodown', but the flapping starts again, so I'm unable to go back to normal operation.
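
The sequence I tried was essentially (sketch):

ceph osd set noout     # keep OSDs from being marked out and data from being remapped
ceph osd unset nodown  # let the monitors mark OSDs down again
# flapping resumes almost immediately, so:
ceph osd set nodown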

The only difference between the old nodes and the new one is the Ceph minor version:
new: ceph version 0.56.6 (95a0bda7f007a33b0dc7adf4b330778fa1e5d70c)
old: ceph version 0.56.4 (63b0f854d1cef490624de5d6cf9039735c7de5ca)

Could that be the reason? On the other hand, I'm unable to upgrade the old 0.56.4 nodes while nodown is set, because that would leave operations stalled and waiting, am I right or not?

I use pool size 3 for all of the pools, but without nodown the system kicks out even all 3 replicas simultaneously.

So the question is how this situation can be resolved.

#1 - Updated by Ivan Kudryavtsev almost 11 years ago

I upgraded the full cluster to
ceph version 0.56.6 (95a0bda7f007a33b0dc7adf4b330778fa1e5d70c)

but it is still flapping with nodown cleared.
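
A quick way to confirm that the daemons are really running the new code after the restarts is to check the binary versions on each node (sketch):

ceph-osd --version
ceph-mon --version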

#2 - Updated by Ivan Kudryavtsev almost 11 years ago

During the upgrade I restarted the services on all nodes.
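
For reference, a rolling upgrade of this kind looks roughly like the following on each node (a sketch assuming Debian/Ubuntu packages and the stock sysvinit script; adjust for your distro and init system):

apt-get update && apt-get install ceph   # pull in the 0.56.6 packages
/etc/init.d/ceph restart                 # restart the mon/osd/mds daemons on this node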

#3 - Updated by Ivan Kudryavtsev almost 11 years ago

I found a networking bug (the hosts did not have full connectivity). This ticket can be closed.
The reason was that the new OSD host was unable to send heartbeats to its neighbours, while the neighbours could heartbeat it, so it started to flap.
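
A quick way to catch this kind of asymmetric failure is to test reachability in both directions between the new host and an existing OSD host, including the OSD ports (a sketch; by default the daemons listen on ports from 6800 upwards):

# from the new host towards an existing OSD host
ping -c 3 ceph-osd-2-1
nc -zv ceph-osd-2-1 6800
# and the same checks in the opposite direction, from ceph-osd-2-1
ping -c 3 ceph-osd-1-2
nc -zv ceph-osd-1-2 6800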

#4 - Updated by Sage Weil almost 11 years ago

  • Status changed from New to Rejected

Note that the current development releases include more robust heartbeat checks and a backoff behavior that prevents this sort of partial network failure from causing flapping. It's too large a change to backport to cuttlefish, however. Until dumpling it is something users just need to watch out for.

Closing this out!
