Support #18479

Kraken split-brain, OSD auth connection drops

Added by Rob Longstaff over 7 years ago. Updated almost 7 years ago.

Status: Closed
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%

Description

I'm having multiple issues after updating my cluster to Kraken 11.1.1 from 11.0. The cluster is composed of 20 nodes with 45 OSDs each and 3 monitors. After the upgrade, the monitors lost quorum and entered a split-brain situation. I've worked around that for now by dropping to a single monitor. Before reaching that point I had performed all the standard troubleshooting from the docs and even did a rolling rebuild of the monfs on mon.2 and mon.3. They joined and replicated, then ended up split-brain again: mon.3 took the role of peon and showed mon.1 and mon.2 in the quorum, while mon.1 and mon.2 both thought they were the leader and had only themselves and mon.3 in their respective quorums.
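
For anyone following along, this is roughly how I've been comparing each monitor's own view of the quorum against the cluster-wide one; the mon id below is just the surviving monitor, substitute per host:

[mon]# ceph daemon mon.ceph-mon01 mon_status | grep -E '"state"|"quorum"'   # this mon's local view: leader/peon and who it thinks is in quorum
[mon]# ceph quorum_status --format json-pretty                              # quorum as reported through the cluster command path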

The second issue is that about 600 of my 900 OSDs are showing as down. Part of the cluster refuses to register with the mons. The OSDs' debug logs show their auth connections being dropped, while other requests are not. The monitor logs hint that the connection close is unexpected on their side as well, leaving 'malformed request' errors behind. I've tried 2 different mon hosts with the same results.
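
The OSD log below was taken with monclient debugging raised; something like the following should reproduce it, and bumping debug-auth/debug-ms on the mon should show more detail around the 'malformed request' errors (the mon and osd ids are just the ones on these hosts):

[mon]# ceph tell mon.ceph-mon01 injectargs '--debug-auth 20 --debug-ms 1'
[osd]# ceph daemon osd.0 config set debug_monc 20   # monclient chatter, including the auth handshake
[osd]# ceph daemon osd.0 config set debug_ms 1      # messenger-level connection drops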

[ mon]# rpm -qa|grep ceph
ceph-common-11.1.1-0.el7.x86_64
ceph-osd-11.1.1-0.el7.x86_64
libcephfs2-11.1.1-0.el7.x86_64
python-cephfs-11.1.1-0.el7.x86_64
ceph-selinux-11.1.1-0.el7.x86_64
ceph-mon-11.1.1-0.el7.x86_64
ceph-mgr-11.1.1-0.el7.x86_64
ceph-11.1.1-0.el7.x86_64
ceph-zabbix-0.0.2-1.x86_64
ceph-base-11.1.1-0.el7.x86_64
ceph-mds-11.1.1-0.el7.x86_64

===== STATUS =====

cluster 8e43a3e9-483b-4310-849c-973a929c9b0e
health HEALTH_ERR
8286 pgs are stuck inactive for more than 300 seconds
8286 pgs stale
8286 pgs stuck stale
too many PGs per OSD (513 > max 300)
all OSDs are running kraken or later but the 'require_kraken_osds' osdmap flag is not set
monmap e3: 1 mons at {ceph-mon01=10.24.44.12:6789/0}
election epoch 3852, quorum 0 ceph-mon01
mgr active: ceph-mon01
osdmap e42376: 900 osds: 269 up, 269 in; 6599 remapped pgs
flags sortbitwise
pgmap v13330951: 11904 pgs, 21 pools, 14017 GB data, 495 kobjects
7276 GB used, 1949 TB / 1957 TB avail
8286 stale+active+clean
3618 active+clean
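
Side note on the two non-stale health warnings above: as I understand it they would normally be addressed along these lines once the cluster is otherwise healthy (the 600 threshold is just an example value):

[mon]# ceph osd set require_kraken_osds                            # only once every OSD in the map is running Kraken
[mon]# ceph tell mon.* injectargs '--mon-pg-warn-max-per-osd 600'  # or set mon_pg_warn_max_per_osd in ceph.conf to raise the warning threshold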

=====OSD=====

2017-01-10 20:18:21.673074 7f94833ea940 10 monclient(hunting): auth_supported 2 method cephx
2017-01-10 20:18:21.673390 7f94833ea940 20 log_channel(cluster) update_config log_to_monitors {default=true} log_to_syslog {default=false} log_channels {audit=local0,default=daemon} log_prios {default=info}
2017-01-10 20:18:21.673402 7f94833ea940 10 log_channel(cluster) update_config to_monitors: true to_syslog: false syslog_facility: daemon prio: info to_graylog: false graylog_host: 127.0.0.1 graylog_port: 12201)
2017-01-10 20:18:21.673408 7f94833ea940 -1 osd.0 42376 log_to_monitors {default=true}
2017-01-10 20:18:21.674267 7f94833ea940 10 monclient(hunting): _reopen_session rank -1 name
2017-01-10 20:18:21.674304 7f94833ea940 10 monclient(hunting): picked mon.noname-a con 0x5579f2bf8800 addr 10.24.44.12:6789/0
2017-01-10 20:18:21.674319 7f94833ea940 10 monclient(hunting): _send_mon_message to mon.noname-a at 10.24.44.12:6789/0
2017-01-10 20:18:21.674333 7f94833ea940 10 monclient(hunting): renew_subs
2017-01-10 20:18:21.675416 7f9471d3b700 10 monclient(hunting): handle_monmap mon_map magic: 0 v1
2017-01-10 20:18:21.675455 7f9471d3b700 10 monclient(hunting): got monmap 3, mon.noname-a is now rank -1
2017-01-10 20:18:21.675461 7f9471d3b700 10 monclient(hunting): dump:
epoch 3
fsid 8e43a3e9-483b-4310-849c-973a929c9b0e
last_changed 2017-01-09 22:58:45.583214
created 0.000000
0: 10.24.44.12:6789/0 mon.ceph-mon01

2017-01-10 20:18:21.675530 7f9471d3b700 10 monclient(hunting): my global_id is 55885297
2017-01-10 20:18:21.675635 7f9471d3b700 10 monclient(hunting): _send_mon_message to mon.ceph-mon01 at 10.24.44.12:6789/0
2017-01-10 20:18:21.676275 7f9471d3b700 10 monclient(hunting): _send_mon_message to mon.ceph-mon01 at 10.24.44.12:6789/0
2017-01-10 20:18:21.677000 7f9471d3b700 1 monclient(hunting): found mon.ceph-mon01
2017-01-10 20:18:21.677006 7f9471d3b700 10 monclient: _send_mon_message to mon.ceph-mon01 at 10.24.44.12:6789/0
2017-01-10 20:18:21.677038 7f9471d3b700 10 monclient: _check_auth_rotating renewing rotating keys (they expired before 2017-01-10 20:17:51.677037)
2017-01-10 20:18:21.677051 7f9471d3b700 10 monclient: _send_mon_message to mon.ceph-mon01 at 10.24.44.12:6789/0
2017-01-10 20:18:21.677146 7f94833ea940 5 monclient: authenticate success, global_id 55885297
2017-01-10 20:18:21.677162 7f94833ea940 10 monclient: wait_auth_rotating waiting (until 2017-01-10 20:18:51.677161)
2017-01-10 20:18:21.677426 7f9471d3b700 10 monclient: handle_monmap mon_map magic: 0 v1
2017-01-10 20:18:21.677447 7f9471d3b700 10 monclient: got monmap 3, mon.ceph-mon01 is now rank 0
2017-01-10 20:18:21.677533 7f9471d3b700 10 monclient: dump:
epoch 3
fsid 8e43a3e9-483b-4310-849c-973a929c9b0e
last_changed 2017-01-09 22:58:45.583214
created 0.000000
0: 10.24.44.12:6789/0 mon.ceph-mon01

2017-01-10 20:18:21.677662 7f9471d3b700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2017-01-10 20:17:51.677661)
2017-01-10 20:18:21.677711 7f94833ea940 10 monclient: wait_auth_rotating done
2017-01-10 20:18:21.677768 7f94833ea940 10 monclient: _send_command 1 [{"prefix": "osd crush create-or-move", "id": 0, "weight":7.2754, "args": ["host=ceph001", "root=default"]}]
2017-01-10 20:18:21.677778 7f94833ea940 10 monclient: _send_mon_message to mon.ceph-mon01 at 10.24.44.12:6789/0
2017-01-10 20:18:21.679307 7f9471d3b700 10 monclient: handle_mon_command_ack 1 [{"prefix": "osd crush create-or-move", "id": 0, "weight":7.2754, "args": ["host=ceph001", "root=default"]}]
2017-01-10 20:18:21.679311 7f9471d3b700 10 monclient: _finish_command 1 = 0 create-or-move updated item name 'osd.0' weight 7.2754 at location {host=ceph001,root=default} to crush map
2017-01-10 20:18:22.473174 7f947d7fe700 0 -- 10.24.44.13:6808/1124211 >> - conn(0x5579f5182000 :6808 s=STATE_ACCEPTING_WAIT_BANNER_ADDR pgs=0 cs=0 l=0).fault with nothing to send and in the half accept state just closed
2017-01-10 20:18:24.023565 7f94833ea940 0 osd.0 42376 done with init, starting boot process

===== MON =====
2017-01-10 20:59:00.361768 7f8e41b71700 0 mon.ceph-mon01@0(leader).auth v14492 caught error when trying to handle auth request, probably malformed request
2017-01-10 20:59:00.453098 7f8e41b71700 0 mon.ceph-mon01@0(leader).auth v14492 caught error when trying to handle auth request, probably malformed request
2017-01-10 20:59:01.798183 7f8e41b71700 0 mon.ceph-mon01@0(leader).auth v14492 caught error when trying to handle auth request, probably malformed request
2017-01-10 20:59:03.883135 7f8e41b71700 0 mon.ceph-mon01@0(leader).auth v14492 caught error when trying to handle auth request, probably malformed request
2017-01-10 20:59:09.582994 7f8e41b71700 0 mon.ceph-mon01@0(leader).auth v14492 caught error when trying to handle auth request, probably malformed request

Actions #1

Updated by Greg Farnum almost 7 years ago

  • Tracker changed from Bug to Support
  • Status changed from New to Closed

Well, all those issues probably tie back to the first one, where the monitors are disagreeing about the quorum state. That is, on the whole, pretty much impossible, so it means either they were dramatically misconfigured or else there were election storms rolling around and they were cycling through leadership. Possibly cluster thrashing or something.
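
A quick way to tell a stable split from an election storm, assuming the mons respond at all, is to watch whether the election epoch keeps climbing and to rule out clock skew (commands below are generic, run on a mon host):

[mon]# while true; do ceph quorum_status --format json-pretty | grep '"election_epoch"'; sleep 10; done   # a steadily climbing epoch means elections are cycling
[mon]# ceph health detail | grep -i 'clock skew'
[mon]# ntpq -p   # on each monitor host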
