Two of the mons were still running jewel (10.2.9) even though the packages were at 12.1.1. That seems to be an issue with systemd (I will raise a separate issue for it):
[ubuntu@vpm089 ~]$ sudo ceph daemon mon.vpm089 version
{"version":"10.2.9"}
[ubuntu@vpm089 ~]$ rpm -qa | grep ceph
libcephfs2-12.1.1-0.el7.x86_64
ceph-mon-12.1.1-0.el7.x86_64
iozone-3.424-2_ceph.el7.centos.x86_64
ceph-selinux-12.1.1-0.el7.x86_64
ceph-test-12.1.1-0.el7.x86_64
ceph-release-1-1.el7.noarch
python-cephfs-12.1.1-0.el7.x86_64
ceph-mds-12.1.1-0.el7.x86_64
ceph-radosgw-12.1.1-0.el7.x86_64
ceph-common-12.1.1-0.el7.x86_64
ceph-osd-12.1.1-0.el7.x86_64
ceph-mgr-12.1.1-0.el7.x86_64
ceph-base-12.1.1-0.el7.x86_64
ceph-12.1.1-0.el7.x86_64
mod_fastcgi-2.4.7-1.ceph.el7.centos.x86_64
[ubuntu@vpm005 ~]$ sudo ceph daemon mon.vpm005 version
{"version":"10.2.9"}
[ubuntu@vpm161 ~]$ sudo ceph daemon mon.vpm161 version
{"version":"12.1.1","release":"luminous","release_type":"rc"}
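The mismatch between the running daemon and the installed package can be checked mechanically. The following is only a minimal sketch: the helper name `mon_version_mismatch` is hypothetical, and it assumes the JSON printed by `ceph daemon mon.<id> version` has already been captured as a string and that the package version has been trimmed to its upstream part (e.g. `12.1.1` from `ceph-mon-12.1.1-0.el7.x86_64`).

```python
import json


def mon_version_mismatch(version_json: str, package_version: str) -> bool:
    """Return True when the running daemon reports a different version
    than the installed package (hypothetical helper, not part of Ceph)."""
    running = json.loads(version_json)["version"]
    return running != package_version


# Admin-socket outputs copied from the report above.
jewel_output = '{"version":"10.2.9"}'
luminous_output = '{"version":"12.1.1","release":"luminous","release_type":"rc"}'

print(mon_version_mismatch(jewel_output, "12.1.1"))     # daemon needs a restart
print(mon_version_mismatch(luminous_output, "12.1.1"))  # daemon matches package
```

A mon for which this returns True is still running the old binary and would need to be restarted before any luminous-only step such as creating a mgr.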
After restarting both of those mons and verifying that they were on 12.1.1 and that all mons were up, I reissued the mgr create, but this time all 3 mons crashed.
[ubuntu@vpm005 ~]$ sudo ceph -s
cluster:
id: 76b054f1-989f-4dab-983b-6cbe87eb5c2f
health: HEALTH_ERR
66 pgs are stuck inactive for more than 60 seconds
100 pgs degraded
100 pgs stuck degraded
66 pgs stuck inactive
100 pgs stuck unclean
100 pgs stuck undersized
100 pgs undersized
3 requests are blocked > 32 sec
recovery 36/60 objects degraded (60.000%)
recovery 4/60 objects misplaced (6.667%)
mds cluster is degraded
services:
mon: 3 daemons, quorum vpm005,vpm089,vpm161
mgr: no daemons active
mds: 1/1/1 up {0=vpm089=up:replay}
osd: 9 osds: 3 up, 3 in; 100 remapped pgs
data:
pools: 3 pools, 100 pgs
objects: 20 objects, 2068 bytes
usage: 104 MB used, 584 GB / 584 GB avail
pgs: 66.000% pgs not active
36/60 objects degraded (60.000%)
4/60 objects misplaced (6.667%)
66 undersized+degraded+peered
34 active+undersized+degraded
[ubuntu@vpm089 ceph-deploy]$ ./ceph-deploy mgr create vpm161
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ubuntu/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.38): ./ceph-deploy mgr create vpm161
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] mgr : [('vpm161', 'vpm161')]
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x16d9fc8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] func : <function mgr at 0x1668410>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts vpm161:vpm161
Warning: Permanently added 'vpm161,172.21.2.161' (ECDSA) to the list of known hosts.
[vpm161][DEBUG ] connection detected need for sudo
Warning: Permanently added 'vpm161,172.21.2.161' (ECDSA) to the list of known hosts.
[vpm161][DEBUG ] connected to host: vpm161
[vpm161][DEBUG ] detect platform information from remote host
[vpm161][DEBUG ] detect machine type
[ceph_deploy.mgr][INFO ] Distro info: CentOS Linux 7.3.1611 Core
[ceph_deploy.mgr][DEBUG ] remote host will use systemd
[ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to vpm161
[vpm161][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[vpm161][DEBUG ] create path if it doesn't exist
[vpm161][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.vpm161 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-vpm161/keyring
[vpm161][WARNIN] No data was received after 300 seconds, disconnecting...
[vpm161][INFO ] Running command: sudo systemctl enable ceph-mgr@vpm161
[vpm161][WARNIN] Created symlink from /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@vpm161.service to /usr/lib/systemd/system/ceph-mgr@.service.
[vpm161][INFO ] Running command: sudo systemctl start ceph-mgr@vpm161
[vpm161][INFO ] Running command: sudo systemctl enable ceph.target
The crash seen in the monitor log on vpm089 looks like this:
-38> 2017-08-01 19:48:36.532727 7fe3d70a3700 1 -- 172.21.2.89:6789/0 _send_message--> mon.2 172.21.2.161:6789/0 -- paxos(lease lc 118352 fc 117720 pn 0 opn 0) v4 -- ?+0 0x7fe3e876f000
-37> 2017-08-01 19:48:36.532737 7fe3d70a3700 1 -- 172.21.2.89:6789/0 --> 172.21.2.161:6789/0 -- paxos(lease lc 118352 fc 117720 pn 0 opn 0) v4 -- 0x7fe3e876f000 con 0
-36> 2017-08-01 19:48:36.543756 7fe3d427e700 5 mon.vpm089@1(leader).paxos(paxos active c 117720..118352) queue_pending_finisher 0x7fe3e8592950
-35> 2017-08-01 19:48:36.545835 7fe3d427e700 1 -- 172.21.2.89:6789/0 _send_message--> mon.2 172.21.2.161:6789/0 -- paxos(begin lc 118352 fc 0 pn 5866101 opn 0) v4 -- ?+0 0x7fe3e876f900
-34> 2017-08-01 19:48:36.545851 7fe3d427e700 1 -- 172.21.2.89:6789/0 --> 172.21.2.161:6789/0 -- paxos(begin lc 118352 fc 0 pn 5866101 opn 0) v4 -- 0x7fe3e876f900 con 0
-33> 2017-08-01 19:48:36.549450 7fe3d6691700 5 -- 172.21.2.89:6789/0 >> 172.21.2.161:6789/0 conn(0x7fe3e89bd000 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=3 cs=1 l=0). rx mon.2 seq 17 0x7fe3e876f000 paxos(lease_ack lc 118352 fc 117720 pn 0 opn 0) v4
-32> 2017-08-01 19:48:36.549514 7fe3d1a79700 1 -- 172.21.2.89:6789/0 <== mon.2 172.21.2.161:6789/0 17 ==== paxos(lease_ack lc 118352 fc 117720 pn 0 opn 0) v4 ==== 166+0+0 (2034569483 0 0) 0x7fe3e876f000 con 0x7fe3e89bd000
-31> 2017-08-01 19:48:36.583151 7fe3d6691700 1 -- 172.21.2.89:6789/0 >> 172.21.2.161:6789/0 conn(0x7fe3e89bd000 :-1 s=STATE_OPEN pgs=3 cs=1 l=0).read_bulk peer close file descriptor 29
-30> 2017-08-01 19:48:36.583166 7fe3d6691700 1 -- 172.21.2.89:6789/0 >> 172.21.2.161:6789/0 conn(0x7fe3e89bd000 :-1 s=STATE_OPEN pgs=3 cs=1 l=0).read_until read failed
-29> 2017-08-01 19:48:36.583170 7fe3d6691700 1 -- 172.21.2.89:6789/0 >> 172.21.2.161:6789/0 conn(0x7fe3e89bd000 :-1 s=STATE_OPEN pgs=3 cs=1 l=0).process read tag failed
-28> 2017-08-01 19:48:36.583386 7fe3d5e90700 1 -- 172.21.2.89:6789/0 >> - conn(0x7fe3e8957000 :6789 s=STATE_ACCEPTING pgs=0 cs=0 l=0)._process_connection sd=28 -
-27> 2017-08-01 19:48:36.583517 7fe3d5e90700 2 -- 172.21.2.89:6789/0 >> 172.21.2.89:6800/2251934460 conn(0x7fe3e8957000 :6789 s=STATE_ACCEPTING_WAIT_SEQ pgs=93597 cs=1 l=1).handle_connect_msg accept write reply msg done
-26> 2017-08-01 19:48:36.583740 7fe3d5e90700 2 -- 172.21.2.89:6789/0 >> 172.21.2.89:6800/2251934460 conn(0x7fe3e8957000 :6789 s=STATE_ACCEPTING_WAIT_SEQ pgs=93597 cs=1 l=1)._process_connection accept get newly_acked_seq 0
-25> 2017-08-01 19:48:36.583778 7fe3d5e90700 5 -- 172.21.2.89:6789/0 >> 172.21.2.89:6800/2251934460 conn(0x7fe3e8957000 :6789 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=93597 cs=1 l=1). rx mds.0 seq 1 0x7fe3e89c9e00 auth(proto 0 31 bytes epoch 2) v1
-24> 2017-08-01 19:48:36.583817 7fe3d1a79700 1 -- 172.21.2.89:6789/0 <== mds.0 172.21.2.89:6800/2251934460 1 ==== auth(proto 0 31 bytes epoch 2) v1 ==== 61+0+0 (732538964 0 0) 0x7fe3e89c9e00 con 0x7fe3e8957000
-23> 2017-08-01 19:48:36.583836 7fe3d1a79700 5 mon.vpm089@1(leader).paxos(paxos updating c 117720..118352) is_readable = 1 - now=2017-08-01 19:48:36.583837 lease_expire=2017-08-01 19:48:41.532724 has v0 lc 118352
-22> 2017-08-01 19:48:36.583877 7fe3d1a79700 2 mon.vpm089@1(leader) e2 send_reply 0x7fe3e85d0640 0x7fe3e89c9b80 auth_reply(proto 2 0 (0) Success) v1
-21> 2017-08-01 19:48:36.583885 7fe3d1a79700 1 -- 172.21.2.89:6789/0 --> 172.21.2.89:6800/2251934460 -- auth_reply(proto 2 0 (0) Success) v1 -- 0x7fe3e89c9b80 con 0
-20> 2017-08-01 19:48:36.584153 7fe3d5e90700 5 -- 172.21.2.89:6789/0 >> 172.21.2.89:6800/2251934460 conn(0x7fe3e8957000 :6789 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=93597 cs=1 l=1). rx mds.0 seq 2 0x7fe3e89c9b80 auth(proto 2 128 bytes epoch 0) v1
-19> 2017-08-01 19:48:36.584183 7fe3d1a79700 1 -- 172.21.2.89:6789/0 <== mds.0 172.21.2.89:6800/2251934460 2 ==== auth(proto 2 128 bytes epoch 0) v1 ==== 158+0+0 (1065591626 0 0) 0x7fe3e89c9b80 con 0x7fe3e8957000
-18> 2017-08-01 19:48:36.584197 7fe3d1a79700 5 mon.vpm089@1(leader).paxos(paxos updating c 117720..118352) is_readable = 1 - now=2017-08-01 19:48:36.584198 lease_expire=2017-08-01 19:48:41.532724 has v0 lc 118352
-17> 2017-08-01 19:48:36.584371 7fe3d1a79700 2 mon.vpm089@1(leader) e2 send_reply 0x7fe3e85d0640 0x7fe3e89c9e00 auth_reply(proto 2 0 (0) Success) v1
-16> 2017-08-01 19:48:36.584380 7fe3d1a79700 1 -- 172.21.2.89:6789/0 --> 172.21.2.89:6800/2251934460 -- auth_reply(proto 2 0 (0) Success) v1 -- 0x7fe3e89c9e00 con 0
-15> 2017-08-01 19:48:36.584629 7fe3d5e90700 5 -- 172.21.2.89:6789/0 >> 172.21.2.89:6800/2251934460 conn(0x7fe3e8957000 :6789 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=93597 cs=1 l=1). rx mds.0 seq 3 0x7fe3e8a1ad80 mon_subscribe({mdsmap=9+,monmap=3+,osdmap=56}) v2
-14> 2017-08-01 19:48:36.584670 7fe3d1a79700 1 -- 172.21.2.89:6789/0 <== mds.0 172.21.2.89:6800/2251934460 3 ==== mon_subscribe({mdsmap=9+,monmap=3+,osdmap=56}) v2 ==== 61+0+0 (3973911011 0 0) 0x7fe3e8a1ad80 con 0x7fe3e8957000
-13> 2017-08-01 19:48:37.686897 7fe3d568f700 1 -- 172.21.2.89:6789/0 >> 172.21.2.5:6789/0 conn(0x7fe3e89bb800 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection reconnect failed
-12> 2017-08-01 19:48:37.686937 7fe3d568f700 2 -- 172.21.2.89:6789/0 >> 172.21.2.5:6789/0 conn(0x7fe3e89bb800 :-1 s=STATE_CONNECTING_RE pgs=0 cs=0 l=0)._process_connection connection refused!
-11> 2017-08-01 19:48:39.213027 7fe3d5e90700 1 -- 172.21.2.89:6789/0 >> - conn(0x7fe3e8abd000 :6789 s=STATE_ACCEPTING pgs=0 cs=0 l=0)._process_connection sd=29 -
-10> 2017-08-01 19:48:39.213587 7fe3d5e90700 2 -- 172.21.2.89:6789/0 >> 172.21.2.161:0/2298754901 conn(0x7fe3e8abd000 :6789 s=STATE_ACCEPTING_WAIT_SEQ pgs=162 cs=1 l=1).handle_connect_msg accept write reply msg done
-9> 2017-08-01 19:48:39.214190 7fe3d5e90700 2 -- 172.21.2.89:6789/0 >> 172.21.2.161:0/2298754901 conn(0x7fe3e8abd000 :6789 s=STATE_ACCEPTING_WAIT_SEQ pgs=162 cs=1 l=1)._process_connection accept get newly_acked_seq 0
-8> 2017-08-01 19:48:39.214280 7fe3d5e90700 5 -- 172.21.2.89:6789/0 >> 172.21.2.161:0/2298754901 conn(0x7fe3e8abd000 :6789 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=162 cs=1 l=1). rx client.? seq 1 0x7fe3e89c9e00 auth(proto 0 38 bytes epoch 2) v1
-7> 2017-08-01 19:48:39.214323 7fe3d1a79700 1 -- 172.21.2.89:6789/0 <== client.? 172.21.2.161:0/2298754901 1 ==== auth(proto 0 38 bytes epoch 2) v1 ==== 68+0+0 (3411313417 0 0) 0x7fe3e89c9e00 con 0x7fe3e8abd000
-6> 2017-08-01 19:48:39.214347 7fe3d1a79700 5 mon.vpm089@1(leader).paxos(paxos updating c 117720..118352) is_readable = 1 - now=2017-08-01 19:48:39.214348 lease_expire=2017-08-01 19:48:41.532724 has v0 lc 118352
-5> 2017-08-01 19:48:39.214394 7fe3d1a79700 2 mon.vpm089@1(leader) e2 send_reply 0x7fe3e85d0640 0x7fe3e89c9b80 auth_reply(proto 2 0 (0) Success) v1
-4> 2017-08-01 19:48:39.214408 7fe3d1a79700 1 -- 172.21.2.89:6789/0 --> 172.21.2.161:0/2298754901 -- auth_reply(proto 2 0 (0) Success) v1 -- 0x7fe3e89c9b80 con 0
-3> 2017-08-01 19:48:39.215481 7fe3d5e90700 5 -- 172.21.2.89:6789/0 >> 172.21.2.161:0/2298754901 conn(0x7fe3e8abd000 :6789 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=162 cs=1 l=1). rx client.? seq 2 0x7fe3e89c9b80 auth(proto 2 32 bytes epoch 0) v1
-2> 2017-08-01 19:48:39.215514 7fe3d1a79700 1 -- 172.21.2.89:6789/0 <== client.? 172.21.2.161:0/2298754901 2 ==== auth(proto 2 32 bytes epoch 0) v1 ==== 62+0+0 (2204999250 0 0) 0x7fe3e89c9b80 con 0x7fe3e8abd000
-1> 2017-08-01 19:48:39.215529 7fe3d1a79700 5 mon.vpm089@1(leader).paxos(paxos updating c 117720..118352) is_readable = 1 - now=2017-08-01 19:48:39.215530 lease_expire=2017-08-01 19:48:41.532724 has v0 lc 118352
0> 2017-08-01 19:48:39.219115 7fe3d1a79700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.1.1/rpm/el7/BUILD/ceph-12.1.1/src/auth/Crypto.h: In function 'int CryptoKey::encrypt(CephContext*, const bufferlist&, ceph::bufferlist&, std::string*) const' thread 7fe3d1a79700 time 2017-08-01 19:48:39.215549
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.1.1/rpm/el7/BUILD/ceph-12.1.1/src/auth/Crypto.h: 109: FAILED assert(ckh)
ceph version 12.1.1 (f3e663a190bf2ed12c7e3cda288b9a159572c800) luminous (rc)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7fe3de2b7310]
2: (()+0x2c50f8) [0x7fe3ddfb70f8]
3: (cephx_calc_client_server_challenge(CephContext*, CryptoKey&, unsigned long, unsigned long, unsigned long*, std::string&)+0x2f5) [0x7fe3de46c055]
4: (CephxServiceHandler::handle_request(ceph::buffer::list::iterator&, ceph::buffer::list&, unsigned long&, AuthCapsInfo&, unsigned long*)+0x259c) [0x7fe3de27bf7c]
5: (AuthMonitor::prep_auth(boost::intrusive_ptr<MonOpRequest>, bool)+0xc24) [0x7fe3de0ea644]
6: (AuthMonitor::preprocess_query(boost::intrusive_ptr<MonOpRequest>)+0x322) [0x7fe3de0ed162]
7: (PaxosService::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x811) [0x7fe3de1b5091]
8: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x151) [0x7fe3de09d661]
9: (Monitor::_ms_dispatch(Message*)+0x7de) [0x7fe3de09f06e]
10: (Monitor::ms_dispatch(Message*)+0x23) [0x7fe3de0c7303]
11: (DispatchQueue::entry()+0x792) [0x7fe3de4ef812]
12: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fe3de35d3cd]
13: (()+0x7dc5) [0x7fe3dd081dc5]
14: (clone()+0x6d) [0x7fe3da40d76d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
4/ 5 memdb
1/ 5 kinetic
1/ 5 fuse
1/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-mon.vpm089.log
--- end dump of recent events ---
2017-08-01 19:48:39.223796 7fe3d1a79700 -1 *** Caught signal (Aborted) **
in thread 7fe3d1a79700 thread_name:ms_dispatch
ceph version 12.1.1 (f3e663a190bf2ed12c7e3cda288b9a159572c800) luminous (rc)
1: (()+0x852ec1) [0x7fe3de544ec1]
2: (()+0xf370) [0x7fe3dd089370]
3: (gsignal()+0x37) [0x7fe3da34b1d7]
4: (abort()+0x148) [0x7fe3da34c8c8]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x7fe3de2b7484]
6: (()+0x2c50f8) [0x7fe3ddfb70f8]
7: (cephx_calc_client_server_challenge(CephContext*, CryptoKey&, unsigned long, unsigned long, unsigned long*, std::string&)+0x2f5) [0x7fe3de46c055]
8: (CephxServiceHandler::handle_request(ceph::buffer::list::iterator&, ceph::buffer::list&, unsigned long&, AuthCapsInfo&, unsigned long*)+0x259c) [0x7fe3de27bf7c]
9: (AuthMonitor::prep_auth(boost::intrusive_ptr<MonOpRequest>, bool)+0xc24) [0x7fe3de0ea644]
10: (AuthMonitor::preprocess_query(boost::intrusive_ptr<MonOpRequest>)+0x322) [0x7fe3de0ed162]
11: (PaxosService::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x811) [0x7fe3de1b5091]
12: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x151) [0x7fe3de09d661]
13: (Monitor::_ms_dispatch(Message*)+0x7de) [0x7fe3de09f06e]
14: (Monitor::ms_dispatch(Message*)+0x23) [0x7fe3de0c7303]
15: (DispatchQueue::entry()+0x792) [0x7fe3de4ef812]
16: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fe3de35d3cd]
17: (()+0x7dc5) [0x7fe3dd081dc5]
18: (clone()+0x6d) [0x7fe3da40d76d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events ---
0> 2017-08-01 19:48:39.223796 7fe3d1a79700 -1 *** Caught signal (Aborted) **
in thread 7fe3d1a79700 thread_name:ms_dispatch
ceph version 12.1.1 (f3e663a190bf2ed12c7e3cda288b9a159572c800) luminous (rc)
1: (()+0x852ec1) [0x7fe3de544ec1]
2: (()+0xf370) [0x7fe3dd089370]
3: (gsignal()+0x37) [0x7fe3da34b1d7]
4: (abort()+0x148) [0x7fe3da34c8c8]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x7fe3de2b7484]
6: (()+0x2c50f8) [0x7fe3ddfb70f8]
7: (cephx_calc_client_server_challenge(CephContext*, CryptoKey&, unsigned long, unsigned long, unsigned long*, std::string&)+0x2f5) [0x7fe3de46c055]
8: (CephxServiceHandler::handle_request(ceph::buffer::list::iterator&, ceph::buffer::list&, unsigned long&, AuthCapsInfo&, unsigned long*)+0x259c) [0x7fe3de27bf7c]
9: (AuthMonitor::prep_auth(boost::intrusive_ptr<MonOpRequest>, bool)+0xc24) [0x7fe3de0ea644]
10: (AuthMonitor::preprocess_query(boost::intrusive_ptr<MonOpRequest>)+0x322) [0x7fe3de0ed162]
11: (PaxosService::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x811) [0x7fe3de1b5091]
12: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x151) [0x7fe3de09d661]
13: (Monitor::_ms_dispatch(Message*)+0x7de) [0x7fe3de09f06e]
14: (Monitor::ms_dispatch(Message*)+0x23) [0x7fe3de0c7303]
15: (DispatchQueue::entry()+0x792) [0x7fe3de4ef812]
16: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fe3de35d3cd]
17: (()+0x7dc5) [0x7fe3dd081dc5]
18: (clone()+0x6d) [0x7fe3da40d76d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
4/ 5 memdb
1/ 5 kinetic
1/ 5 fuse
1/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-mon.vpm089.log
--- end dump of recent events ---
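Since the full mon logs are large, it may help to pull just the failed assertion and the named stack frames out of them. This is a rough sketch under assumptions: the function name `summarize_crash` and both regular expressions are illustrative, written only against the line shapes visible in the dump above, and are not a Ceph tool.

```python
import re

# Patterns modelled on the log lines in this report (assumed, not exhaustive).
ASSERT_RE = re.compile(r"(\S+): (\d+): FAILED assert\((.*)\)")
FRAME_RE = re.compile(r"^\s*\d+: \((.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]")


def summarize_crash(log_text):
    """Return the first failed assert and the named backtrace frames."""
    summary = {"assert": None, "frames": []}
    for line in log_text.splitlines():
        m = ASSERT_RE.search(line)
        if m and summary["assert"] is None:
            source_file = m.group(1).rsplit("/", 1)[-1]
            summary["assert"] = (source_file, int(m.group(2)), m.group(3))
            continue
        m = FRAME_RE.match(line)
        if m:
            # Keep only the symbol name; unresolved frames look like "()".
            name = m.group(1).split("(", 1)[0]
            if name:
                summary["frames"].append(name)
    return summary


# Lines taken from the dump above (build path prefix trimmed for brevity).
sample = """src/auth/Crypto.h: 109: FAILED assert(ckh)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7fe3de2b7310]
 2: (()+0x2c50f8) [0x7fe3ddfb70f8]
 3: (cephx_calc_client_server_challenge(CephContext*, CryptoKey&, unsigned long, unsigned long, unsigned long*, std::string&)+0x2f5) [0x7fe3de46c055]
"""

print(summarize_crash(sample))
```

Running this over each mon's log would make it easy to confirm whether all three mons hit the same `assert(ckh)` in `CryptoKey::encrypt` via `cephx_calc_client_server_challenge`.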
Log links:
vpm089 -> http://chunk.io/f/cfeaeb298bfc41d49055d4a74cfbc1ce (last 5000 lines)
vpm005 -> http://chunk.io/f/2749112a9b0c40cf91007d8b8860ae60
vpm161 -> http://chunk.io/f/043c290f17cf41149dde6d5b46b7199d
Also feel free to log in to those nodes, since some of the log files are over 300 MB, which is too large to upload.