Bug #36244

Updated by Joao Eduardo Luis over 5 years ago

I use multisite with 2 zones, each zone with 1 RGW. After I added 1 more RGW to each zone, all the mgr daemons crashed. After restarting the mgr service, the mgr still crashes and no longer works.


 ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable) 


 <pre> ``` 
 -39> 2018-09-28 10:37:46.772706 7f3dde666700    1 -- 100.97.8.131:6800/1265072 <== osd.22 100.97.8.124:6804/20375 4 ==== mgrreport(osd.22 +0-0 packed 742 osd_metrics=1) v5 ==== 784+0+0 (1233950104 0 0) 0x5652d3c24100 con 0x5652d37d9000 
    -38> 2018-09-28 10:37:46.772714 7f3dde666700    4 mgr.server handle_report from 0x5652d37d9000 osd,22 
    -37> 2018-09-28 10:37:46.772716 7f3dde666700 20 mgr.server handle_report updating existing DaemonState for osd,22 
    -36> 2018-09-28 10:37:46.772713 7f3deb39c700    5 -- 100.97.8.131:6800/1265072 >> 100.97.8.124:6810/21991 conn(0x5652d3817800 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=739 cs=1 l=1). rx osd.25 seq 4 0x5652d3c243c0 mgrreport(osd.25 +0-0 packed 742 osd_metrics=1) v5 
    -35> 2018-09-28 10:37:46.772718 7f3dde666700 20 mgr update loading 0 new types, 0 old types, had 129 types, got 742 bytes of data 
    -34> 2018-09-28 10:37:46.772753 7f3dde666700    1 -- 100.97.8.131:6800/1265072 <== osd.25 100.97.8.124:6810/21991 4 ==== mgrreport(osd.25 +0-0 packed 742 osd_metrics=1) v5 ==== 784+0+0 (2973115570 0 0) 0x5652d3c243c0 con 0x5652d3817800 
    -33> 2018-09-28 10:37:46.772761 7f3dde666700    4 mgr.server handle_report from 0x5652d3817800 osd,25 
    -32> 2018-09-28 10:37:46.772763 7f3dde666700 20 mgr.server handle_report updating existing DaemonState for osd,25 
    -31> 2018-09-28 10:37:46.772765 7f3dde666700 20 mgr update loading 0 new types, 0 old types, had 129 types, got 742 bytes of data 
    -30> 2018-09-28 10:37:46.773302 7f3deb39c700    5 -- 100.97.8.131:6800/1265072 >> 100.97.8.124:6804/20375 conn(0x5652d37d9000 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=763 cs=1 l=1). rx osd.22 seq 5 0x5652d3c24680 pg_stats(50 pgs tid 0 v 0) v1 
    -29> 2018-09-28 10:37:46.773376 7f3dde666700    1 -- 100.97.8.131:6800/1265072 <== osd.22 100.97.8.124:6804/20375 5 ==== pg_stats(50 pgs tid 0 v 0) v1 ==== 30008+0+0 (12111768 0 0) 0x5652d3c24680 con 0x5652d37d9000 
    -28> 2018-09-28 10:37:46.773468 7f3deb39c700    5 -- 100.97.8.131:6800/1265072 >> 100.97.8.123:6816/21859 conn(0x5652d384d800 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=1335 cs=1 l=1). rx osd.18 seq 4 0x5652d3c17440 mgrreport(osd.18 +0-0 packed 742 osd_metrics=1) v5 
    -27> 2018-09-28 10:37:46.773487 7f3dde666700    1 -- 100.97.8.131:6800/1265072 <== osd.18 100.97.8.123:6816/21859 4 ==== mgrreport(osd.18 +0-0 packed 742 osd_metrics=1) v5 ==== 784+0+0 (156505875 0 0) 0x5652d3c17440 con 0x5652d384d800 
    -26> 2018-09-28 10:37:46.773498 7f3dde666700    4 mgr.server handle_report from 0x5652d384d800 osd,18 
    -25> 2018-09-28 10:37:46.773512 7f3dde666700 20 mgr.server handle_report updating existing DaemonState for osd,18 
    -24> 2018-09-28 10:37:46.773515 7f3dde666700 20 mgr update loading 0 new types, 0 old types, had 129 types, got 742 bytes of data 
    -23> 2018-09-28 10:37:46.773604 7f3deb39c700    5 -- 100.97.8.131:6800/1265072 >> 100.97.8.123:6816/21859 conn(0x5652d384d800 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=1335 cs=1 l=1). rx osd.18 seq 5 0x5652d3c17700 pg_stats(14 pgs tid 0 v 0) v1 
    -22> 2018-09-28 10:37:46.773621 7f3dde666700    1 -- 100.97.8.131:6800/1265072 <== osd.18 100.97.8.123:6816/21859 5 ==== pg_stats(14 pgs tid 0 v 0) v1 ==== 8508+0+0 (948072078 0 0) 0x5652d3c17700 con 0x5652d384d800 
    -21> 2018-09-28 10:37:46.773704 7f3deb39c700    5 -- 100.97.8.131:6800/1265072 >> 100.97.8.124:6810/21991 conn(0x5652d3817800 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=739 cs=1 l=1). rx osd.25 seq 5 0x5652d3d28100 pg_stats(45 pgs tid 0 v 0) v1 
    -20> 2018-09-28 10:37:46.773717 7f3dde666700    1 -- 100.97.8.131:6800/1265072 <== osd.25 100.97.8.124:6810/21991 5 ==== pg_stats(45 pgs tid 0 v 0) v1 ==== 27034+0+0 (1702851695 0 0) 0x5652d3d28100 con 0x5652d3817800 
    -19> 2018-09-28 10:37:46.773731 7f3deb39c700    5 -- 100.97.8.131:6800/1265072 >> 100.97.8.123:6810/20319 conn(0x5652d38f3000 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=738 cs=1 l=1). rx osd.15 seq 4 0x5652d3e18100 mgrreport(osd.15 +0-0 packed 742 osd_metrics=1) v5 
    -18> 2018-09-28 10:37:46.773777 7f3dde666700    1 -- 100.97.8.131:6800/1265072 <== osd.15 100.97.8.123:6810/20319 4 ==== mgrreport(osd.15 +0-0 packed 742 osd_metrics=1) v5 ==== 784+0+0 (2683331780 0 0) 0x5652d3e18100 con 0x5652d38f3000 
    -17> 2018-09-28 10:37:46.773784 7f3dde666700    4 mgr.server handle_report from 0x5652d38f3000 osd,15 
    -16> 2018-09-28 10:37:46.773787 7f3dde666700 20 mgr.server handle_report updating existing DaemonState for osd,15 
    -15> 2018-09-28 10:37:46.773789 7f3dde666700 20 mgr update loading 0 new types, 0 old types, had 129 types, got 742 bytes of data 
    -14> 2018-09-28 10:37:46.774280 7f3deb39c700    5 -- 100.97.8.131:6800/1265072 >> 100.97.8.123:6810/20319 conn(0x5652d38f3000 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=738 cs=1 l=1). rx osd.15 seq 5 0x5652d3e183c0 pg_stats(53 pgs tid 0 v 0) v1 
    -13> 2018-09-28 10:37:46.774304 7f3dde666700    1 -- 100.97.8.131:6800/1265072 <== osd.15 100.97.8.123:6810/20319 5 ==== pg_stats(53 pgs tid 0 v 0) v1 ==== 31806+0+0 (3567068761 0 0) 0x5652d3e183c0 con 0x5652d38f3000 
    -12> 2018-09-28 10:37:46.774751 7f3deb39c700    5 -- 100.97.8.131:6800/1265072 >> 100.97.8.123:6800/17601 conn(0x5652d383d000 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=738 cs=1 l=1). rx osd.10 seq 4 0x5652d3b939c0 mgrreport(osd.10 +0-0 packed 742 osd_metrics=1) v5 
    -11> 2018-09-28 10:37:46.774770 7f3dde666700    1 -- 100.97.8.131:6800/1265072 <== osd.10 100.97.8.123:6800/17601 4 ==== mgrreport(osd.10 +0-0 packed 742 osd_metrics=1) v5 ==== 784+0+0 (1173938277 0 0) 0x5652d3b939c0 con 0x5652d383d000 
    -10> 2018-09-28 10:37:46.774778 7f3dde666700    4 mgr.server handle_report from 0x5652d383d000 osd,10 
     -9> 2018-09-28 10:37:46.774780 7f3dde666700 20 mgr.server handle_report updating existing DaemonState for osd,10 
     -8> 2018-09-28 10:37:46.774782 7f3dde666700 20 mgr update loading 0 new types, 0 old types, had 129 types, got 742 bytes of data 
     -7> 2018-09-28 10:37:46.775016 7f3deb39c700    5 -- 100.97.8.131:6800/1265072 >> 100.97.8.123:6800/17601 conn(0x5652d383d000 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=738 cs=1 l=1). rx osd.10 seq 5 0x5652d3b93c80 pg_stats(65 pgs tid 0 v 0) v1 
     -6> 2018-09-28 10:37:46.775034 7f3dde666700    1 -- 100.97.8.131:6800/1265072 <== osd.10 100.97.8.123:6800/17601 5 ==== pg_stats(65 pgs tid 0 v 0) v1 ==== 38938+0+0 (804497348 0 0) 0x5652d3b93c80 con 0x5652d383d000 
     -5> 2018-09-28 10:37:46.781888 7f3deb39c700    5 -- 100.97.8.131:6800/1265072 >> 100.97.8.124:0/2791165959 conn(0x5652d39f5800 :6800 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=222 cs=1 l=1). rx client.6381 seq 3 0x5652d409f440 mgrreport(rgw.cn-bj-test2 +0-0 packed 214) v5 
     -4> 2018-09-28 10:37:46.781946 7f3dde666700    1 -- 100.97.8.131:6800/1265072 <== client.6381 100.97.8.124:0/2791165959 3 ==== mgrreport(rgw.cn-bj-test2 +0-0 packed 214) v5 ==== 253+0+0 (1062603950 0 0) 0x5652d409f440 con 0x5652d39f5800 
     -3> 2018-09-28 10:37:46.781962 7f3dde666700    4 mgr.server handle_report from 0x5652d39f5800 rgw,cn-bj-test2 
     -2> 2018-09-28 10:37:46.781966 7f3dde666700 20 mgr.server handle_report updating existing DaemonState for rgw,cn-bj-test2 
     -1> 2018-09-28 10:37:46.781968 7f3dde666700 20 mgr update loading 0 new types, 0 old types, had 129 types, got 214 bytes of data 
      0> 2018-09-28 10:37:46.783446 7f3dde666700 -1 *** Caught signal (Aborted) ** 
  in thread 7f3dde666700 thread_name:ms_dispatch 

  ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable) 
  1: (()+0x3f40c1) [0x5652c9b220c1] 
  2: (()+0xf6d0) [0x7f3df00026d0] 
  3: (gsignal()+0x37) [0x7f3def011277] 
  4: (abort()+0x148) [0x7f3def012968] 
  5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f3def920ac5] 
  6: (()+0x5ea36) [0x7f3def91ea36] 
  7: (()+0x5ea63) [0x7f3def91ea63] 
  8: (()+0x5ec83) [0x7f3def91ec83] 
  9: (std::__throw_out_of_range(char const*)+0x77) [0x7f3def973b47] 
  10: (DaemonPerfCounters::update(MMgrReport*)+0xb6c) [0x5652c99d09ec] 
  11: (DaemonServer::handle_report(MMgrReport*)+0x243) [0x5652c99d8903] 
  12: (DaemonServer::ms_dispatch(Message*)+0x47) [0x5652c99e4917] 
  13: (DispatchQueue::entry()+0x792) [0x5652c9e20cb2] 
  14: (DispatchQueue::DispatchThread::entry()+0xd) [0x5652c9c0abed] 
  15: (()+0x7e25) [0x7f3defffae25] 
  16: (clone()+0x6d) [0x7f3def0d9bad] 
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 

 --- logging levels --- 
    0/ 5 none 
    0/ 1 lockdep 
    0/ 1 context 
    1/ 1 crush 
    1/ 5 mds 
    1/ 5 mds_balancer 
    1/ 5 mds_locker 
    1/ 5 mds_log 
    1/ 5 mds_log_expire 
    1/ 5 mds_migrator 
    0/ 1 buffer 
    0/ 1 timer 
    0/ 1 filer 
    0/ 1 striper 
    0/ 1 objecter 
    0/ 5 rados 
    0/ 5 rbd 
    0/ 5 rbd_mirror 
    0/ 5 rbd_replay 
    0/ 5 journaler 
    0/ 5 objectcacher 
    0/ 5 client 
    1/ 5 osd 
    0/ 5 optracker 
    0/ 5 objclass 
    1/ 3 filestore 
    1/ 3 journal 
    0/ 5 ms 
    1/ 5 mon 
    0/10 monc 
    1/ 5 paxos 
    0/ 5 tp 
    1/ 5 auth 
    1/ 5 crypto 
    1/ 1 finisher 
    1/ 1 reserver 
    1/ 5 heartbeatmap 
    1/ 5 perfcounter 
    1/ 5 rgw 
    1/10 civetweb 
    1/ 5 javaclient 
    1/ 5 asok 
    1/ 1 throttle 
    0/ 0 refs 
    1/ 5 xio 
    1/ 5 compressor 
    1/ 5 bluestore 
    1/ 5 bluefs 
    1/ 3 bdev 
    1/ 5 kstore 
    4/ 5 rocksdb 
    4/ 5 leveldb 
    4/ 5 memdb 
    1/ 5 kinetic 
    1/ 5 fuse 
   20/20 mgr 
    1/ 5 mgrc 
    1/ 5 dpdk 
    1/ 5 eventtrace 
   -2/-2 (syslog threshold) 
   -1/-1 (stderr threshold) 
   max_recent       10000 
   max_new           1000 
   log_file /var/log/ceph/ceph-mgr.JXQ-97-8-131.log 
 --- end dump of recent events --- 

 </pre> ```
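For anyone triaging a dump like the one above, the numbered backtrace frames can be pulled out mechanically to find the first symbolized Ceph frame. The following is a minimal sketch, not part of Ceph: the names `FRAME_RE`, `parse_frames`, and `named_frames` are hypothetical helpers, and the sample text is copied from frames 8 to 11 of this report.

```python
import re

# Matches backtrace frames as printed in the dump above, for example:
#   10: (DaemonPerfCounters::update(MMgrReport*)+0xb6c) [0x5652c99d09ec]
# Unsymbolized frames appear as "(()+0x5ec83)".
FRAME_RE = re.compile(
    r"^\s*(\d+):\s+\((.*)\+0x([0-9a-f]+)\)\s+\[0x([0-9a-f]+)\]\s*$"
)


def parse_frames(text):
    """Return one dict per backtrace frame found in text (hypothetical helper)."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m is None:
            continue  # ordinary log lines and the logging-level table are skipped
        index, symbol, offset, address = m.groups()
        frames.append({
            "index": int(index),
            # "()" means the frame had no symbol name
            "symbol": None if symbol == "()" else symbol,
            "offset": int(offset, 16),
            "address": int(address, 16),
        })
    return frames


def named_frames(frames):
    """Keep only the frames that carry a symbol name."""
    return [f for f in frames if f["symbol"] is not None]


# Frames 8-11 copied from the report above.
SAMPLE = """
  8: (()+0x5ec83) [0x7f3def91ec83]
  9: (std::__throw_out_of_range(char const*)+0x77) [0x7f3def973b47]
  10: (DaemonPerfCounters::update(MMgrReport*)+0xb6c) [0x5652c99d09ec]
  11: (DaemonServer::handle_report(MMgrReport*)+0x243) [0x5652c99d8903]
"""

if __name__ == "__main__":
    for frame in named_frames(parse_frames(SAMPLE)):
        print(frame["index"], frame["symbol"])
```

Run against the full log, this would list `std::__throw_out_of_range` directly above `DaemonPerfCounters::update(MMgrReport*)`, which is the frame pair that identifies where the exception was thrown while handling the RGW report.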
