Bug #20390: A bug in async+dpdk module
Status: Closed
% Done: 0%
Regression: No
Severity: 3 - minor
Description
I started the monitor daemon with async+dpdk and executed "ceph -s" from another host.
After a few tries, the monitor daemon crashed.
-85> 2017-06-23 18:23:52.918071 7fffea3fe700 10 net dispatch_packet === rx === proto 806 48:ea:63:4:50:97 -> ff:ff:ff:ff:ff:ff length 60 -84> 2017-06-23 18:23:52.924118 7fffea3fe700 10 net dispatch_packet === rx === proto 806 48:ea:63:2c:72:a3 -> ff:ff:ff:ff:ff:ff length 60 -83> 2017-06-23 18:23:52.933680 7fffe53fa700 5 mon.iozone-103@0(leader).osd e1 can_mark_out no osds -82> 2017-06-23 18:23:52.933688 7fffe53fa700 5 mon.iozone-103@0(leader).paxos(paxos active c 1..108) is_readable = 1 - now=2017-06-23 18:23:52.933689 lease_expire=0.000000 has v0 lc 108 -81> 2017-06-23 18:23:52.968324 7fffea3fe700 10 net dispatch_packet === rx === proto 806 48:ea:63:2c:72:a3 -> ff:ff:ff:ff:ff:ff length 60 -80> 2017-06-23 18:23:52.969549 7fffea3fe700 10 net dispatch_packet === rx === proto 806 0:22:19:5d:1a:2d -> ff:ff:ff:ff:ff:ff length 60 -79> 2017-06-23 18:23:52.974851 7fffea3fe700 10 net dispatch_packet === rx === proto 806 48:ea:63:2d:a5:95 -> ff:ff:ff:ff:ff:ff length 60 -78> 2017-06-23 18:23:52.988551 7fffea3fe700 10 net dispatch_packet === rx === proto 800 f8:b1:56:be:be:66 -> ff:ff:ff:ff:ff:ff length 92 rss_hash 1945615894 -77> 2017-06-23 18:23:52.988555 7fffea3fe700 10 dpdk handle_received_packet get 11 packet from 191.168.22.98 -> 191.168.255.255 id=3625 ip_len=78 ip_hdr_len=20 pkt_len=78 offset=0 -76> 2017-06-23 18:23:52.995039 7fffea3fe700 10 net dispatch_packet === rx === proto 806 3c:e5:a6:27:16:86 -> ff:ff:ff:ff:ff:ff length 60 -75> 2017-06-23 18:23:53.002564 7fffea3fe700 10 net dispatch_packet === rx === proto 806 48:ea:63:4:80:5c -> ff:ff:ff:ff:ff:ff length 60 -74> 2017-06-23 18:23:53.011623 7fffea3fe700 10 net dispatch_packet === rx === proto 800 48:ea:63:5:72:31 -> 48:ea:63:5:72:5e length 191 rss_hash 1231707456 -73> 2017-06-23 18:23:53.011626 7fffea3fe700 10 dpdk handle_received_packet get 6 packet from 60.60.1.102 -> 60.60.1.103 id=16 ip_len=177 ip_hdr_len=20 pkt_len=177 offset=0 -72> 2017-06-23 18:23:53.011634 7fffea3fe700 20 dpdk handle_received_packet learn 
mac 48:ea:63:5:72:31 with 3c.3c.1.66 -71> 2017-06-23 18:23:53.011636 7fffea3fe700 20 tcp 60.60.1.103:6789 -> 60.60.1.102:50627 tcb(0x55555efc36c0 fd=2 s=ESTABLISHED).input_handle_other_state tcp header seq 719263620 ack 3762778290 snd next 3762778290 unack 3762778290 rcv next 719263620 len 137 fin=0 syn=0 -70> 2017-06-23 18:23:53.011640 7fffea3fe700 20 dpdk notify fd=2 mask=1 -69> 2017-06-23 18:23:53.011640 7fffea3fe700 20 dpdk notify activing=2 listening=1 waiting_idx=0 -68> 2017-06-23 18:23:53.011641 7fffea3fe700 20 dpdk notify activing=3 listening=1 waiting_idx=1 done -67> 2017-06-23 18:23:53.011642 7fffea3fe700 20 tcp 60.60.1.103:6789 -> 60.60.1.102:50627 tcb(0x55555efc36c0 fd=2 s=ESTABLISHED).input_handle_other_state merged=0 do_output=0 -66> 2017-06-23 18:23:53.011645 7fffea3fe700 10 net dispatch_packet === rx === proto 806 0:f:e2:11:54:83 -> ff:ff:ff:ff:ff:ff length 60 -65> 2017-06-23 18:23:53.011647 7fffea3fe700 20 dpdk poll fd=2 mask=1 -64> 2017-06-23 18:23:53.011648 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).process prev state is STATE_OPEN -63> 2017-06-23 18:23:53.011654 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_HEADER pgs=2 cs=1 l=1).process prev state is STATE_OPEN -62> 2017-06-23 18:23:53.011657 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_HEADER pgs=2 cs=1 l=1).process begin MSG -61> 2017-06-23 18:23:53.011660 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_HEADER pgs=2 cs=1 l=1).process got MSG header -60> 2017-06-23 18:23:53.011663 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_HEADER pgs=2 cs=1 l=1).process got envelope type=50 src client.94098 front=62 data=0 off 0 -59> 2017-06-23 18:23:53.011667 
7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_THROTTLE_MESSAGE pgs=2 cs=1 l=1).process prev state is STATE_OPEN_MESSAGE_HEADER -58> 2017-06-23 18:23:53.011670 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_THROTTLE_BYTES pgs=2 cs=1 l=1).process prev state is STATE_OPEN_MESSAGE_THROTTLE_MESSAGE -57> 2017-06-23 18:23:53.011673 7fffea3fe700 10 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_THROTTLE_BYTES pgs=2 cs=1 l=1).process wants 62 bytes from policy throttler 0/104857600 -56> 2017-06-23 18:23:53.011676 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_THROTTLE_DISPATCH_QUEUE pgs=2 cs=1 l=1).process prev state is STATE_OPEN_MESSAGE_THROTTLE_BYTES -55> 2017-06-23 18:23:53.011680 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_READ_FRONT pgs=2 cs=1 l=1).process prev state is STATE_OPEN_MESSAGE_THROTTLE_DISPATCH_QUEUE -54> 2017-06-23 18:23:53.011683 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_READ_FRONT pgs=2 cs=1 l=1).process got front 62 -53> 2017-06-23 18:23:53.011686 7fffea3fe700 10 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=2 cs=1 l=1).process aborted = 0 -52> 2017-06-23 18:23:53.011689 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=2 cs=1 l=1).process got 62 + 0 + 0 byte message -51> 2017-06-23 18:23:53.011695 7fffea3fe700 5 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=2 cs=1 l=1). 
rx client.94098 seq 8 0x55555f49e900 mon_command({"prefix": "status"} v 0) v1 -50> 2017-06-23 18:23:53.011700 7fffea3fe700 20 -- 60.60.1.103:6789/0 queue 0x55555f49e900 prio 127 -49> 2017-06-23 18:23:53.011704 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).process prev state is STATE_OPEN_MESSAGE_READ_FRONT -48> 2017-06-23 18:23:53.011710 7fffdeff5700 1 -- 60.60.1.103:6789/0 <== client.94098 60.60.1.102:0/1080485091 8 ==== mon_command({"prefix": "status"} v 0) v1 ==== 62+0+0 (1411063408 0 0) 0x55555f49e900 con 0x55555f059800 -47> 2017-06-23 18:23:53.011741 7fffdeff5700 0 mon.iozone-103@0(leader) e2 handle_command mon_command({"prefix": "status"} v 0) v1 -46> 2017-06-23 18:23:53.011754 7fffdeff5700 0 log_channel(audit) log [DBG] : from='client.? 60.60.1.102:0/1080485091' entity='client.admin' cmd=[{"prefix": "status"}]: dispatch -45> 2017-06-23 18:23:53.011757 7fffdeff5700 10 log_client _send_to_monlog to self -44> 2017-06-23 18:23:53.011758 7fffdeff5700 10 log_client log_queue is 1 last_log 8 sent 7 num 1 unsent 1 sending 1 -43> 2017-06-23 18:23:53.011759 7fffdeff5700 10 log_client will send 2017-06-23 18:23:53.011756 mon.0 60.60.1.103:6789/0 8 : audit [DBG] from='client.? 
60.60.1.102:0/1080485091' entity='client.admin' cmd=[{"prefix": "status"}]: dispatch -42> 2017-06-23 18:23:53.011766 7fffdeff5700 1 -- 60.60.1.103:6789/0 --> 60.60.1.103:6789/0 -- log(1 entries from seq 8 at 2017-06-23 18:23:53.011756) v1 -- 0x55555f49e6c0 con 0 -41> 2017-06-23 18:23:53.011771 7fffdeff5700 20 -- 60.60.1.103:6789/0 >> 60.60.1.103:6789/0 conn(0x55555f44f000 :-1 s=STATE_NONE pgs=0 cs=0 l=0).send_message log(1 entries from seq 8 at 2017-06-23 18:23:53.011756) v1 local -40> 2017-06-23 18:23:53.011811 7fffddbf4700 20 -- 60.60.1.103:6789/0 queue 0x55555f49e6c0 prio 196 -39> 2017-06-23 18:23:53.011829 7fffdeff5700 2 mon.iozone-103@0(leader) e2 send_reply 0x55555f0789a0 0x55555f49eb40 mon_command_ack([{"prefix": "status"}]=0 v0) v1 -38> 2017-06-23 18:23:53.011832 7fffdeff5700 1 -- 60.60.1.103:6789/0 --> 60.60.1.102:0/1080485091 -- mon_command_ack([{"prefix": "status"}]=0 v0) v1 -- 0x55555f49eb40 con 0 -37> 2017-06-23 18:23:53.011837 7fffdeff5700 15 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).send_message inline write is denied, reschedule m=0x55555f49eb40 -36> 2017-06-23 18:23:53.011840 7fffea3fe700 10 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).handle_write -35> 2017-06-23 18:23:53.011845 7fffdeff5700 10 -- 60.60.1.103:6789/0 dispatch_throttle_release 62 to dispatch throttler 62/104857600 -34> 2017-06-23 18:23:53.011844 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).prepare_send_message m mon_command_ack([{"prefix": "status"}]=0 v0) v1 -33> 2017-06-23 18:23:53.011848 7fffdeff5700 20 -- 60.60.1.103:6789/0 done calling dispatch on 0x55555f49e900 -32> 2017-06-23 18:23:53.011848 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).prepare_send_message encoding features 1152323339925389307 
0x55555f49eb40 mon_command_ack([{"prefix": "status"}]=0 v0) v1 -31> 2017-06-23 18:23:53.011854 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).write_message signed m=0x55555f49eb40): sig = 0 -30> 2017-06-23 18:23:53.011852 7fffdeff5700 1 -- 60.60.1.103:6789/0 <== mon.0 60.60.1.103:6789/0 0 ==== log(1 entries from seq 8 at 2017-06-23 18:23:53.011756) v1 ==== 0+0+0 (0 0 0) 0x55555f49e6c0 con 0x55555f44f000 -29> 2017-06-23 18:23:53.011858 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).write_message sending message type=51 src mon.0 front=54 data=560 off 0 -28> 2017-06-23 18:23:53.011862 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).write_message sending 9 0x55555f49eb40 -27> 2017-06-23 18:23:53.011861 7fffdeff5700 5 mon.iozone-103@0(leader).paxos(paxos active c 1..108) is_readable = 1 - now=2017-06-23 18:23:53.011861 lease_expire=0.000000 has v0 lc 108 -26> 2017-06-23 18:23:53.011867 7fffea3fe700 10 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1)._try_send sent bytes 689 remaining bytes 0 -25> 2017-06-23 18:23:53.011871 7fffea3fe700 10 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).write_message sending 0x55555f49eb40 done. 
-24> 2017-06-23 18:23:53.011876 7fffea3fe700 20 dpdk ipv4::get_packet len 709 -23> 2017-06-23 18:23:53.011878 7fffea3fe700 20 dpdk ipv4::send id=36 60.60.1.103 -> 60.60.1.102 len 729 -22> 2017-06-23 18:23:53.011878 7fffdeff5700 20 -- 60.60.1.103:6789/0 done calling dispatch on 0x55555f49e6c0 -21> 2017-06-23 18:23:53.011879 7fffea3fe700 10 net === tx === proto 800 48:ea:63:5:72:5e -> 48:ea:63:5:72:31 length 743 -20> 2017-06-23 18:23:53.015879 7fffea3fe700 10 net dispatch_packet === rx === proto 800 48:ea:63:5:72:31 -> 48:ea:63:5:72:5e length 60 rss_hash 1231707456 -19> 2017-06-23 18:23:53.015884 7fffea3fe700 10 dpdk handle_received_packet get 6 packet from 60.60.1.102 -> 60.60.1.103 id=17 ip_len=40 ip_hdr_len=20 pkt_len=46 offset=0 -18> 2017-06-23 18:23:53.015886 7fffea3fe700 20 dpdk handle_received_packet learn mac 48:ea:63:5:72:31 with 3c.3c.1.66 -17> 2017-06-23 18:23:53.015888 7fffea3fe700 20 tcp 60.60.1.103:6789 -> 60.60.1.102:50627 tcb(0x55555efc36c0 fd=2 s=ESTABLISHED).input_handle_other_state tcp header seq 719263757 ack 3762778979 snd next 3762778979 unack 3762778290 rcv next 719263757 len 0 fin=1 syn=0 -16> 2017-06-23 18:23:53.015891 7fffea3fe700 20 dpdk notify fd=2 mask=2 -15> 2017-06-23 18:23:53.015892 7fffea3fe700 20 dpdk notify activing=2 listening=1 waiting_idx=0 -14> 2017-06-23 18:23:53.015893 7fffea3fe700 20 dpdk notify activing=2 listening=1 waiting_idx=0 done -13> 2017-06-23 18:23:53.015894 7fffea3fe700 20 tcp 60.60.1.103:6789 -> 60.60.1.102:50627 tcb(0x55555efc36c0 fd=2 s=ESTABLISHED).operator() window update seg_seq=719263757 seg_ack=3762778979 old window=29200 new window=7 -12> 2017-06-23 18:23:53.015896 7fffea3fe700 20 dpdk notify fd=2 mask=1 -11> 2017-06-23 18:23:53.015897 7fffea3fe700 20 dpdk notify activing=2 listening=1 waiting_idx=0 -10> 2017-06-23 18:23:53.015898 7fffea3fe700 20 dpdk notify activing=3 listening=1 waiting_idx=1 done -9> 2017-06-23 18:23:53.015898 7fffea3fe700 20 tcp 60.60.1.103:6789 -> 60.60.1.102:50627 tcb(0x55555efc36c0 
fd=2 s=ESTABLISHED).input_handle_other_state fin: SYN_RECEIVED or ESTABLISHED -> CLOSE_WAIT -8> 2017-06-23 18:23:53.015901 7fffea3fe700 20 dpdk poll fd=2 mask=1 -7> 2017-06-23 18:23:53.015902 7fffea3fe700 20 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).process prev state is STATE_OPEN -6> 2017-06-23 18:23:53.015906 7fffea3fe700 1 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).read_bulk peer close file descriptor 2 -5> 2017-06-23 18:23:53.015909 7fffea3fe700 1 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).read_until read failed -4> 2017-06-23 18:23:53.015913 7fffea3fe700 1 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).process read tag failed -3> 2017-06-23 18:23:53.015916 7fffea3fe700 1 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).fault on lossy channel, failing -2> 2017-06-23 18:23:53.015918 7fffea3fe700 2 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1)._stop -1> 2017-06-23 18:23:53.015922 7fffea3fe700 10 -- 60.60.1.103:6789/0 >> 60.60.1.102:0/1080485091 conn(0x55555f059800 :6789 s=STATE_OPEN pgs=2 cs=1 l=1).discard_out_queue started 0> 2017-06-23 18:23:53.017861 7fffea3fe700 -1 /home/sda4/lhp/dpdk_ceph/ceph-12.0.3/src/msg/async/Stack.h: In function 'void Worker::release_worker()' thread 7fffea3fe700 time 2017-06-23 18:23:53.015931 /home/sda4/lhp/dpdk_ceph/ceph-12.0.3/src/msg/async/Stack.h: 253: FAILED assert(oldref > 0) ceph version 12.0.3 (f2337d1b42fa49dbb0a93e4048a42762e3dffbbf) 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x128) [0x555555accea8] 2: (AsyncConnection::_stop()+0x2be) [0x555555d4110e] 3: (AsyncConnection::fault()+0x5fa) [0x555555d448ea] 4: 
(AsyncConnection::process()+0x1388) [0x555555d507f8] 5: (EventCenter::process_events(int)+0x352) [0x555555b7c782] 6: (()+0x62c2da) [0x555555b802da] 7: (()+0x63ad91) [0x555555b8ed91] 8: (()+0xa1bb) [0x7ffff64841bb] 9: (()+0x7df5) [0x7ffff738edf5] 10: (clone()+0x6d) [0x7ffff34781ad] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. --- logging levels --- 0/ 5 none 0/ 1 lockdep 0/ 1 context 1/ 1 crush 1/ 5 mds 1/ 5 mds_balancer 1/ 5 mds_locker 1/ 5 mds_log 1/ 5 mds_log_expire 1/ 5 mds_migrator 0/ 1 buffer 0/ 1 timer 0/ 1 filer 0/ 1 striper 0/ 1 objecter 0/ 5 rados 0/ 5 rbd 0/ 5 rbd_mirror 0/ 5 rbd_replay 0/ 5 journaler 0/ 5 objectcacher 0/ 5 client 1/ 5 osd 0/ 5 optracker 0/ 5 objclass 1/ 3 filestore 1/ 3 journal 20/20 ms 1/ 5 mon 0/10 monc 1/ 5 paxos 0/ 5 tp 1/ 5 auth 1/ 5 crypto 1/ 1 finisher 1/ 5 heartbeatmap 1/ 5 perfcounter 1/ 5 rgw 1/10 civetweb 1/ 5 javaclient 1/ 5 asok 1/ 1 throttle 0/ 0 refs 1/ 5 xio 1/ 5 compressor 1/ 5 bluestore 1/ 5 bluefs 1/ 3 bdev 1/ 5 kstore 4/ 5 rocksdb 4/ 5 leveldb 4/ 5 memdb 1/ 5 kinetic 1/ 5 fuse 1/ 5 mgr 1/ 5 mgrc 20/20 dpdk 1/ 5 eventtrace -2/-2 (syslog threshold) 99/99 (stderr threshold) max_recent 10000 max_new 1000 log_file --- end dump of recent events --- 2017-06-23 18:23:53.079601 7fffee6b0700 1 -- 60.60.1.103:6789/0 --> 60.60.1.103:6789/0 -- log(last 8) v1 -- 0x55555f48e000 con 0 2017-06-23 18:23:53.079609 7fffee6b0700 20 -- 60.60.1.103:6789/0 >> 60.60.1.103:6789/0 conn(0x55555f44f000 :-1 s=STATE_NONE pgs=0 cs=0 l=0).send_message log(last 8) v1 local 2017-06-23 18:23:53.079649 7fffddbf4700 20 -- 60.60.1.103:6789/0 queue 0x55555f48e000 prio 196 2017-06-23 18:23:53.079669 7fffdeff5700 1 -- 60.60.1.103:6789/0 <== mon.0 60.60.1.103:6789/0 0 ==== log(last 8) v1 ==== 0+0+0 (0 0 0) 0x55555f48e000 con 0x55555f44f000 2017-06-23 18:23:53.079695 7fffdeff5700 20 -- 60.60.1.103:6789/0 done calling dispatch on 0x55555f48e000 *** Caught signal (Aborted) ** in thread 7fffea3fe700 
thread_name:msgr-worker-0 ceph version 12.0.3 (f2337d1b42fa49dbb0a93e4048a42762e3dffbbf) 1: (()+0x8301f6) [0x555555d841f6] 2: (()+0xf130) [0x7ffff7396130] 3: (gsignal()+0x37) [0x7ffff33b75d7] 4: (abort()+0x148) [0x7ffff33b8cc8] 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x2a6) [0x555555acd026] 6: (AsyncConnection::_stop()+0x2be) [0x555555d4110e] 7: (AsyncConnection::fault()+0x5fa) [0x555555d448ea] 8: (AsyncConnection::process()+0x1388) [0x555555d507f8] 9: (EventCenter::process_events(int)+0x352) [0x555555b7c782] 10: (()+0x62c2da) [0x555555b802da] 11: (()+0x63ad91) [0x555555b8ed91] 12: (()+0xa1bb) [0x7ffff64841bb] 13: (()+0x7df5) [0x7ffff738edf5] 14: (clone()+0x6d) [0x7ffff34781ad] 2017-06-23 18:23:53.542570 7fffea3fe700 -1 *** Caught signal (Aborted) ** in thread 7fffea3fe700 thread_name:msgr-worker-0 ceph version 12.0.3 (f2337d1b42fa49dbb0a93e4048a42762e3dffbbf) 1: (()+0x8301f6) [0x555555d841f6] 2: (()+0xf130) [0x7ffff7396130] 3: (gsignal()+0x37) [0x7ffff33b75d7] 4: (abort()+0x148) [0x7ffff33b8cc8] 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x2a6) [0x555555acd026] 6: (AsyncConnection::_stop()+0x2be) [0x555555d4110e] 7: (AsyncConnection::fault()+0x5fa) [0x555555d448ea] 8: (AsyncConnection::process()+0x1388) [0x555555d507f8] 9: (EventCenter::process_events(int)+0x352) [0x555555b7c782] 10: (()+0x62c2da) [0x555555b802da] 11: (()+0x63ad91) [0x555555b8ed91] 12: (()+0xa1bb) [0x7ffff64841bb] 13: (()+0x7df5) [0x7ffff738edf5] 14: (clone()+0x6d) [0x7ffff34781ad] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 
--- begin dump of recent events --- -12> 2017-06-23 18:23:53.062021 7fffe53fa700 5 mon.iozone-103@0(leader).paxos(paxos active c 1..108) queue_pending_finisher 0x55555ed32510 -11> 2017-06-23 18:23:53.079581 7fffee6b0700 4 mgrc handle_mgr_map Got map version 1 -10> 2017-06-23 18:23:53.079585 7fffee6b0700 4 mgrc handle_mgr_map Active mgr is now - -9> 2017-06-23 18:23:53.079587 7fffee6b0700 4 mgrc reconnect No active mgr available yet -8> 2017-06-23 18:23:53.079597 7fffee6b0700 2 mon.iozone-103@0(leader) e2 send_reply 0x55555f0789a0 0x55555f48e000 log(last 8) v1 -7> 2017-06-23 18:23:53.079601 7fffee6b0700 1 -- 60.60.1.103:6789/0 --> 60.60.1.103:6789/0 -- log(last 8) v1 -- 0x55555f48e000 con 0 -6> 2017-06-23 18:23:53.079609 7fffee6b0700 20 -- 60.60.1.103:6789/0 >> 60.60.1.103:6789/0 conn(0x55555f44f000 :-1 s=STATE_NONE pgs=0 cs=0 l=0).send_message log(last 8) v1 local -5> 2017-06-23 18:23:53.079649 7fffddbf4700 20 -- 60.60.1.103:6789/0 queue 0x55555f48e000 prio 196 -4> 2017-06-23 18:23:53.079669 7fffdeff5700 1 -- 60.60.1.103:6789/0 <== mon.0 60.60.1.103:6789/0 0 ==== log(last 8) v1 ==== 0+0+0 (0 0 0) 0x55555f48e000 con 0x55555f44f000 -3> 2017-06-23 18:23:53.079682 7fffdeff5700 10 log_client handle_log_ack log(last 8) v1 -2> 2017-06-23 18:23:53.079685 7fffdeff5700 10 log_client logged 2017-06-23 18:23:53.011756 mon.0 60.60.1.103:6789/0 8 : audit [DBG] from='client.? 
60.60.1.102:0/1080485091' entity='client.admin' cmd=[{"prefix": "status"}]: dispatch -1> 2017-06-23 18:23:53.079695 7fffdeff5700 20 -- 60.60.1.103:6789/0 done calling dispatch on 0x55555f48e000 0> 2017-06-23 18:23:53.542570 7fffea3fe700 -1 *** Caught signal (Aborted) ** in thread 7fffea3fe700 thread_name:msgr-worker-0 ceph version 12.0.3 (f2337d1b42fa49dbb0a93e4048a42762e3dffbbf) 1: (()+0x8301f6) [0x555555d841f6] 2: (()+0xf130) [0x7ffff7396130] 3: (gsignal()+0x37) [0x7ffff33b75d7] 4: (abort()+0x148) [0x7ffff33b8cc8] 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x2a6) [0x555555acd026] 6: (AsyncConnection::_stop()+0x2be) [0x555555d4110e] 7: (AsyncConnection::fault()+0x5fa) [0x555555d448ea] 8: (AsyncConnection::process()+0x1388) [0x555555d507f8] 9: (EventCenter::process_events(int)+0x352) [0x555555b7c782] 10: (()+0x62c2da) [0x555555b802da] 11: (()+0x63ad91) [0x555555b8ed91] 12: (()+0xa1bb) [0x7ffff64841bb] 13: (()+0x7df5) [0x7ffff738edf5] 14: (clone()+0x6d) [0x7ffff34781ad] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. --- logging levels --- 0/ 5 none 0/ 1 lockdep 0/ 1 context 1/ 1 crush 1/ 5 mds 1/ 5 mds_balancer 1/ 5 mds_locker 1/ 5 mds_log 1/ 5 mds_log_expire 1/ 5 mds_migrator 0/ 1 buffer 0/ 1 timer 0/ 1 filer 0/ 1 striper 0/ 1 objecter 0/ 5 rados 0/ 5 rbd 0/ 5 rbd_mirror 0/ 5 rbd_replay 0/ 5 journaler 0/ 5 objectcacher 0/ 5 client 1/ 5 osd 0/ 5 optracker 0/ 5 objclass 1/ 3 filestore 1/ 3 journal 20/20 ms 1/ 5 mon 0/10 monc 1/ 5 paxos 0/ 5 tp 1/ 5 auth 1/ 5 crypto 1/ 1 finisher 1/ 5 heartbeatmap 1/ 5 perfcounter 1/ 5 rgw 1/10 civetweb 1/ 5 javaclient 1/ 5 asok 1/ 1 throttle 0/ 0 refs 1/ 5 xio 1/ 5 compressor 1/ 5 bluestore 1/ 5 bluefs 1/ 3 bdev 1/ 5 kstore 4/ 5 rocksdb 4/ 5 leveldb 4/ 5 memdb 1/ 5 kinetic 1/ 5 fuse 1/ 5 mgr 1/ 5 mgrc 20/20 dpdk 1/ 5 eventtrace -2/-2 (syslog threshold) 99/99 (stderr threshold) max_recent 10000 max_new 1000 log_file --- end dump of recent events --- Aborted
Here is my configuration for DPDK:
ms_async_op_threads = 2
ms_dpdk_coremask = 0xF
ms_dpdk_host_ipv4_addr = 60.60.1.103
ms_dpdk_netmask_ipv4_addr = 255.255.0.0
ms_dpdk_gateway_ipv4_addr = 60.60.1.0
ms_cluster_type = async+dpdk
ms_public_type = async+dpdk
public_addr = 60.60.1.103
cluster_addr = 60.60.1.103
ms_dpdk_port_id = 0
ms_dpdk_hugepages = /mnt/huge
ms_dpdk_rx_buffer_count_per_core = 2048
ms_dpdk_memory_channel = 2
Ceph version info:
[root@iozone-103 ~]# ceph -v
ceph version 12.0.3 (f2337d1b42fa49dbb0a93e4048a42762e3dffbbf)