Bug #42859
Ceph utils hang when executing rados.shutdown over the RoCE transport
Description
The transport layer is async+rdma. When I run the ceph utility from the command line, I get the expected results, but the tool then hangs with high probability. The logs are as follows:
2019-11-18 11:39:14.677 7fe2410e9700 20 Event(0x7fe23c06f1b0 nevent=5000 time_id=1).wakeup
2019-11-18 11:39:14.677 7fe2410e9700 20 Event(0x7fe23c070880 nevent=5000 time_id=5).wakeup
2019-11-18 11:39:14.677 7fe2410e9700 20 Event(0x7fe23c06ab80 nevent=5000 time_id=5).wakeup
2019-11-18 11:39:14.677 7fe2410e9700 10 -- 172.16.26.34:0/1717388727 wait: waiting for dispatch queue
2019-11-18 11:39:14.677 7fe2410e9700 10 -- 172.16.26.34:0/1717388727 wait: dispatch queue is stopped
2019-11-18 11:39:14.677 7fe2410e9700 1 -- 172.16.26.34:0/1717388727 shutdown_connections
2019-11-18 11:39:14.677 7fe2410e9700 20 Event(0x7fe23c06f1b0 nevent=5000 time_id=1).wakeup
2019-11-18 11:39:14.677 7fe2410e9700 20 Event(0x7fe23c070880 nevent=5000 time_id=5).wakeup
2019-11-18 11:39:14.677 7fe2410e9700 20 Event(0x7fe23c06ab80 nevent=5000 time_id=5).wakeup
2019-11-18 11:39:14.677 7fe2410e9700 10 -- 172.16.26.34:0/1717388727 wait: done.
2019-11-18 11:39:14.677 7fe2410e9700 1 -- 172.16.26.34:0/1717388727 wait complete.
2019-11-18 11:39:14.677 7fe2410e9700 1 librados: shutdown
2019-11-18 11:39:14.677 7fe2410e9700 1 -- 172.16.26.34:0/1717388727 >> 172.16.26.34:0/1717388727 conn(0x7fe23c053040 msgr2=0x7fe23c053440 unknown :-1 s=STATE_NONE l=0).mark_down
2019-11-18 11:39:14.677 7fe2410e9700 20 --1- 172.16.26.34:0/1717388727 >> 172.16.26.34:0/1717388727 conn(0x7fe23c053040 0x7fe23c053440 :-1 s=NONE pgs=0 cs=0 l=0).stop
2019-11-18 11:39:14.677 7fe2410e9700 2 --1- 172.16.26.34:0/1717388727 >> 172.16.26.34:0/1717388727 conn(0x7fe23c053040 0x7fe23c053440 :-1 s=NONE pgs=0 cs=0 l=0).stop
2019-11-18 11:39:14.677 7fe2410e9700 10 --1- 172.16.26.34:0/1717388727 >> 172.16.26.34:0/1717388727 conn(0x7fe23c053040 0x7fe23c053440 :-1 s=NONE pgs=0 cs=0 l=0).discard_out_queue started
2019-11-18 11:39:14.677 7fe2410e9700 20 Event(0x7fe23c06f1b0 nevent=5000 time_id=1).wakeup
2019-11-18 11:39:14.677 7fe2410e9700 20 Event(0x7fe23c06f1b0 nevent=5000 time_id=1).wakeup
2019-11-18 11:39:14.677 7fe2410e9700 20 Event(0x7fe23c070880 nevent=5000 time_id=5).wakeup
2019-11-18 11:39:14.677 7fe1e1376700 20 Infiniband rearm_notify started.
2019-11-18 11:39:14.677 7fe1e1376700 20 Infiniband rearm_notify started.
2019-11-18 11:39:14.677 7fe1e1376700 20 RDMAStack polling bypass qp=0x7fe230013940 tx_wr=1
2019-11-18 11:39:14.677 7fe2410e9700 20 Event(0x7fe23c06ab80 nevent=5000 time_id=5).wakeup
2019-11-18 11:39:14.677 7fe2410e9700 20 RDMAStack ~RDMADispatcher destructing rdma dispatcher
2019-11-18 11:39:14.777 7fe1e1376700 20 RDMAStack polling bypass qp=0x7fe230013940 tx_wr=1
2019-11-18 11:39:14.777 7fe1e1376700 20 Infiniband rearm_notify started.
2019-11-18 11:39:14.777 7fe1e1376700 20 Infiniband rearm_notify started.
2019-11-18 11:39:14.777 7fe1e1376700 20 RDMAStack polling bypass qp=0x7fe230013940 tx_wr=1
2019-11-18 11:39:14.777 7fe1e1376700 20 RDMAStack polling bypass qp=0x7fe230013940 tx_wr=1
2019-11-18 11:39:14.777 7fe1e1376700 20 Infiniband rearm_notify started.
2019-11-18 11:39:14.777 7fe1e1376700 20 Infiniband rearm_notify started.
2019-11-18 11:39:14.777 7fe1e1376700 20 RDMAStack polling bypass qp=0x7fe230013940 tx_wr=1
2019-11-18 11:39:14.777 7fe1e1376700 20 RDMAStack polling bypass qp=0x7fe230013940 tx_wr=1
2019-11-18 11:39:14.777 7fe1e1376700 20 Infiniband rearm_notify started.
2019-11-18 11:39:14.777 7fe1e1376700 20 Infiniband rearm_notify started.
2019-11-18 11:39:14.777 7fe1e1376700 20 RDMAStack polling bypass qp=0x7fe230013940 tx_wr=1
2019-11-18 11:39:14.777 7fe1e1376700 20 RDMAStack polling bypass qp=0x7fe230013940 tx_wr=1
It seems there is a bug in the shutdown logic of the async RDMA messenger: an in-flight request never receives a response?
Updated by yong xing over 4 years ago
When I comment out lines 1265 and 1266 in /usr/bin/ceph (which skips the shutdown logic), the tool functions well.
1262 if __name__ == '__main__':
1263 retval = main()
1264 # shutdown explicitly; Rados() does not
1265 if cluster_handle:
1266 run_in_thread(cluster_handle.shutdown)
1267 sys.exit(retval)
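Since the hang occurs only inside the final shutdown call, one possible workaround (a sketch, not part of the original report; `run_with_timeout_guard` is a hypothetical helper, not a Ceph API) is to run the shutdown in a daemon thread with a timeout, so the process can still exit even if the RDMA teardown never completes:

```python
import threading

def run_with_timeout_guard(fn, timeout=10.0):
    """Run fn in a daemon thread and wait at most `timeout` seconds.

    Returns True if fn finished in time, False if it is still running.
    Because the thread is a daemon, a stuck fn (e.g. a hanging
    cluster_handle.shutdown) no longer blocks interpreter exit.
    """
    t = threading.Thread(target=fn, daemon=True)
    t.start()
    t.join(timeout)
    return not t.is_alive()

if __name__ == '__main__':
    # Stand-in for cluster_handle.shutdown; a real call would go here.
    print(run_with_timeout_guard(lambda: None))
```

This does not fix the underlying messenger bug: the leaked in-flight work requests are simply abandoned rather than reaped, but the command-line tool regains a bounded exit time.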
Updated by yong xing over 4 years ago
Sometimes the command "ceph status" hung for several seconds and then exited with the following debug logs.
2019-11-25 08:12:51.229 7f79b4b79700 20 Infiniband rearm_notify started.
2019-11-25 08:12:51.229 7f79b4b79700 20 Infiniband rearm_notify started.
2019-11-25 08:12:51.229 7f79b4b79700 20 RDMAStack polling tx completion queue got 2 responses.
2019-11-25 08:12:51.229 7f79b4b79700 1 RDMAStack handle_tx_event Work Request Flushed Error: this connection's qp=17135 should be down while this WR=140160755516176 still in flight.
2019-11-25 08:12:51.229 7f79b4b79700 1 RDMAStack handle_tx_event Work Request Flushed Error: this connection's qp=17135 should be down while this WR=140160755591296 still in flight.
2019-11-25 08:12:51.229 7f79b4b79700 1 RDMAStack handle_tx_event sending of the disconnect msg completed
2019-11-25 08:12:51.229 7f79b4b79700 10 RDMAStack polling finally delete qp=0x7f79b80c6880
2019-11-25 08:12:51.229 7f79b4b79700 20 Infiniband ~QueuePair destroy qp=0x7f79b80c6908