Bug #42452
closed
msg/async: the event center is blocked by rdma construct conection for transport ib sync msg
Added by Peng Liu over 4 years ago.
Updated over 3 years ago.
ceph-qa-suite: ceph-deploy
Component(RADOS): Messenger
Description
In msg/async/rdma, we construct a TCP connection to transport the IB sync message. If the
remote node is down (for example, shut down by accident), net.connect blocks until the timeout
is reached, which blocks the event center.
This bug can cause mon probe timeouts, OSDs that fail to reply, and similar problems.
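For illustration, a minimal sketch of the general remedy for this class of problem: start the TCP connect in non-blocking mode and let the event loop check for completion later, so that an unreachable peer never stalls the calling thread. This is not the code from the pull request. The helper names `start_nonblocking_connect` and `check_connect` are hypothetical, and the sketch uses plain POSIX sockets rather than Ceph's own networking helpers.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper (not from the Ceph tree): begin a TCP connect
// without blocking. Returns the socket fd, or -1 on immediate failure.
int start_nonblocking_connect(const char* ip, uint16_t port) {
  int fd = ::socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) return -1;

  // Switch the socket to non-blocking mode before calling connect().
  int flags = ::fcntl(fd, F_GETFL, 0);
  if (flags < 0 || ::fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
    ::close(fd);
    return -1;
  }

  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_port = htons(port);
  if (::inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
    ::close(fd);
    return -1;
  }

  // EINPROGRESS means the handshake continues in the background.
  int r = ::connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
  if (r < 0 && errno != EINPROGRESS) {
    ::close(fd);
    return -1;
  }
  return fd;
}

// Hypothetical helper: poll the pending connect for at most wait_ms.
// Returns 1 if connected, 0 if still pending, -1 if the connect failed.
int check_connect(int fd, int wait_ms) {
  pollfd pfd{};
  pfd.fd = fd;
  pfd.events = POLLOUT;
  int n = ::poll(&pfd, 1, wait_ms);
  if (n == 0) return 0;
  if (n < 0) return errno == EINTR ? 0 : -1;

  // The socket is writable or in error; SO_ERROR holds the final result.
  int err = 0;
  socklen_t len = sizeof(err);
  if (::getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0) {
    return -1;
  }
  return 1;
}
```

In an event-driven messenger, the fd returned by `start_nonblocking_connect` would typically be registered for writability with the event loop, and the `SO_ERROR` check would run from that callback rather than from a direct `poll` call.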
How to trigger this bug:
1. use async+rdma;
2. reboot a server;
3. observe the cluster recovery time;
4. observe whether any healthy OSDs are marked down.
- Status changed from New to Fix Under Review
- Assignee set to Peng Liu
- Pull request ID set to 31109
- Backport set to nautilus, mimic
- Project changed from mgr to RADOS
- Component(RADOS) Messenger added
- Status changed from Fix Under Review to Pending Backport
- Copied to Backport #44369: mimic: msg/async: the event center is blocked by rdma construct conection for transport ib sync msg added
- Copied to Backport #44370: nautilus: msg/async: the event center is blocked by rdma construct conection for transport ib sync msg added
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".