Bug #42452

msg/async: the event center is blocked by rdma construct connection for transport ib sync msg

Added by Peng Liu over 4 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport: nautilus, mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: ceph-deploy
Component(RADOS): Messenger
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In msg/async/rdma, we open a TCP connection to exchange the IB sync message. If the
remote node is down (for example, after an unexpected shutdown), net.connect blocks until
the timeout is reached, which in turn blocks the event center.

This bug can cause mon probe timeouts, OSDs that fail to reply, and similar symptoms.
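
A common way to avoid this kind of stall is to start the connect in non-blocking mode and let the event loop wait for the socket to become writable. The following is a minimal Python sketch of that general technique, not the fix that was merged for this issue. It assumes POSIX socket semantics (EINPROGRESS, SO_ERROR); `connect_nonblocking` and the `on_idle` hook are hypothetical names used only for illustration.

```python
import errno
import os
import selectors
import socket
import time


def connect_nonblocking(host, port, timeout=2.0, on_idle=None):
    """Connect over TCP without one long blocking connect() call.

    host should be an IP literal so that no DNS lookup blocks here.
    on_idle (hypothetical hook) stands in for the other events a real
    event loop would service while the handshake is still pending.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    sel = selectors.DefaultSelector()
    try:
        # connect_ex() returns an errno instead of raising. On POSIX,
        # EINPROGRESS means the handshake continues in the background.
        err = sock.connect_ex((host, port))
        if err not in (0, errno.EINPROGRESS):
            raise OSError(err, os.strerror(err))

        # The socket becomes writable once the connect succeeds or fails.
        sel.register(sock, selectors.EVENT_WRITE)
        deadline = time.monotonic() + timeout
        while True:
            if on_idle is not None:
                on_idle()
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise TimeoutError("connect to %s:%d timed out" % (host, port))
            # Wait in short slices so other work is never starved for long.
            if sel.select(timeout=min(remaining, 0.05)):
                break

        # SO_ERROR reports the final result of the asynchronous connect.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err != 0:
            raise OSError(err, os.strerror(err))
        return sock
    except BaseException:
        sock.close()
        raise
    finally:
        sel.close()
```

With a dead peer, the caller gets a `TimeoutError` after `timeout` seconds, but `on_idle` keeps running roughly every 50 ms in the meantime, so other handlers are not starved the way they are behind a single blocking connect.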


Related issues

Copied to RADOS - Backport #44369: mimic: msg/async: the event center is blocked by rdma construct connection for transport ib sync msg (Rejected)
Copied to RADOS - Backport #44370: nautilus: msg/async: the event center is blocked by rdma construct connection for transport ib sync msg (Resolved)

History

#1 Updated by Peng Liu over 4 years ago

How to trigger this bug:

1. Use async+rdma.
2. Reboot a server.
3. Observe the cluster recovery time.
4. Observe whether any healthy OSDs are marked down.
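
The last step can be checked by script rather than by eye. This is a minimal sketch, assuming the JSON emitted by `ceph osd tree --format json` has a top-level `nodes` list whose OSD entries carry `type`, `name`, and `status` fields. The sample data is made up for illustration, and `down_osds` is a hypothetical helper.

```python
import json


def down_osds(osd_tree_json):
    """Return the sorted names of OSDs whose status is "down".

    Assumes the input has the shape described above: a "nodes" list
    whose OSD entries include "type", "name", and "status".
    """
    tree = json.loads(osd_tree_json)
    return sorted(
        node["name"]
        for node in tree.get("nodes", [])
        if node.get("type") == "osd" and node.get("status") == "down"
    )


# Made-up sample, not captured from a real cluster.
sample = json.dumps({
    "nodes": [
        {"id": -2, "name": "host-a", "type": "host"},
        {"id": 0, "name": "osd.0", "type": "osd", "status": "up"},
        {"id": 1, "name": "osd.1", "type": "osd", "status": "down"},
    ]
})

print(down_osds(sample))  # prints ['osd.1']
```

Running this periodically while the rebooted server is down shows whether OSDs on other, healthy hosts appear in the list.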

#2 Updated by Kefu Chai over 4 years ago

  • Status changed from New to Fix Under Review
  • Assignee set to Peng Liu
  • Pull request ID set to 31109

#3 Updated by Nathan Cutler over 4 years ago

  • Backport set to nautilus, mimic

#4 Updated by Kefu Chai over 4 years ago

  • Project changed from mgr to RADOS
  • Component(RADOS) Messenger added

#5 Updated by Kefu Chai about 4 years ago

  • Status changed from Fix Under Review to Pending Backport

#6 Updated by Nathan Cutler about 4 years ago

  • Copied to Backport #44369: mimic: msg/async: the event center is blocked by rdma construct connection for transport ib sync msg added

#7 Updated by Nathan Cutler about 4 years ago

  • Copied to Backport #44370: nautilus: msg/async: the event center is blocked by rdma construct connection for transport ib sync msg added

#8 Updated by Nathan Cutler about 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
