Backport #18657

jewel: Fix OSD network address in OSD heartbeat_check log message

Added by Vikhyat Umrao 8 months ago. Updated 6 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
Release:
jewel

History

#1 Updated by Vikhyat Umrao 8 months ago

- In master and kraken the log message is correct.
- In jewel the following change is needed:

diff --git a/src/osd/OSD.cc b/src/osd/OSD.cc
index a5c92fb..cfd6f77 100644
--- a/src/osd/OSD.cc
+++ b/src/osd/OSD.cc
@@ -4056,13 +4056,13 @@ void OSD::heartbeat_check()
     if (p->second.is_unhealthy(cutoff)) {
       if (p->second.last_rx_back == utime_t() ||
          p->second.last_rx_front == utime_t()) {
-       derr << "heartbeat_check: no reply from " << p->second.con_front->get_peer_addr().get_sockaddr()
+       derr << "heartbeat_check: no reply from " << p->second.con_front->get_peer_addr().get_sockaddr_storage()
             << " osd." << p->first << " ever on either front or back, first ping sent " 
             << p->second.first_tx << " (cutoff " << cutoff << ")" << dendl;
        // fail
        failure_queue[p->first] = p->second.last_tx;
       } else {
-       derr << "heartbeat_check: no reply from " << p->second.con_front->get_peer_addr().get_sockaddr()
+       derr << "heartbeat_check: no reply from " << p->second.con_front->get_peer_addr().get_sockaddr_storage()
             << " osd." << p->first << " since back " << p->second.last_rx_back
             << " front " << p->second.last_rx_front
             << " (cutoff " << cutoff << ")" << dendl;

#2 Updated by Vikhyat Umrao 8 months ago

- With the change from the previous update applied, the log shows the peer address in IP:port form:

2017-01-25 04:05:08.310025 7f0c9b846700 -1 osd.0 11 heartbeat_check: no reply from 192.168.200.47:6811 osd.2 since back 2017-01-25 04:04:47.999913 front 2017-01-25 04:04:47.999913 (cutoff 2017-01-25 04:04:48.309992)

2017-01-25 04:05:08.501377 7f0c82e8c700 -1 osd.0 11 heartbeat_check: no reply from 192.168.200.47:6811 osd.2 since back 2017-01-25 04:04:47.999913 front 2017-01-25 04:04:47.999913 (cutoff 2017-01-25 04:04:48.501341)

#5 Updated by Loic Dachary 7 months ago

  • Tracker changed from Bug to Backport
  • Description updated (diff)
  • Status changed from New to In Progress

Original description

- Tracker [1] introduced the OSD network address in the heartbeat_check log message.
- In the master branch it works as expected, as shown in [2], but the jewel backport [3] does not: it prints the network address as a hexadecimal value.

2017-01-25 00:04:16.113016 7fbe730ba700 -1 osd.1 11 heartbeat_check: no reply from 0x7fbe89c27290 osd.0 since back 2017-01-25 00:03:56.099392 front 2017-01-25 00:03:56.099392 (cutoff 2017-01-25 00:03:56.112991)
2017-01-25 00:04:17.113168 7fbe730ba700 -1 osd.1 11 heartbeat_check: no reply from 0x7fbe89c27290 osd.0 since back 2017-01-25 00:03:56.099392 front 2017-01-25 00:03:56.099392 (cutoff 2017-01-25 00:03:57.113144)
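
One plausible reading of the hex value above is that a pointer, rather than an address object, reached the stream, so the generic pointer formatter printed a memory address. The sketch below reproduces that effect in isolation. It is an illustration under that assumption, not Ceph code: `DemoAddr`, `format_via_pointer`, `format_via_reference`, and `demo` are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for an address type; this is NOT Ceph's entity_addr_t.
struct DemoAddr {
  std::string ip;
  std::uint16_t port;
};

// Formatter for the object itself: prints "ip:port".
std::ostream& operator<<(std::ostream& out, const DemoAddr& a) {
  return out << a.ip << ":" << a.port;
}

// Streams a pointer to the object. No overload accepts const DemoAddr*,
// so the standard const void* overload is selected and the raw memory
// address is printed (its exact text is implementation-defined).
std::string format_via_pointer(const DemoAddr& a) {
  std::ostringstream os;
  os << &a;
  return os.str();
}

// Streams the object by reference, so the "ip:port" formatter above is used.
std::string format_via_reference(const DemoAddr& a) {
  std::ostringstream os;
  os << a;
  return os.str();
}

// Small usage check: the by-reference form is readable, the pointer form is not.
void demo() {
  const DemoAddr peer{"192.168.200.47", 6811};
  assert(format_via_reference(peer) == "192.168.200.47:6811");
  assert(format_via_pointer(peer) != format_via_reference(peer));
}
```

Calling `demo()` from a test or from `main` exercises both paths. The same pattern applies to any type whose stream formatter is defined for the object (or for one specific pointer type) but not for the pointer type that is actually passed.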

[1] http://tracker.ceph.com/issues/16337
[2] https://github.com/ceph/ceph/pull/9223
[3] https://github.com/ceph/ceph/pull/9739

#6 Updated by Loic Dachary 7 months ago

  • Description updated (diff)

#7 Updated by Ken Dreyer 6 months ago

  • Status changed from In Progress to Resolved
  • Target version set to v10.2.7
