Bug #41833

[rbd-mirror] image status reports "down" after msgr v2 reconnect

Added by Jason Dillaman over 4 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Assignee: Jason Dillaman
Target version: -
% Done: 0%
Source:
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In a test environment, start two clusters, each with a single OSD, and mirror an image. At this point, "mirror image status" should report "up+replaying". However, after the OSD on the receiving cluster is restarted, the status reports "down+replaying" indefinitely. It appears that upon the second reconnect to the OSD, the entity_addr._type for the "rbd_mirroring" watcher switches from TYPE_ANY to TYPE_MSGR2 and therefore fails the comparison in the status update process. In the gdb output below, ondisk_status.origin and the watcher entry match in name, nonce, and address, and differ only in addr.type (3 versus 1).

(gdb) print ondisk_status.origin
$3 = {name = {_type = 8 '\b', _num = 4154, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, 
    static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, addr = {
    static TYPE_DEFAULT = entity_addr_t::TYPE_MSGR2, type = 3, nonce = 3569845997, u = {sa = {sa_family = 2, 
        sa_data = "\000\000\300\250\001\003\000\000\000\000\000\000\000"}, sin = {sin_family = 2, sin_port = 0, 
        sin_addr = {s_addr = 50440384}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, 
        sin6_port = 0, sin6_flowinfo = 50440384, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, 
            __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}
(gdb) print watchers
$4 = std::set with 1 element = {[0] = {name = {_type = 8 '\b', _num = 4154, static TYPE_MON = 1, 
      static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, 
    addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_MSGR2, type = 1, nonce = 3569845997, u = {sa = {sa_family = 2, 
          sa_data = "\000\000\300\250\001\003\000\000\000\000\000\000\000"}, sin = {sin_family = 2, sin_port = 0, 
          sin_addr = {s_addr = 50440384}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, 
          sin6_port = 0, sin6_flowinfo = 50440384, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, 
              __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}}
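
The mismatch described above can be illustrated with a minimal sketch. The `Addr` struct and both comparison helpers are hypothetical stand-ins, not Ceph's `entity_addr_t` or its operators; the field values are copied from the gdb output. The type-agnostic comparison is only one conceivable way to tolerate the difference and is not necessarily what the actual fix does.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical, simplified address record (NOT Ceph's entity_addr_t).
struct Addr {
    std::uint32_t type;   // address type tag, raw value as printed by gdb
    std::uint32_t nonce;  // per-client nonce
    std::uint32_t ip;     // s_addr as printed by gdb
    std::uint16_t port;   // sin_port as printed by gdb
};

// Strict equality: every field, including the type tag, must match.
inline bool strict_equal(const Addr& a, const Addr& b) {
    return std::tie(a.type, a.nonce, a.ip, a.port) ==
           std::tie(b.type, b.nonce, b.ip, b.port);
}

// Type-agnostic comparison (an assumption for illustration):
// the type tag is ignored, everything else must match.
inline bool same_endpoint(const Addr& a, const Addr& b) {
    return std::tie(a.nonce, a.ip, a.port) ==
           std::tie(b.nonce, b.ip, b.port);
}

// Usage example with the values from the gdb session.
inline void check_example() {
    const Addr origin{3, 3569845997u, 50440384u, 0};   // ondisk_status.origin
    const Addr watcher{1, 3569845997u, 50440384u, 0};  // entry in watchers

    assert(!strict_equal(origin, watcher));  // type differs, so the watcher is not matched
    assert(same_endpoint(origin, watcher));  // same client once the type is ignored
}
```

With a strict comparison, the stored origin never matches the live watcher after the type changes, which is consistent with the status staying "down" even though the daemon is running.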

Related issues

Copied to rbd - Backport #41968: nautilus: [rbd-mirror] image status reports "down" after msgr v2 reconnect Resolved

History

#1 Updated by Jason Dillaman over 4 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 30438

#2 Updated by Jason Dillaman over 4 years ago

  • Backport changed from luminous,mimic,nautilus to nautilus

#3 Updated by Mykola Golub over 4 years ago

  • Status changed from Fix Under Review to Pending Backport

#4 Updated by Nathan Cutler over 4 years ago

  • Copied to Backport #41968: nautilus: [rbd-mirror] image status reports "down" after msgr v2 reconnect added

#5 Updated by Jason Dillaman over 4 years ago

  • Status changed from Pending Backport to Resolved
