Bug #41162

checking con->peer_features in calc_target() is fundamentally racy

Added by Ilya Dryomov over 1 year ago. Updated over 1 year ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: libceph
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

I think I see the problem.

1) creating the OSD session doesn't open the TCP socket; the messenger
   opens it later, in try_write()

2) con->peer_features remains 0 until the TCP socket is opened and
   either CEPH_MSGR_TAG_SEQ or CEPH_MSGR_TAG_READY is received

3) calc_target() uses con->peer_features to check for RESEND_ON_SPLIT
   and resends only if it's set

What happened is that there was no osd0 session, so tid 199 had to
create it before submitting itself to the messenger.  However, before
the messenger got to tid 199, some other request with a pre-split epoch
had reached some other OSD and triggered osdmap incrementals.  While
processing the split incremental, calc_target() for tid 199 returned
NO_ACTION:

  if (unpaused || legacy_change || force_resend ||
      (split && con && CEPH_HAVE_FEATURE(con->peer_features,
                                         RESEND_ON_SPLIT)))
          ct_res = CALC_TARGET_NEED_RESEND;
  else
          ct_res = CALC_TARGET_NO_ACTION;

This isn't logged, but I'm pretty sure that split was 1, con wasn't
NULL and con->peer_features was 0.
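
To make the failure mode concrete, here is a minimal userspace sketch of
the decision above.  It is not the kernel code: struct conn_model, the
FEATURE_RESEND_ON_SPLIT bit value and the helper names are stand-ins
made up for illustration.  Only the shape of the condition is taken from
the snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder bit, NOT the real feature value used by libceph. */
#define FEATURE_RESEND_ON_SPLIT (1ULL << 0)

enum calc_target_result {
    CALC_TARGET_NO_ACTION,
    CALC_TARGET_NEED_RESEND,
};

/* Hypothetical model of a connection: features stay 0 until the handshake. */
struct conn_model {
    uint64_t peer_features;
};

/* Same condition shape as the quoted calc_target() fragment. */
enum calc_target_result decide(bool unpaused, bool legacy_change,
                               bool force_resend, bool split,
                               const struct conn_model *con)
{
    if (unpaused || legacy_change || force_resend ||
        (split && con && (con->peer_features & FEATURE_RESEND_ON_SPLIT)))
        return CALC_TARGET_NEED_RESEND;
    return CALC_TARGET_NO_ACTION;
}

/* Decision for a map change where the split is the only trigger. */
enum calc_target_result decide_on_split(const struct conn_model *con)
{
    return decide(false, false, false, true, con);
}

const char *result_name(enum calc_target_result r)
{
    return r == CALC_TARGET_NEED_RESEND ? "NEED_RESEND" : "NO_ACTION";
}

/* Replay the scenario: same split, evaluated before and after the handshake. */
void split_race_scenario(void)
{
    /* Session exists, but the socket has not been opened yet. */
    struct conn_model con = { .peer_features = 0 };
    enum calc_target_result before = decide_on_split(&con);

    /* Handshake finished; the peer advertised the feature. */
    con.peer_features |= FEATURE_RESEND_ON_SPLIT;
    enum calc_target_result after = decide_on_split(&con);

    assert(before == CALC_TARGET_NO_ACTION);
    assert(after == CALC_TARGET_NEED_RESEND);
    printf("before handshake: %s\n", result_name(before));
    printf("after handshake:  %s\n", result_name(after));
}
```

Calling split_race_scenario() from a main() prints NO_ACTION for the
first evaluation and NEED_RESEND for the second, even though nothing
about the split itself changed in between.  The outcome depends only on
whether the handshake happened to finish first.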

I would have noticed this earlier, but I was misled by the osdc
output.  It shows outdated up/acting sets, and that led me to assume
that split was 0 (because if split is 1, up/acting sets are updated
_before_ the con->peer_features check).  However, in this case up/acting
sets remained the same for a few epochs after the split, so they are
not actually outdated with respect to the split.

Checking con->peer_features in calc_target() is fundamentally racy...

https://marc.info/?l=ceph-devel&m=156526574331187&w=2

History

#1 Updated by Ilya Dryomov over 1 year ago

  • Status changed from In Progress to Fix Under Review
  • Priority changed from Normal to Urgent

[PATCH] libceph: fix PG split vs OSD (re)connect race

#3 Updated by Ilya Dryomov over 1 year ago

  • Status changed from Pending Backport to Resolved

In 4.14.141, 4.19.69 and 5.2.11.
