Bug #47697

closed

mon: set session_timeout when adding to session_map

Added by Ilya Dryomov over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version: -
% Done: 0%
Source:
Tags:
Backport: nautilus,octopus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

With msgr2, the session is added in Monitor::ms_handle_accept(), which ProtocolV2 queues at the end of handling the CLIENT_IDENT frame, before responding with the SERVER_IDENT frame. At that point session_timeout is 0; it gets set only in Monitor::ms_dispatch(). If the session trimming code in Monitor::tick() reaches the session before the peer has received and handled our SERVER_IDENT and sent its first message, and before we have received that message, the session is wrongly closed.

This doesn't happen with msgr1, because there the session is added in Monitor::ms_dispatch(), upon receipt of the first message (MSG_AUTH).
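
The ordering problem described above can be illustrated with a small toy model. This is a sketch in Python, not the monitor's C++ code: ToyMonitor, handle_accept, dispatch, the 300-second interval and the `session_timeout < now` trimming check are all simplifying assumptions made for illustration. Only the names session_map, session_timeout and tick mirror identifiers from the report.

```python
from dataclasses import dataclass


@dataclass
class Session:
    name: str
    # 0 models the unset value that the report says a freshly added session has.
    session_timeout: float = 0.0


class ToyMonitor:
    """Hypothetical stand-in for the monitor's session handling (not Ceph code)."""

    def __init__(self, timeout_interval, set_timeout_on_accept):
        self.timeout_interval = timeout_interval  # assumed, arbitrary value
        self.set_timeout_on_accept = set_timeout_on_accept
        self.session_map = {}

    def handle_accept(self, name, now):
        # msgr2 path: the session is added before the peer's first message arrives.
        session = Session(name)
        if self.set_timeout_on_accept:
            # The change named in the ticket title: set the timeout when adding.
            session.session_timeout = now + self.timeout_interval
        self.session_map[name] = session

    def dispatch(self, name, now):
        # First message from the peer; returns False if the session is already gone.
        session = self.session_map.get(name)
        if session is None:
            return False
        session.session_timeout = now + self.timeout_interval
        return True

    def tick(self, now):
        # Assumed trimming rule: drop every session whose timeout is in the past.
        expired = [n for n, s in self.session_map.items() if s.session_timeout < now]
        for n in expired:
            del self.session_map[n]
        return expired


def race(set_timeout_on_accept):
    """Run accept -> tick -> first message, the interleaving from the description."""
    mon = ToyMonitor(timeout_interval=300.0, set_timeout_on_accept=set_timeout_on_accept)
    mon.handle_accept("client.1", now=100.0)
    trimmed = mon.tick(now=100.5)  # tick runs before the first message arrives
    delivered = mon.dispatch("client.1", now=101.0)
    return trimmed, delivered
```

In this model, race(False) trims the new session before its first message is dispatched, while race(True) leaves it in place. The actual trimming logic in Monitor::tick() may include additional checks that are not modelled here.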


Related issues 2 (0 open, 2 closed)

Copied to RADOS - Backport #47747: octopus: mon: set session_timeout when adding to session_map (Resolved, Wei-Chung Cheng)
Copied to RADOS - Backport #47748: nautilus: mon: set session_timeout when adding to session_map (Resolved, Wei-Chung Cheng)
Actions #1

Updated by Ilya Dryomov over 3 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 37494
Actions #2

Updated by Ilya Dryomov over 3 years ago

  • Backport set to nautilus,octopus
Actions #3

Updated by Patrick Donnelly over 3 years ago

Were there upstream QA tests that failed because of this? How did you learn of this problem?

Actions #4

Updated by Ilya Dryomov over 3 years ago

Not to my knowledge. When it happens, it is mostly transparent to the user -- the peer reopens the socket and attempts to establish a new session (and there is usually more than one monitor to choose from, in case it happens repeatedly).

I stumbled upon it while testing some bits of msgr2 in the kernel client, but it isn't specific to that. The bug is in the monitor itself.
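
Why the failure is mostly transparent can be sketched as a retry loop on the peer side. This is a hypothetical Python illustration, not the behaviour of any particular Ceph client: connect_with_retry, try_session, the round-robin order and the attempt limit are all assumptions, and real clients may pick monitors differently.

```python
def connect_with_retry(monitors, try_session, max_attempts=6):
    """Try monitors in round-robin order until one session survives.

    monitors:    list of monitor names (hypothetical)
    try_session: callable returning True if the session was established
                 and not closed prematurely by the monitor
    Returns (monitor_name, attempts_used), or (None, max_attempts) on failure.
    """
    for attempt in range(max_attempts):
        mon = monitors[attempt % len(monitors)]
        if try_session(mon):
            return mon, attempt + 1
    return None, max_attempts


def make_flaky_session(closes_once):
    """Build a try_session callable that fails once for each monitor in closes_once."""
    pending = set(closes_once)

    def try_session(mon):
        if mon in pending:
            # Models the monitor wrongly trimming the brand-new session once.
            pending.discard(mon)
            return False
        return True

    return try_session
```

For example, connect_with_retry(["mon.a", "mon.b"], make_flaky_session({"mon.a"})) loses its first session on mon.a and succeeds on the second attempt against mon.b, so the user only sees a short delay.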

Actions #5

Updated by Kefu Chai over 3 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #6

Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #47747: octopus: mon: set session_timeout when adding to session_map added
Actions #7

Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #47748: nautilus: mon: set session_timeout when adding to session_map added
Actions #8

Updated by Nathan Cutler over 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
