Bug #50035

cephfs-mirror: use sensible mount/shutdown timeouts

Added by Venky Shankar about 3 years ago. Updated almost 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Administration/Usability
Target version:
% Done:

0%

Source:
Community (dev)
Tags:
Backport:
pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The mirror daemon just relies on the defaults, which are pretty high:

    Option("client_mount_timeout", Option::TYPE_FLOAT, Option::LEVEL_ADVANCED)
    .set_default(300.0)
    .set_description("timeout for mounting CephFS (seconds)"),

    Option("client_shutdown_timeout", Option::TYPE_SECS, Option::LEVEL_ADVANCED)
    .set_flag(Option::FLAG_RUNTIME)
    .set_default(30)
    .set_min(0)
    .set_description("timeout for shutting down CephFS")
    .set_long_description("Timeout for shutting down CephFS via unmount or shutdown.")
    .add_tag("client")

Especially `client_mount_timeout`: when a (remote) file system is not reachable (for various reasons), this can stall the updater timer thread for 300 seconds! `cephfs-mirror` should define (in its config) and use sensible defaults. It's OK to fail mounting a remote file system, since the mirror daemon will retry connecting periodically.
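One way to avoid the 300-second stall would be to lower the client timeouts only for the mirror daemon's own config section. This is a sketch, not the final fix: the section name and the timeout values below are illustrative assumptions, not what the eventual pull request shipped.

```
[client.cephfs-mirror]
    # Hypothetical values: fail fast on an unreachable remote file system;
    # the mirror daemon retries the connection periodically anyway.
    client_mount_timeout = 10
    client_shutdown_timeout = 10
```

The catch, discussed in the comments below, is that this only covers clusters for which a ceph.conf is actually read.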


Related issues 2 (0 open, 2 closed)

Related to CephFS - Bug #50224: qa: test_mirroring_init_failure_with_recovery failure (Resolved, Venky Shankar)

Copied to CephFS - Backport #50871: pacific: cephfs-mirror: use sensible mount/shutdown timeouts (Resolved, Venky Shankar)
Actions #1

Updated by Patrick Donnelly about 3 years ago

  • Status changed from New to Triaged
  • Assignee set to Venky Shankar
Actions #2

Updated by Venky Shankar about 3 years ago

Mostly, setting this as a config option (in ceph.conf) would suffice. However, cephfs-mirror can connect to the remote cluster with monitor addresses and does not require the remote cluster's ceph config file on the primary cluster. How would we want to apply the remote cluster config in this case?

Actions #3

Updated by Patrick Donnelly about 3 years ago

Venky Shankar wrote:

Mostly, setting this as a config option (in ceph.conf) would suffice. However, cephfs-mirror can connect to the remote cluster with monitor addresses and does not require the remote cluster's ceph config file on the primary cluster. How would we want to apply the remote cluster config in this case?

cephfs-mirror can update the defaults in its cct->_conf. It would do this for both CephContexts it holds for each cluster, right?

Actions #4

Updated by Venky Shankar about 3 years ago

Patrick Donnelly wrote:

Venky Shankar wrote:

Mostly, setting this as a config option (in ceph.conf) would suffice. However, cephfs-mirror can connect to the remote cluster with monitor addresses and does not require the remote cluster's ceph config file on the primary cluster. How would we want to apply the remote cluster config in this case?

cephfs-mirror can update the defaults in its cct->_conf. It would do this for both CephContexts it holds for each cluster, right?

Right -- that's the reason to introduce a config option to override the override :/

Actions #5

Updated by Venky Shankar about 3 years ago

  • Status changed from Triaged to In Progress
Actions #6

Updated by Venky Shankar about 3 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 40885
Actions #7

Updated by Venky Shankar almost 3 years ago

  • Related to Bug #50224: qa: test_mirroring_init_failure_with_recovery failure added
Actions #8

Updated by Patrick Donnelly almost 3 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #9

Updated by Backport Bot almost 3 years ago

  • Copied to Backport #50871: pacific: cephfs-mirror: use sensible mount/shutdown timeouts added
Actions #10

Updated by Loïc Dachary almost 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
