Feature #59714

open

mgr/volumes: Support to reject CephFS clones if cloner threads are not available

Added by Neeraj Pratap Singh 11 months ago. Updated about 1 month ago.

Status: Pending Backport
Priority: Normal
Category: Administration/Usability
Target version:
% Done: 0%
Source:
Tags: backport_processed
Backport: reef,quincy,squid
Reviewed:
Affected Versions:
Component(FS): mgr/volumes
Labels (FS):
Pull request ID:

Description

1. CephFS clone creation has a limit of 4 parallel clones at a time, and the rest of the clone create requests are queued. This makes CephFS cloning very slow when a large number of clones are being created.

2. CephCSI/Kubernetes storage does have a mechanism to delete in-progress clones, and deleting the corresponding Kubernetes object (PVC) may lead to stale resources.

For the above reasons, there are a lot of customer cases with stale CephFS clones.
For detailed discussion: https://bugzilla.redhat.com/show_bug.cgi?id=2196829


Related issues 4 (2 open, 2 closed)

Related to CephFS - Fix #62712: pybind/mgr/volumes: implement EAGAIN logic for clearing request queue when under load (New, Neeraj Pratap Singh)
Copied to CephFS - Backport #64517: quincy: mgr/volumes: Support to reject CephFS clones if cloner threads are not available (In Progress, Neeraj Pratap Singh)
Copied to CephFS - Backport #64518: reef: mgr/volumes: Support to reject CephFS clones if cloner threads are not available (Resolved, Neeraj Pratap Singh)
Copied to CephFS - Backport #64701: squid: mgr/volumes: Support to reject CephFS clones if cloner threads are not available (Resolved, Neeraj Pratap Singh)
Actions #1

Updated by Neeraj Pratap Singh 11 months ago

  • Assignee set to Neeraj Pratap Singh
Actions #2

Updated by Venky Shankar 11 months ago

  • Category set to Administration/Usability
  • Target version set to v19.0.0
  • Backport set to reef,quincy
Actions #3

Updated by Neeraj Pratap Singh 9 months ago

I am thinking of moving ahead with this approach: allow cloning only when (pending_clones + in-progress_clones) <= max_concurrent_clones; otherwise, return the error EAGAIN. @vshankar, @Kotresh Hiremath Ravishankar?

Actions #4

Updated by Kotresh Hiremath Ravishankar 9 months ago

Neeraj Pratap Singh wrote:

I am thinking of moving ahead with this approach: allow cloning only when (pending_clones + in-progress_clones) <= max_concurrent_clones; otherwise, return the error EAGAIN. @vshankar, @Kotresh Hiremath Ravishankar?

The approach looks correct. I think the condition should be (pending_clones + in-progress_clones) < max_concurrent_clones.

Actions #5

Updated by Neeraj Pratap Singh 9 months ago

Kotresh Hiremath Ravishankar wrote:

Neeraj Pratap Singh wrote:

I am thinking of moving ahead with this approach: allow cloning only when (pending_clones + in-progress_clones) <= max_concurrent_clones; otherwise, return the error EAGAIN. @vshankar, @Kotresh Hiremath Ravishankar?

The approach looks correct. I think the condition should be (pending_clones + in-progress_clones) < max_concurrent_clones.

Right, it should be strictly less than. Thanks!
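
The agreed condition can be sketched as follows. This is a minimal illustration and not the code from the pull request: the function names, the return shape, and the default of 4 (taken from the limit mentioned in the description) are assumptions, and in_progress_clones stands in for "in-progress_clones" because Python identifiers cannot contain a hyphen.

```python
import errno


def can_accept_clone(pending_clones: int,
                     in_progress_clones: int,
                     max_concurrent_clones: int) -> bool:
    """Admit a new clone only while the outstanding count is strictly below the limit."""
    return pending_clones + in_progress_clones < max_concurrent_clones


def request_clone(pending_clones: int,
                  in_progress_clones: int,
                  max_concurrent_clones: int = 4):
    """Hypothetical entry point: returns (return_code, message)."""
    if not can_accept_clone(pending_clones, in_progress_clones,
                            max_concurrent_clones):
        # No cloner thread is free, so reject instead of queueing the request.
        return -errno.EAGAIN, "all cloner threads are busy; retry later"
    return 0, "clone accepted"
```

With a limit of 4, a request arriving while 3 clones are outstanding is accepted, and one arriving while 4 are outstanding gets EAGAIN. With `<=` instead of `<`, that second request would have been admitted as a fifth clone, which is why the strict comparison was chosen.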

Actions #6

Updated by Neeraj Pratap Singh 9 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 52670
Actions #7

Updated by Neeraj Pratap Singh 8 months ago

@Venky Shankar @Kotresh Hiremath Ravishankar Since I was on sick leave yesterday, I only saw the discussion on the PR today. Going by the final comment, https://github.com/ceph/ceph/pull/52670#issuecomment-1685809411 , we are moving ahead with the current approach of having a config option, and the expected change is that this feature will now be enabled by default. Am I right?

Actions #8

Updated by Venky Shankar 8 months ago

Neeraj Pratap Singh wrote:

@Venky Shankar @Kotresh Hiremath Ravishankar Since I was on sick leave yesterday, I only saw the discussion on the PR today. Going by the final comment, https://github.com/ceph/ceph/pull/52670#issuecomment-1685809411 , we are moving ahead with the current approach of having a config option, and the expected change is that this feature will now be enabled by default. Am I right?

Right.
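
A rough sketch of how a config option that is enabled by default could gate the rejection. The class name and the option name reject_when_busy are hypothetical placeholders; the thread does not name the actual option, and this is not the PR's implementation.

```python
import errno
from dataclasses import dataclass


@dataclass
class CloneAdmission:
    """Hypothetical holder for the clone admission settings."""
    max_concurrent_clones: int = 4
    # Hypothetical option name; enabled by default, as agreed in the thread.
    reject_when_busy: bool = True

    def admit(self, pending_clones: int, in_progress_clones: int) -> int:
        """Return 0 to accept the clone request, or -EAGAIN to reject it."""
        busy = (pending_clones + in_progress_clones
                >= self.max_concurrent_clones)
        if self.reject_when_busy and busy:
            return -errno.EAGAIN
        # Option disabled, or a cloner thread is free: accept (queue) the request.
        return 0
```

Disabling the option restores the older behavior in which excess requests are queued rather than rejected.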

Actions #9

Updated by Patrick Donnelly 7 months ago

  • Related to Fix #62712: pybind/mgr/volumes: implement EAGAIN logic for clearing request queue when under load added
Actions #10

Updated by Venky Shankar about 2 months ago

Backport note: additionally include the commits from https://github.com/ceph/ceph/pull/55660

Actions #11

Updated by Neeraj Pratap Singh about 2 months ago

  • Status changed from Fix Under Review to Pending Backport
Actions #12

Updated by Backport Bot about 2 months ago

  • Copied to Backport #64517: quincy: mgr/volumes: Support to reject CephFS clones if cloner threads are not available added
Actions #13

Updated by Backport Bot about 2 months ago

  • Copied to Backport #64518: reef: mgr/volumes: Support to reject CephFS clones if cloner threads are not available added
Actions #14

Updated by Backport Bot about 2 months ago

  • Tags set to backport_processed
Actions #15

Updated by Venky Shankar about 1 month ago

Backport note: requires additional commits from https://github.com/ceph/ceph/pull/55930

Actions #16

Updated by Venky Shankar about 1 month ago

  • Tags deleted (backport_processed)
  • Backport changed from reef,quincy to reef,quincy,squid
Actions #17

Updated by Backport Bot about 1 month ago

  • Copied to Backport #64701: squid: mgr/volumes: Support to reject CephFS clones if cloner threads are not available added
Actions #18

Updated by Backport Bot about 1 month ago

  • Tags set to backport_processed