Bug #13482

closed

OSDs can become too busy with replication OPs to handle client OPs

Added by Robert LeBlanc over 8 years ago. Updated about 7 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
Category:
OSD
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When a Ceph cluster is busy, one or more OSDs can become so busy handling replication OPs from other primaries that they cannot service OPs from clients. Some sort of feedback mechanism should probably be implemented so that other primary OSDs know to throttle I/O from their clients, allowing the replica OSDs to service client I/O as well.
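To make the idea concrete, here is a toy simulation of such a feedback mechanism. It is only a sketch under stated assumptions, not how Ceph schedules OPs: the `simulate` function, the single FIFO queue, the per-tick rates, and the `high_water` threshold are all hypothetical. It models one replica OSD that can serve a fixed number of OPs per tick, and compares the worst client-OP wait with and without a signal that halves incoming replication OPs when the queue gets deep.

```python
from collections import deque


def simulate(ticks, capacity, repl_rate, client_rate, high_water=None):
    """Return the worst client-OP wait (in ticks) on one toy replica OSD.

    Hypothetical model: each tick, repl_rate replication OPs and
    client_rate client OPs arrive in one FIFO queue, and the OSD serves
    up to `capacity` OPs. If high_water is set, replication arrivals are
    halved whenever the queue is deeper than high_water, standing in for
    a feedback signal sent to the other primaries.
    """
    queue = deque()  # entries are (kind, arrival_tick)
    worst_client_wait = 0
    for tick in range(ticks):
        throttled = high_water is not None and len(queue) > high_water
        incoming_repl = repl_rate // 2 if throttled else repl_rate
        queue.extend(("repl", tick) for _ in range(incoming_repl))
        queue.extend(("client", tick) for _ in range(client_rate))
        # Serve up to `capacity` OPs in arrival order.
        for _ in range(min(capacity, len(queue))):
            kind, arrived = queue.popleft()
            if kind == "client":
                worst_client_wait = max(worst_client_wait, tick - arrived)
    return worst_client_wait


if __name__ == "__main__":
    # Offered load (12 + 2 OPs per tick) exceeds capacity (10 per tick).
    print("no feedback:  ", simulate(100, 10, 12, 2))
    print("with feedback:", simulate(100, 10, 12, 2, high_water=20))
```

In this model the throttle trades replication throughput (and therefore the other primaries' client write throughput) for a bounded wait on the replica's own client OPs, which is the trade-off the paragraph above describes.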

This issue can be reproduced by generating a large amount of client traffic with large queue depths in a cluster; slow request messages reporting delays of 30+ seconds will then show up for only a few I/Os. A lower number of PGs can also help expose this issue.

Actions #1

Updated by Sage Weil about 7 years ago

  • Status changed from New to Rejected

This is an area we need to improve, but it's an ongoing process. Closing out the bug.
