Bug #22050

ERROR type entries of pglog do not update min_last_complete_ondisk, potentially ballooning memory usage

Added by mingxin liu almost 2 years ago. Updated over 1 year ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
Performance/Resource Usage
Target version:
-
Start date:
11/06/2017
Due date:
% Done:

0%

Source:
Tags:
Backport:
luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:

Description

We use the rbd discard API to zero the whole range of a very large volume. Many extents of this volume had not yet been written before the discard, so they map to nonexistent objects, and when the OSD executes the DELETE op issued by rbd_discard for those objects it gets -ENOENT. For the sake of dup op detection, the OSD records an ERROR log entry through a separate I/O path (see PrimaryLogPG::submit_log_entries), which does not update last_complete_ondisk. In the normal pglog update path, each replica updates its last_complete_ondisk (ReplicatedBackend::sub_op_commit) and informs the primary (ReplicatedBackend::sub_op_modify_reply); the primary then uses min_last_complete_ondisk as the lower bound for trimming the pglog, which keeps the number of entries in check. So if a PG continuously receives this kind of DELETE op with no successful write in between, the primary has no chance to update min_last_complete_ondisk and trim the pglog, and these ERROR entries keep accumulating.
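The accumulation described above can be illustrated with a toy model. This is only a sketch and not Ceph code: the class ToyPGLog, the keep parameter, and the error_updates_lcod flag are all hypothetical names made up for illustration, and the trim rule is a deliberate simplification (trim up to min_last_complete_ondisk while keeping the newest few entries). The flag simply shows what changes if ERROR entries also advance the bound; it is not a claim about how the eventual fix is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPGLog:
    """Toy model of a primary's pg log; NOT the real Ceph data structure."""
    keep: int = 5                      # hypothetical number of entries to retain
    error_updates_lcod: bool = False   # False models the behaviour reported here
    head: int = 0                      # version of the newest log entry
    min_last_complete_ondisk: int = 0  # lower bound used for trimming
    entries: list = field(default_factory=list)

    def _append(self, kind):
        self.head += 1
        self.entries.append((self.head, kind))

    def _trim(self):
        # Never trim past min_last_complete_ondisk, and keep the newest `keep`.
        trim_to = min(self.min_last_complete_ondisk, self.head - self.keep)
        self.entries = [e for e in self.entries if e[0] > trim_to]

    def write(self):
        """Successful write: assume every replica commits and reports back."""
        self._append("MODIFY")
        self.min_last_complete_ondisk = self.head
        self._trim()

    def failed_delete(self):
        """DELETE on a nonexistent object: logs an ERROR entry for dup detection."""
        self._append("ERROR")
        if self.error_updates_lcod:
            self.min_last_complete_ondisk = self.head
        self._trim()


def log_length_after(n_failed_deletes, error_updates_lcod):
    log = ToyPGLog(error_updates_lcod=error_updates_lcod)
    for _ in range(n_failed_deletes):
        log.failed_delete()
    return len(log.entries)


if __name__ == "__main__":
    print(log_length_after(1000, error_updates_lcod=False))
    print(log_length_after(1000, error_updates_lcod=True))
```

In this model, with error_updates_lcod=False the log grows by one entry per failed DELETE until a successful write arrives; with it set to True the log stays at keep entries.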


Related issues

Related to rgw - Bug #22963: radosgw-admin usage show loops indefinitly - again Resolved 02/09/2018
Copied to RADOS - Backport #23323: luminous: ERROR type entries of pglog do not update min_last_complete_ondisk, potentially ballooning memory usage Resolved

History

#1 Updated by Greg Farnum almost 2 years ago

  • Project changed from Ceph to RADOS
  • Subject changed from ERROR type entries of pglog cannot be trimmed timely caused a large memory usage to ERROR type entries of pglog do not update min_last_complete_ondisk, potentially ballooning memory usage
  • Category changed from OSD to Performance/Resource Usage
  • Component(RADOS) OSD added

This one's tricky; I'm not sure we want to trim based on error entries in the general case. If a broken client submits an error op constantly, it could go through much more quickly than real ops and trimming based on that might cause issues if an OSD is rebooting at the same time...

#2 Updated by Greg Farnum almost 2 years ago

  • Status changed from New to Triaged

#3 Updated by Greg Farnum almost 2 years ago

Josh thinks we still want to trim since it's a write to disk.

#4 Updated by Josh Durgin over 1 year ago

  • Priority changed from Normal to Urgent

This seems to be biting rgw's usage pools when rgw-admin usage trim occurs in pgs with little other activity.

#5 Updated by Orit Wasserman over 1 year ago

  • Related to Bug #22963: radosgw-admin usage show loops indefinitly - again added

#7 Updated by Josh Durgin over 1 year ago

  • Status changed from Triaged to Need Review
  • Backport set to luminous

https://github.com/ceph/ceph/pull/20827

Backport only needed to luminous since error pg log entries did not exist before that.

#8 Updated by Josh Durgin over 1 year ago

  • Copied to Backport #23323: luminous: ERROR type entries of pglog do not update min_last_complete_ondisk, potentially ballooning memory usage added

#9 Updated by Josh Durgin over 1 year ago

  • Status changed from Need Review to Pending Backport

#10 Updated by Josh Durgin over 1 year ago

  • Status changed from Pending Backport to Resolved
