Fix #6570

osd: do not keep full pg log entries in memory

Added by Greg Farnum almost 6 years ago. Updated 7 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 10/16/2013
Due date:
% Done: 0%
Source: Development
Tags:
Backport:
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:

Description

Right now, we keep the full pg log in memory. Each of these entries contains the name of the object involved, which means user naming patterns dramatically alter the OSD's memory usage. In #5700 we had reports of OSDs taking 1.8GB of RAM on startup with 600 PGs (which we haven't seen elsewhere), but that works out to ~1KB/log entry, which is definitely excessive.

To deal with this, we can stop storing the full pg_log_entry in memory and just keep a map or list of osd_reqid_t's. Individual log entries (or the whole thing) can be loaded on-demand when replayed ops come through or the PG has to peer.
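
As a rough illustration of the idea above, here is a minimal C++ sketch. It is not the OSD's actual code: `ReqId` and `LogEntry` are simplified, hypothetical stand-ins for osd_reqid_t and pg_log_entry, and `LogStore` and `SlimPGLog` are made-up names, with `LogStore` simulating the on-disk log with an in-memory table. The point is only that duplicate detection can be answered from a small index, while the full entry (including the object name) is read on demand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <tuple>
#include <unordered_map>

// Hypothetical stand-in for osd_reqid_t: identifies one client request.
struct ReqId {
  uint64_t client;  // client identifier
  uint64_t tid;     // per-client transaction id
  bool operator<(const ReqId& o) const {
    return std::tie(client, tid) < std::tie(o.client, o.tid);
  }
};

// Hypothetical stand-in for a full pg log entry, including the object name.
struct LogEntry {
  ReqId reqid;
  uint64_t version;
  std::string object_name;
};

// Simulated on-disk log keyed by version. A real OSD would read from its
// object store; this class only counts reads to make the cost visible.
class LogStore {
 public:
  void write(const LogEntry& e) { entries_[e.version] = e; }

  std::optional<LogEntry> read(uint64_t version) {
    ++reads_;
    auto it = entries_.find(version);
    if (it == entries_.end()) return std::nullopt;
    return it->second;
  }

  uint64_t reads() const { return reads_; }

 private:
  std::unordered_map<uint64_t, LogEntry> entries_;
  uint64_t reads_ = 0;
};

// Keeps only reqid -> version in memory; full entries stay in the store.
class SlimPGLog {
 public:
  explicit SlimPGLog(LogStore& store) : store_(store) {}

  // Persist the full entry, remember only its request id and version.
  void append(const LogEntry& e) {
    store_.write(e);
    index_[e.reqid] = e.version;
  }

  // Duplicate detection for replayed ops needs only the in-memory index.
  bool seen(const ReqId& r) const { return index_.count(r) != 0; }

  // Load the full entry on demand; this is the path that becomes slower.
  std::optional<LogEntry> load(const ReqId& r) {
    auto it = index_.find(r);
    if (it == index_.end()) return std::nullopt;
    std::optional<LogEntry> e = store_.read(it->second);
    assert(!e || e->version == it->second);  // store and index must agree
    return e;
  }

  std::size_t indexed() const { return index_.size(); }

 private:
  LogStore& store_;
  std::map<ReqId, uint64_t> index_;
};
```

In this sketch the per-entry memory cost is two integers plus map overhead, regardless of how long the object names are. The price is that `load()` goes to the store every time, which is where the extra peering cost would show up.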

This will reduce memory consumption and make it more predictable based on PG counts and total log sizes, at the cost of any necessary log accesses becoming more expensive (most especially during peering). Changes will need to be tested for serious performance costs before merging.

History

#1 Updated by Andrey Korolyov almost 6 years ago

Is there any danger of increasing peering time? It would be awesome to make this feature configurable, since some people want to keep the OSD's RSS from growing during rebalance and others just want to reduce peering impact by any means.

#2 Updated by Corin Langosch almost 6 years ago

Thank you for taking care of this. This is really a huge problem for us.

I don't quite understand your statement "user naming patterns dramatically alter the OSD's memory usage". We only have three pools, their names are really normal like "kvm-images". All pools only contain rbd images, which are named using a uuid (36 characters), which also seems quite normal.

I have no idea how these logs are used, but why not keep the last version in memory and the rest on disk only?

#3 Updated by Patrick Donnelly 7 months ago

  • Project changed from Ceph to RADOS
  • Category deleted (OSD)
  • Component(RADOS) OSD added
