Bug #37388

rgw: memory leak with multisite sync

Added by Dieter Roels over 5 years ago. Updated over 5 years ago.

Status:
In Progress
Priority:
Normal
Assignee:
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Ever since we started to use ceph with multisite sync (around the 12.2.2 time, I think) we noticed that the rgw memory footprint keeps slowly growing. This resulted in an OOM kill every few days/weeks. We tested with more memory in the VMs, but even with 32GB the OOMs occurred, so we settled for VMs with 8GB memory and had OOMs about once a week. Clients did not really notice because of the load balancers.

Recently we tested the mimic release and noticed the leak is substantially worse in mimic. Our current test environment is running 13.2.2 and has no objects, so there are no client connections other than the multisite sync. The rgws get OOM kills about once a day on 8GB VMs. We tested with 32GB VMs and they show the same memory growth, but they last for a few days before OOM.

So, my question is: how can we test this memory leak? I did run it with valgrind once, but it throws lots of errors and does not seem to be compatible with jemalloc. And it seems rgw does not keep memory statistics the way the other daemons do?
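One avenue that is sometimes suggested for this situation is valgrind's massif heap profiler instead of the default memcheck tool, since massif only tracks allocation sizes and tends to produce less noise; it may still misbehave with jemalloc, in which case a binary linked against a different allocator would be needed. A hypothetical invocation sketch (the instance name client.rgw.test1 and output file name are placeholders, not from this report):

```shell
# Stop the managed gateway so the ports are free, then run radosgw in
# the foreground under massif. Reproduce the memory growth, then stop it.
systemctl stop ceph-radosgw@rgw.test1

valgrind --tool=massif --massif-out-file=rgw.massif \
    radosgw -f --name client.rgw.test1

# Summarize the recorded heap snapshots to see which call stacks grow:
ms_print rgw.massif | less
```

This only profiles heap allocations valgrind can intercept, so results under jemalloc should be treated with caution.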

Probably useful info: we run civetweb with SSL on RHEL.


Related issues 1 (1 open, 0 closed)

Related to rgw - Bug #23375: Memory leak in RGW when libcurl is configured with --with-nss and performing https requests to keystone (In Progress, Mark Kogan, 03/15/2018)

