Bug #19198

closed

Bluestore doubles mem usage when caching object content

Added by Igor Fedotov about 7 years ago. Updated over 6 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Mohamad Gebai
Category:
Performance/Resource Usage
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
BlueStore
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When caching object content, BlueStore uses twice as much memory as it actually caches.

The root cause seems to be in the buffer::create_page_aligned implementation. It results in the call sequence
new raw_posix_aligned() -> mempool::buffer_data::alloc_char.allocate_aligned(len, align) -> posix_memalign((void**)(void*)&ptr, align, total);
which in fact performs 2 allocations:
1) one for the raw_posix_aligned object
2) one for the data itself.

It looks like this sequence allocates 2 * 4096 bytes instead of 4096 + small_delta. See the attached code snippet.
An additional complication is that the mempool accounting is unable to estimate this overhead, and hence BlueStore cache cleanup doesn't work properly.

The issue is reproducible under Ubuntu 16.04 for both jemalloc and tcmalloc builds.

Output for a 16 GB allocation:

Mem before: VmRSS: 45232 kB
Mem after: VmRSS: 33599524 kB
Mem actually used: 33554292 kB
Mem pool reports: 16777216 kB
Mem before2: VmRSS: 2161412 kB
Mem after2: VmRSS: 33632268 kB
Mem actually used: 32226156544 bytes


Files

membug.diff (2.6 KB) Igor Fedotov, 03/06/2017 05:04 PM
Actions #1

Updated by Sage Weil almost 7 years ago

  • Status changed from New to 12
  • Priority changed from Normal to Urgent
Actions #2

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category set to Performance/Resource Usage
  • Component(RADOS) BlueStore added
Actions #3

Updated by Mohamad Gebai almost 7 years ago

  • Assignee set to Mohamad Gebai
Actions #4

Updated by Greg Farnum almost 7 years ago

  • Priority changed from Urgent to Normal
Actions #5

Updated by Mohamad Gebai over 6 years ago

Update: the attached unit test does show that twice the memory is used due to page-alignment inefficiencies. However, unlike Ceph, unit tests do not use tcmalloc. When LD_PRELOAD'ing libtcmalloc, the unit test fails to reproduce the memory issue, which suggests that using tcmalloc avoids this problem. Unless there's a way to reproduce this with Ceph itself (not a unit test), it doesn't seem to be a valid bug.

Actions #6

Updated by Mohamad Gebai over 6 years ago

  • Status changed from 12 to Need More Info
Actions #7

Updated by Mohamad Gebai over 6 years ago

  • Status changed from Need More Info to Closed

I talked to Igor. It seems this really is a non-bug, as the UTs use the glibc allocator. A follow-up will be to use tcmalloc in unit tests.
