Bug #20385

jemalloc+Bluestore+BlueFS causes unexpected RSS Memory usage growth

Added by Igor Fedotov almost 7 years ago. Updated about 6 years ago.

Status: Won't Fix
Priority: Low
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When testing the standalone BlueStore FIO plugin, one can observe excessive RSS memory usage for the corresponding FIO process. This occurs only in write scenarios and only when BlueFS is in use. A regular OSD probably suffers from the same issue as well.
The root cause is a missing BlueFS::_flush_bdev_safely() call for one of the BlueFS files, which probably results in memory fragmentation and prevents unused pages from being returned to the OS. I suppose the file in question is the log_writer one. In fact, it is rather a BlueFS::flush_log() call that is absent from the code.

One can insert the corresponding function call (_flush_bdev_safely()) at the end of BlueFS::_flush_range() as a workaround and/or as proof of the root cause.
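
As a rough sketch of that workaround (illustration only, not a tested patch: the signatures are assumed from the contemporary BlueFS sources, and the elided body stands in for the existing write logic):

    // Sketch of the proposed workaround; BlueFS internals are elided and
    // the signatures are assumptions based on the contemporary BlueFS code.
    int BlueFS::_flush_range(FileWriter *h, uint64_t offset, uint64_t length)
    {
      // ... existing logic: allocate extents and queue the writes for
      // [offset, offset + length) against the underlying block device ...

      // Workaround / root-cause check: flush the device here so the buffers
      // backing this range are released promptly, instead of lingering long
      // enough to fragment the allocator's arenas.
      _flush_bdev_safely(h);

      return 0;
    }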


Files

aLL4.fio (456 Bytes) - fio job to precondition the store - Igor Fedotov, 06/30/2017 03:07 PM
ceph-5.conf (4.15 KB) - config file for overwrite scenario - Igor Fedotov, 06/30/2017 03:07 PM
ceph-bluestore-somnath.conf (3.43 KB) - config for precondition - Igor Fedotov, 06/30/2017 03:07 PM
w4-5.fio (391 Bytes) - fio job to execute overwrite benchmark - Igor Fedotov, 06/30/2017 03:07 PM
Actions #1

Updated by Sage Weil almost 7 years ago

  • Status changed from New to 12
  • Priority changed from Normal to Immediate
Actions #2

Updated by Sage Weil almost 7 years ago

Igor, as far as I can tell _flush_bdev_safely is already being called from _sync_and_flush_log. Can you explain the steps you're doing to reproduce this?

Thanks!

Actions #3

Updated by Igor Fedotov almost 7 years ago

That was a standalone BlueStore executed as a FIO plugin, with a random 4K overwrite scenario: 128 GB total, 32K objects. I'm not completely sure that the BlueFS file in question is log_writer; that was just an assumption.

Actions #4

Updated by Igor Fedotov almost 7 years ago

Attached are the FIO jobs and configs to precondition the store and execute the overwrite. Paths need to be updated!
To run:
fio aLL4.fio
fio w4-5.fio

Please note the FIO RSS memory for the second command. I saw >7 GB by the end of the process.
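
For reference, a hypothetical job file of the same general shape as the attachments (engine and option names follow the BlueStore objectstore plugin under src/test/fio; every path, size, and value below is a placeholder, not the content of aLL4.fio or w4-5.fio):

    # Hypothetical 4K random-overwrite job for the BlueStore fio plugin.
    # All paths and sizes are placeholders, not the attachment contents.
    [global]
    ioengine=external:/path/to/libfio_ceph_objectstore.so
    # ceph config that points the objectstore at its data directory
    conf=ceph-5.conf
    rw=randwrite
    bs=4k
    iodepth=16

    [overwrite]
    # roughly the 32K objects / 128 GB total described in note #3
    nrfiles=32768
    size=128g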

Actions #5

Updated by Sage Weil almost 7 years ago

  • Assignee deleted (Sage Weil)
Actions #6

Updated by Sage Weil almost 7 years ago

  • Priority changed from Immediate to High
Actions #7

Updated by Igor Fedotov over 6 years ago

When using jemalloc, one can still observe some excessive memory usage (approx. a 2x increase over the configured cache limit) for a random-read FIO/BlueStore scenario. BlueStore buffered read (and hence object content caching) is on.
Not reproducible with TCMalloc.
I suggest closing this, as jemalloc isn't considered production-ready for Ceph at the moment.
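
For context, a hypothetical ceph.conf fragment with the two settings that observation involves (option names as in the Luminous-era tree; the cache size value is a placeholder, not the one used in the test):

    # Illustrative only: buffered reads enabled, with an explicit cache limit.
    # The observed RSS was roughly twice whatever limit is configured here.
    [osd]
    bluestore_default_buffered_read = true
    bluestore_cache_size = 1073741824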

Actions #8

Updated by Sage Weil over 6 years ago

  • Project changed from RADOS to bluestore
Actions #9

Updated by Sage Weil over 6 years ago

  • Subject changed from Bluestore+BlueFS causes unexpected RSS Memory usage growth to jemalloc+Bluestore+BlueFS causes unexpected RSS Memory usage growth
  • Priority changed from High to Low
Actions #10

Updated by Sage Weil about 6 years ago

  • Status changed from 12 to Won't Fix

We don't care about jemalloc at this point.
