Bug #22061

Bluestore: OSD killed due to high RAM usage

Added by Marcin Śliwiński over 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We recently updated our cluster to 12.2.1. After that we moved the OSDs on one of the nodes to Bluestore, and since then we have been hitting OOMs on that server on a daily basis. We found #21417, so we disabled all scrubbing operations, but it didn't help. We also installed a build with the patch that was supposed to fix that bug applied; still no change.

A dump of the mempools from an OSD currently using over 3.3G of RAM is attached below.
Our ceph.conf is attached as well.
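
For anyone comparing dumps like this one, here is a minimal sketch of how the output of `ceph daemon osd.<id> dump_mempools` could be ranked by size to see which pool dominates. The helper names (`rank_mempools`, `format_report`) and the sample numbers are hypothetical and are not taken from the attached file. The sketch assumes the JSON is either a flat map of pool name to `items`/`bytes`, or nests the pools under `mempool` -> `by_pool`; check this against what your release actually prints.

```python
import json

# Illustrative numbers only (NOT from the attached mempools_41.txt).
SAMPLE_DUMP = json.loads("""
{
  "bluestore_cache_data":  {"items": 120,   "bytes": 50331648},
  "bluestore_cache_onode": {"items": 40000, "bytes": 26880000},
  "buffer_anon":           {"items": 90000, "bytes": 2147483648},
  "osd_pglog":             {"items": 30000, "bytes": 10485760},
  "total":                 {"items": 160120, "bytes": 2235181056}
}
""")


def rank_mempools(dump):
    """Return (pool_name, bytes) pairs sorted by bytes, largest first."""
    # Assumption: pools are either at the top level or under
    # "mempool" -> "by_pool", depending on the Ceph release.
    pools = dump.get("mempool", {}).get("by_pool", dump)
    rows = [
        (name, stats["bytes"])
        for name, stats in pools.items()
        if name != "total" and isinstance(stats, dict) and "bytes" in stats
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def format_report(dump, top=5):
    """Render the largest pools as aligned text lines, sizes in MiB."""
    lines = []
    for name, nbytes in rank_mempools(dump)[:top]:
        lines.append(f"{name:<24}{nbytes / 2**20:>10.1f} MiB")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(SAMPLE_DUMP))
```

To use it on a real OSD, save the `dump_mempools` output to a file, load it with `json.load`, and pass the resulting dict to `format_report`.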

mempools_41.txt - Mempools for one OSD (1.6 KB) Marcin Śliwiński, 11/07/2017 12:46 PM

ceph.conf - Our ceph.conf (988 Bytes) Marcin Śliwiński, 11/07/2017 12:48 PM

History

#1 Updated by Wido den Hollander over 6 years ago

There are fixes planned for 12.2.2 which address memory issues with BlueStore. I don't know the exact links to them right now, but you might be hitting those.

#2 Updated by red ref over 6 years ago

I can confirm this.

Problem details:
- write-only workload
- erasure coding

This really looks like http://tracker.ceph.com/issues/21417

Tests done (unsuccessful):
- reduced the cache size to ~128M
- updated to the latest master branch shaman build (from this morning)

I see OSDs with > 8 GB RSS, mostly in the 'buffer_anon' pool.

Is there a 'debug' way to identify the objects in this pool?

#3 Updated by Sage Weil over 6 years ago

  • Project changed from Ceph to bluestore

#4 Updated by Sage Weil about 6 years ago

  • Status changed from New to Resolved

Fixed in 12.2.2.
