Bug #48645

open

Ceph-OSD octopus memory leak

Added by David Marthy over 3 years ago. Updated almost 3 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi everyone,

Our team operates a Ceph cluster (ceph version 15.2.5 (2c93eff00150f0cc5f106a559557a58d3d7b6f1f) octopus (stable)) that was upgraded from Mimic and Nautilus with ceph-ansible.
We are running on CentOS 7 with kernel 3.10.0-957.1.3.el7.x86_64.

Our cluster has 5 nodes with 8-14 OSDs per node. All of these OSDs are SAMSUNG MZ7LM960 1TB SSDs.
Before the issue, the nodes had 64GB of memory each.

Issue:
For no apparent reason, our OSDs started to use much more memory than what we configured in osd_memory_target; they starved the OS and were OOM-killed. When we noticed the issue, 40 of our OSDs were already down.
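
For anyone looking into something similar, a minimal sketch of how to inspect where an OSD's memory goes (osd.42 is just an example ID; the commands are run on the host that owns the OSD):

ceph daemon osd.42 dump_mempools    # per-mempool usage (bluestore cache, pg log, osdmaps, ...)
ceph tell osd.42 heap stats         # tcmalloc view: bytes in use vs. freed but not yet returned to the OS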

Logs when it started to eat up all available memory:
2020-12-15T09:01:34.625+0100 7f1bcfbafc00 1 bluestore(/var/lib/ceph/osd/ceph-42) _upgrade_super from 4, latest 4
2020-12-15T09:01:34.625+0100 7f1bcfbafc00 1 bluestore(/var/lib/ceph/osd/ceph-42) _upgrade_super done
2020-12-15T09:01:34.643+0100 7f1bcfbafc00 0 _get_class not permitted to load queue
2020-12-15T09:01:34.646+0100 7f1bcfbafc00 0 _get_class not permitted to load kvs
2020-12-15T09:01:34.646+0100 7f1bcfbafc00 0 <cls> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/15.2.5/rpm/el7/BUILD/ceph-15.2.5/src/cls/hello/cls_hello.cc:312: loading cls_hello
2020-12-15T09:01:34.649+0100 7f1bcfbafc00 0 _get_class not permitted to load sdk
2020-12-15T09:01:34.649+0100 7f1bcfbafc00 0 _get_class not permitted to load lua
2020-12-15T09:01:34.650+0100 7f1bcfbafc00 0 <cls> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/15.2.5/rpm/el7/BUILD/ceph-15.2.5/src/cls/cephfs/cls_cephfs.cc:198: loading cephfs
2020-12-15T09:01:34.650+0100 7f1bcfbafc00 0 osd.42 40659 crush map has features 288514051259236352, adjusting msgr requires for clients
2020-12-15T09:01:34.650+0100 7f1bcfbafc00 0 osd.42 40659 crush map has features 288514051259236352 was 8705, adjusting msgr requires for mons
2020-12-15T09:01:34.650+0100 7f1bcfbafc00 0 osd.42 40659 crush map has features 3314933000852226048, adjusting msgr requires for osds
2020-12-15T09:01:34.650+0100 7f1bcfbafc00 1 osd.42 40659 check_osdmap_features require_osd_release unknown -> octopus
2020-12-15T09:01:35.386+0100 7f1bcfbafc00 0 osd.42 40659 load_pgs
2020-12-15T09:01:57.826+0100 7f1bcfbafc00 0 osd.42 40659 load_pgs opened 159 pgs
2020-12-15T09:01:57.827+0100 7f1bcfbafc00 -1 osd.42 40659 log_to_monitors {default=true}
2020-12-15T09:01:57.966+0100 7f1bcfbafc00 0 osd.42 40659 done with init, starting boot process
2020-12-15T09:01:57.966+0100 7f1bcfbafc00 1 osd.42 40659 start_boot
2020-12-15T09:01:57.967+0100 7f1ba899d700 1 osd.42 pg_epoch: 40536 pg[10.2( v 40535'20830147 (40534'20828105,40535'20830147] local-lis/les=40126/40127 n=3909 ec=157/157 lis/c=40126/40126 les/c/f=40127/40127/0 sis=40536) [42,23,46] r=0 lpr=40536 pi=[40126,40536)/1 crt=40535'20830147 lcod 0'0 mlcod 0'0 unknown mbc={}] start_peering_interval up [42,23,46] -> [42,23,46], acting [42,23,46] -> [42,23,46], acting_primary 42 -> 42, up_primary 42 -> 42, role 0 -> 0, features acting 4540138292836696063 upacting 4540138292836696063
2020-12-15T09:03:55.405+0100 7f1bc7259700 -1 received signal: Terminated from /usr/lib/systemd/systemd --switched-root --system --deserialize 22 (PID: 1) UID: 0
2020-12-15T09:03:55.405+0100 7f1bc7259700 -1 osd.42 40659 *** Got signal Terminated ***
2020-12-15T09:03:55.405+0100 7f1bc7259700 -1 osd.42 40659 *** Immediate shutdown (osd_fast_shutdown=true) ***

After a while we could gather some memory from our other servers.
One of our Ceph nodes got 512GB of memory, and then we saw more than 100GB of memory usage from our restarted, previously down OSDs.

Some of our OSDs could start with the memory newly allocated to the nodes, and "ceph tell osd.xy heap release" would decrease their memory usage to around 10G.
But some of them cannot start even with this much memory.
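
For reference, this is roughly how we run the heap release across the cluster (a sketch; heap release only helps when tcmalloc is still holding pages the OSD has already freed):

for id in $(ceph osd ls); do
    ceph tell osd.$id heap release    # ask tcmalloc to hand freed pages back to the OS
done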

After upgrading the memory in the first node, we started to upgrade the other four, but we could only upgrade them to 256GB each.

I've attached a photo from htop and a memory profile.
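
For context, this is roughly how such a profile can be produced with the built-in tcmalloc heap profiler (a sketch; osd.13 is the example ID, and the profile path under the default log directory is an assumption; yours may differ):

ceph tell osd.13 heap start_profiler    # start writing heap profiles
ceph tell osd.13 heap dump              # write a profile snapshot
ceph tell osd.13 heap stop_profiler     # stop profiling
pprof --text /usr/bin/ceph-osd /var/log/ceph/osd.13.profile.0099.heap > osd.13.profile.0099.txt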

Status:
Now we have a cluster in HEALTH_WARN, some of our OSDs are still down, and some PGs are in unknown state.
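
This is roughly how we keep checking the state (standard status commands, nothing specific to this bug):

ceph -s                         # overall health, number of down OSDs and unknown PGs
ceph health detail              # which OSDs and PGs are affected
ceph pg dump_stuck inactive     # list the PGs that are stuck inactive/unknown
ceph osd tree | grep down       # which OSDs are still down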

Also here is the ceph config dump:

[root@cephbd01 /]# ceph config dump
....
osd advanced bluefs_preextend_wal_files false
osd dev bluestore_cache_autotune false
osd advanced osd_backfill_scan_max 64
osd advanced osd_backfill_scan_min 8
osd advanced osd_max_pg_log_entries 3000
osd class:hdd basic osd_memory_target 2147483648
osd class:ssd basic osd_memory_target 2147483648
osd basic osd_memory_target 3221225472
osd advanced osd_memory_target_cgroup_limit_ratio 0.800000
osd advanced osd_min_pg_log_entries 3000
osd advanced osd_op_thread_suicide_timeout 900
osd.0 advanced osd_map_cache_size 20
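
Note that the three osd_memory_target entries overlap: as far as we understand, the class:hdd/class:ssd masks (2147483648, i.e. 2GiB) take precedence over the plain osd entry (3221225472, i.e. 3GiB) for OSDs of the matching device class. A quick way to check what a running daemon actually uses (osd.13 is just an example ID; a rough sketch):

ceph config get osd.13 osd_memory_target                # value the monitors would hand to osd.13
ceph daemon osd.13 config show | grep memory_target     # value the running daemon is using right now

Also, as far as we understand, with bluestore_cache_autotune=false the OSD no longer trims its caches toward osd_memory_target, so the target is mostly advisory in this configuration.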

We have tweaked some of these settings, for example (see the sketch after this list):
constantly turning bluestore_cache_autotune on or off
setting osd_recovery_max_active_ssd=1 and osd_recovery_sleep_ssd=1 when we had only 20 OSDs up, so as not to overload our nodes
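
Applied via the central config, those tweaks look roughly like this (a sketch with the values mentioned above):

ceph config set osd osd_recovery_max_active_ssd 1      # at most one active recovery op per SSD OSD
ceph config set osd osd_recovery_sleep_ssd 1           # sleep 1s between recovery ops on SSD OSDs
ceph config set osd bluestore_cache_autotune false     # the autotune toggle we kept flipping (true/false)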

We have also tried to run the OSDs in Docker with these docker run parameters: --memory 8G --oom-kill-disable, and also with --memory 50G; the process got stuck after reaching the memory limit.
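
Assuming osd_memory_target_cgroup_limit_ratio (0.8 in our config dump) is applied to the container's cgroup limit, the effective target inside the container works out to roughly:

--memory 8G   ->  8GB x 0.8 ≈ 6.4GB effective osd_memory_target
--memory 50G  -> 50GB x 0.8 = 40GB effective osd_memory_target

Even with those targets the resident size kept growing until the cgroup limit was hit.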

How could we solve this memory issue?

Thanks for your help!


Files

ceph-2020-1215-512G-eaten-up-by-osd.png (456 KB) ceph-2020-1215-512G-eaten-up-by-osd.png htop-with-512GB-memory-100GBplus-mem-usage-from-osds David Marthy, 12/17/2020 07:10 AM
osd.13.profile.0099.txt (62.2 KB) osd.13.profile.0099.txt David Marthy, 12/17/2020 07:12 AM
#1

Updated by Sage Weil almost 3 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (OSD)