Bug #51637

closed

mgr/insights: mgr consumes excessive amounts of memory

Added by Thore K almost 3 years ago. Updated almost 3 years ago.

Status: Duplicate
Priority: Normal
Assignee:
Category: insights module
Target version:
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Description of problem

After replacing a failed OSD and during the ongoing replication of the degraded PGs (roughly 4 TB, ongoing for about 2 days), there have been a couple of OOM kills, and the mgr continues to consume lots of memory (15 GB, increasing rather quickly).

Furthermore, the logfile is growing at a similar pace, since it keeps logging all placement groups along with an error that it can't dump the PG state into mgr/insights/health_history/2021-07-11_19:

mgr set_store mon returned -27: error: entry size limited to 65536 bytes. Use 'mon config key max entry size' to manually adjust

The payload it tries to dump is a 22k-line JSON document (when formatted), which contains a history of all placement groups since some point in time (I couldn't pin that down, but the same message is included multiple times; see the attached JSON).
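For reference, the accumulated insights history and the mon's config-key size limit can be inspected and adjusted with the commands below (a sketch assuming the standard Nautilus CLI; the 24-hour and 131072-byte values are only examples, and the key names follow the mgr/insights/health_history/... pattern from the error above):

# list the config-key entries written by the insights module
ceph config-key ls | grep insights

# prune accumulated health history older than 24 hours
ceph insights prune-health 24

# raise the mon config-key entry size limit (default 65536 bytes)
ceph config set mon mon_config_key_max_entry_size 131072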

Environment

  • ceph version string: ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351) nautilus (stable)
  • Platform (OS/distro/release): CentOS Linux release 8.4.2105
  • Cluster details (nodes, monitors, OSDs): 6 nodes, 5 mons, 42 osds

How reproducible

Seems to be ongoing. Restarting the mgr does not fix it (see the bumps in the graph; those were me restarting the mgr).
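A possible stopgap, not taken from this ticket and only suitable if the insights data is not needed, is to disable the module until the fix from the related bug is available; this stops both the health-history dumps and the associated log growth:

ceph mgr module disable insights
# re-enable once a fixed release is deployed
ceph mgr module enable insights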


Files

ceph-mgr-memgrowth.png (30.2 KB) ceph-mgr-memgrowth.png mgr memory growth Thore K, 07/12/2021 04:47 PM
ceph_mgrdump.json.gz (258 KB) ceph_mgrdump.json.gz JSON the mgr desperately tries to store in the mon Thore K, 07/12/2021 04:51 PM

Related issues 1 (0 open, 1 closed)

Is duplicate of mgr - Bug #48269: insights module can generate too much data, fail to put in config-key (Resolved, Brad Hubbard)

Actions #1

Updated by Ernesto Puerta almost 3 years ago

  • Subject changed from mgr/dashboard: mgr consumes excessive amounts of memory to mgr/insights: mgr consumes excessive amounts of memory
  • Category changed from ceph-mgr to insights module
Actions #2

Updated by Thore K almost 3 years ago

I've been able to reproduce this through the following operations:

# stop the failed OSD daemon
systemctl stop ceph-osd@10
# mark the OSD out so its data is redistributed
ceph osd out 10
# remove the OSD while keeping its ID for reuse
ceph osd destroy 10 --yes-i-really-mean-it
# wipe the old data device
ceph-volume lvm zap /dev/sda --destroy
# recreate the OSD with the same ID on the wiped device
ceph-volume lvm create --osd-id 10 --bluestore --crush-device-class=hdd --dmcrypt --data /dev/sda --block.db /dev/sdh4

And again, while the backfilling takes place the mgr memory consumption grows rapidly.
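For completeness, a simple way to watch the mgr's resident memory while the backfill is running (a generic sketch using standard Linux tooling rather than a Ceph-specific command):

watch -n 60 'ps -C ceph-mgr -o pid,rss,etime,cmd'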

Actions #3

Updated by Brad Hubbard almost 3 years ago

  • Assignee set to Brad Hubbard
Actions #4

Updated by Brad Hubbard almost 3 years ago

  • Is duplicate of Bug #48269: insights module can generate too much data, fail to put in config-key added
Actions #5

Updated by Konstantin Shalygin almost 3 years ago

  • Status changed from New to Fix Under Review
  • Affected Versions v14.2.22 added
  • Affected Versions deleted (v14.2.23)
Actions #6

Updated by Konstantin Shalygin almost 3 years ago

  • Status changed from Fix Under Review to Duplicate
