Bug #41200

osd: fix ceph_assert(mem_avail >= 0) caused by the unset cgroup memory limit

Added by mingshuai wang 8 days ago. Updated about 18 hours ago.

Status: New
Priority: High
Assignee: -
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Source: Community (user)
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:

Description

If the cgroup memory.limit_in_bytes is unset, its default value is:

cat /sys/fs/cgroup/memory/memory.limit_in_bytes
9223372036854775807

osd_memory_target is then set to a very large value (9223372036854775807 * osd_memory_target_cgroup_limit_ratio), which eventually triggers the assert below:

/root/rpmbuild/BUILD/ceph-15.0.0-3665-g2db4960/src/common/PriorityCache.cc: 299: FAILED ceph_assert(mem_avail >= 0)

 ceph version 15.0.0-3665-g2db4960 (2db496017ae711c1c4e474cf949482a7a0ad9034) octopus (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x7f4aae8bc565]
 2: (()+0x4f172d) [0x7f4aae8bc72d]
 3: (PriorityCache::Manager::balance()+0x457) [0x7f4aaf2e8597]
 4: (BlueStore::MempoolThread::entry()+0x511) [0x7f4aaedd91a1]
 5: (()+0x7e25) [0x7f4aac013e25]
 6: (clone()+0x6d) [0x7f4aaaee035d]

2019-08-12T10:11:34.073+0800 7f4a9ca05700 -1 *** Caught signal (Aborted) **
 in thread 7f4a9ca05700 thread_name:bstore_mempool

 ceph version 15.0.0-3665-g2db4960 (2db496017ae711c1c4e474cf949482a7a0ad9034) octopus (dev)
 1: (()+0xf5e0) [0x7f4aac01b5e0]
 2: (gsignal()+0x37) [0x7f4aaae1d1f7]
 3: (abort()+0x148) [0x7f4aaae1e8e8]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x7f4aae8bc5b4]
 5: (()+0x4f172d) [0x7f4aae8bc72d]
 6: (PriorityCache::Manager::balance()+0x457) [0x7f4aaf2e8597]
 7: (BlueStore::MempoolThread::entry()+0x511) [0x7f4aaedd91a1]
 8: (()+0x7e25) [0x7f4aac013e25]
 9: (clone()+0x6d) [0x7f4aaaee035d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
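
For illustration only, here is a minimal Python sketch of the kind of guard that would avoid the oversized target described above. It is not the Ceph code and not necessarily the eventual fix. The function names, the 4096-byte page size, and the rule "treat any value at or above the page-aligned maximum as unset" are assumptions made for this example; only the sentinel value 9223372036854775807 and the limit * ratio computation come from the report.

```python
from typing import Optional

# Value shown in this report when no cgroup v1 memory limit is set (2**63 - 1).
CGROUP_UNLIMITED = 9223372036854775807

# Assumed page size, used only to tolerate a page-aligned "unlimited" value.
ASSUMED_PAGE_SIZE = 4096


def parse_cgroup_limit(text: str) -> Optional[int]:
    """Return the limit in bytes, or None if the cgroup reports no real limit.

    Hypothetical helper: `text` is the content of memory.limit_in_bytes.
    """
    value = int(text.strip())
    unlimited_floor = (CGROUP_UNLIMITED // ASSUMED_PAGE_SIZE) * ASSUMED_PAGE_SIZE
    if value >= unlimited_floor:
        return None
    return value


def effective_memory_target(configured_target: int, limit_text: str, ratio: float) -> int:
    """Derive a memory target from the cgroup limit only when a limit is set.

    Hypothetical helper: falls back to `configured_target` when the limit is
    unset or the ratio is not positive, instead of computing limit * ratio.
    """
    limit = parse_cgroup_limit(limit_text)
    if limit is None or ratio <= 0:
        return configured_target
    return int(limit * ratio)


if __name__ == "__main__":
    # The 4 GiB target and 0.8 ratio are example inputs, not values from this report.
    print(effective_memory_target(4 * 2**30, "9223372036854775807\n", 0.8))
    print(effective_memory_target(4 * 2**30, str(8 * 2**30), 0.8))
```

With the unset value from this report, the sketch returns the configured target unchanged rather than a number on the order of 10^18 bytes; with a real limit it returns limit * ratio as described above.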

History

#1 Updated by Patrick Donnelly about 18 hours ago

  • Project changed from Ceph to RADOS
  • Category deleted (OSD)
  • Priority changed from Normal to High
  • Start date deleted (08/12/2019)
  • Source set to Community (user)
  • Backport set to nautilus
  • Affected Versions v14.2.2 added
  • Component(RADOS) OSD added

#2 Updated by Patrick Donnelly about 18 hours ago

  • Tracker changed from Fix to Bug
  • Regression set to No
  • Severity set to 3 - minor
