Bug #41200
closed
osd: fix ceph_assert(mem_avail >= 0) caused by the unset cgroup memory limit
% Done:
0%
Source:
Community (user)
Tags:
Backport:
nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
If the cgroup memory.limit_in_bytes is unset, it reports its default value:
cat /sys/fs/cgroup/memory/memory.limit_in_bytes
9223372036854775807
osd_memory_target is then set to a very large value (9223372036854775807 * osd_memory_target_cgroup_limit_ratio), which eventually triggers the assert below:
/root/rpmbuild/BUILD/ceph-15.0.0-3665-g2db4960/src/common/PriorityCache.cc: 299: FAILED ceph_assert(mem_avail >= 0)
 ceph version 15.0.0-3665-g2db4960 (2db496017ae711c1c4e474cf949482a7a0ad9034) octopus (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x7f4aae8bc565]
 2: (()+0x4f172d) [0x7f4aae8bc72d]
 3: (PriorityCache::Manager::balance()+0x457) [0x7f4aaf2e8597]
 4: (BlueStore::MempoolThread::entry()+0x511) [0x7f4aaedd91a1]
 5: (()+0x7e25) [0x7f4aac013e25]
 6: (clone()+0x6d) [0x7f4aaaee035d]
2019-08-12T10:11:34.073+0800 7f4a9ca05700 -1 *** Caught signal (Aborted) **
 in thread 7f4a9ca05700 thread_name:bstore_mempool
 ceph version 15.0.0-3665-g2db4960 (2db496017ae711c1c4e474cf949482a7a0ad9034) octopus (dev)
 1: (()+0xf5e0) [0x7f4aac01b5e0]
 2: (gsignal()+0x37) [0x7f4aaae1d1f7]
 3: (abort()+0x148) [0x7f4aaae1e8e8]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x7f4aae8bc5b4]
 5: (()+0x4f172d) [0x7f4aae8bc72d]
 6: (PriorityCache::Manager::balance()+0x457) [0x7f4aaf2e8597]
 7: (BlueStore::MempoolThread::entry()+0x511) [0x7f4aaedd91a1]
 8: (()+0x7e25) [0x7f4aac013e25]
 9: (clone()+0x6d) [0x7f4aaaee035d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
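One way to avoid deriving a memory target from an unset limit is sketched below. This is only an illustration and not the actual patch: the helper name target_from_cgroup_limit, its parameters, and the rule "a limit of zero, or one at or above the host's physical memory, counts as unset" are all assumptions made for the example.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not Ceph code): derive a memory target from a cgroup
// limit, or return std::nullopt when the limit should be ignored.
//
// Assumption: a limit of 0, or one at or above the host's physical memory,
// means "no limit configured". The unset value shown in this report,
// 9223372036854775807, falls into that second case on any real host.
std::optional<uint64_t> target_from_cgroup_limit(uint64_t limit_bytes,
                                                 double ratio,
                                                 uint64_t physical_mem_bytes) {
  // Reject ratios outside (0, 1]; the caller keeps its configured target.
  if (ratio <= 0.0 || ratio > 1.0) {
    return std::nullopt;
  }
  // Treat an absent or implausibly large limit as unset.
  if (limit_bytes == 0 || limit_bytes >= physical_mem_bytes) {
    return std::nullopt;
  }
  // Only a plausible limit is scaled by the ratio.
  return static_cast<uint64_t>(limit_bytes * ratio);
}
```

With a 64 GiB host, a 4 GiB limit and a ratio of 0.5 yield a 2 GiB target, while the unset value 9223372036854775807 yields no target, so the caller would keep the configured osd_memory_target.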