Bug #41200
osd: fix ceph_assert(mem_avail >= 0) caused by the unset cgroup memory limit
% Done: 0%
Source: Community (user)
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
If the cgroup memory.limit_in_bytes is unset, its default value is the maximum signed 64-bit integer:
cat /sys/fs/cgroup/memory/memory.limit_in_bytes
9223372036854775807
osd_memory_target is then set to a very large value (9223372036854775807 * osd_memory_target_cgroup_limit_ratio), which eventually triggers the assert below:
/root/rpmbuild/BUILD/ceph-15.0.0-3665-g2db4960/src/common/PriorityCache.cc: 299: FAILED ceph_assert(mem_avail >= 0)
ceph version 15.0.0-3665-g2db4960 (2db496017ae711c1c4e474cf949482a7a0ad9034) octopus (dev)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x7f4aae8bc565]
2: (()+0x4f172d) [0x7f4aae8bc72d]
3: (PriorityCache::Manager::balance()+0x457) [0x7f4aaf2e8597]
4: (BlueStore::MempoolThread::entry()+0x511) [0x7f4aaedd91a1]
5: (()+0x7e25) [0x7f4aac013e25]
6: (clone()+0x6d) [0x7f4aaaee035d]
2019-08-12T10:11:34.073+0800 7f4a9ca05700 -1 *** Caught signal (Aborted) **
in thread 7f4a9ca05700 thread_name:bstore_mempool
ceph version 15.0.0-3665-g2db4960 (2db496017ae711c1c4e474cf949482a7a0ad9034) octopus (dev)
1: (()+0xf5e0) [0x7f4aac01b5e0]
2: (gsignal()+0x37) [0x7f4aaae1d1f7]
3: (abort()+0x148) [0x7f4aaae1e8e8]
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x7f4aae8bc5b4]
5: (()+0x4f172d) [0x7f4aae8bc72d]
6: (PriorityCache::Manager::balance()+0x457) [0x7f4aaf2e8597]
7: (BlueStore::MempoolThread::entry()+0x511) [0x7f4aaedd91a1]
8: (()+0x7e25) [0x7f4aac013e25]
9: (clone()+0x6d) [0x7f4aaaee035d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
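One way to avoid the failure described above is to treat an implausibly large cgroup limit as "no limit" before deriving a memory target from it. The following is only a minimal sketch of that idea, not the Ceph code and not necessarily what the actual fix does. The function name, the threshold constant, and the default target passed in by the caller are all hypothetical and chosen for illustration.

```python
# Hypothetical sketch, NOT Ceph code: guard against an unset cgroup limit.

# Assumption: any limit at or above 2**62 bytes is treated as "unset".
# This covers 9223372036854775807 (2**63 - 1) as reported in this issue, and
# also page-rounded variants of that value that some kernels may report.
CGROUP_UNSET_THRESHOLD = 1 << 62


def effective_memory_target(cgroup_limit_bytes, ratio, default_target_bytes):
    """Return a memory target in bytes.

    cgroup_limit_bytes   -- value read from memory.limit_in_bytes
    ratio                -- fraction of the limit to use (the role played by
                            osd_memory_target_cgroup_limit_ratio)
    default_target_bytes -- fallback used when the limit looks unset
    """
    # A non-positive or absurdly large limit carries no real information,
    # so fall back to the caller-supplied default.
    if cgroup_limit_bytes <= 0 or cgroup_limit_bytes >= CGROUP_UNSET_THRESHOLD:
        return default_target_bytes
    # Otherwise scale the real limit by the configured ratio.
    return int(cgroup_limit_bytes * ratio)
```

With this guard, an unset limit of 9223372036854775807 yields the fallback default, while a real limit such as 8 GiB with a ratio of 0.5 yields 4 GiB.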
Related issues
History
#1 Updated by Patrick Donnelly over 4 years ago
- Project changed from Ceph to RADOS
- Category deleted (OSD)
- Priority changed from Normal to High
- Start date deleted (08/12/2019)
- Source set to Community (user)
- Backport set to nautilus
- Affected Versions v14.2.2 added
- Component(RADOS) OSD added
#2 Updated by Patrick Donnelly over 4 years ago
- Tracker changed from Fix to Bug
- Regression set to No
- Severity set to 3 - minor
#3 Updated by Josh Durgin over 4 years ago
- Status changed from New to Pending Backport
- Pull request ID changed from 29599 to 29581
#4 Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41455: nautilus: osd: fix ceph_assert(mem_avail >= 0) caused by the unset cgroup memory limit added
#5 Updated by Nathan Cutler over 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved".
#6 Updated by Nathan Cutler over 4 years ago
- Duplicated by Bug #41215: os/bluestore: do not set osd_memory_target default from cgroup limit added