Bug #47682

closed

MDS can't release caps faster than clients taking caps

Added by Dan van der Ster over 3 years ago. Updated over 3 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (dev)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We have a workload in which a kernel client is stat'ing all files in an FS. This workload triggered a few issues:

1. The MDS memory usage grew well above the 4GB mds_cache_memory_limit, to 16GB. At that point one client held 1M caps, and the other clients held several hundred thousand each:

2020-09-29 09:00:22.993 7f26c8054700  2 mds.0.cache Memory usage:  total 17912352, rss 17019028, heap 332036, baseline 332036, 3185001 / 3273427 inodes have caps, 3185107 caps, 0.973019 caps per inode

2. We increased the mds_cache_memory_limit to 16GB to silence the "MDS cache oversized" warning.
3. The clients did `echo 2 > /proc/sys/vm/drop_caches`, after which the number of caps dropped back to a reasonably low value:

2020-09-29 09:44:33.556 7f26c8054700  2 mds.0.cache Memory usage:  total 19835760, rss 18776188, heap 332036, baseline 332036, 59011 / 4527518 inodes have caps, 59022 caps, 0.0130363 caps per inode

4. We decreased the mds_cache_memory_limit back to 8GB, expecting the memory usage to decrease. Cached inodes decreased, but RSS did not:

2020-09-29 09:46:23.953 7f26c8054700  2 mds.0.cache Memory usage:  total 19835760, rss 17850188, heap 332036, baseline 332036, 48453 / 2150423 inodes have caps, 48464 caps, 0.022537 caps per inode

5. We found that the memory that was not returned is stuck in the tcmalloc central cache freelist (about 8.7 GiB of the 17.4 GiB in use):

mds.cephflash20-01cbf24286 tcmalloc heap stats:------------------------------------------------
MALLOC:     8937379976 ( 8523.3 MiB) Bytes in use by application
MALLOC: +        32768 (    0.0 MiB) Bytes in page heap freelist
MALLOC: +   9169491864 ( 8744.7 MiB) Bytes in central cache freelist
MALLOC: +       146176 (    0.1 MiB) Bytes in transfer cache freelist
MALLOC: +     33780960 (   32.2 MiB) Bytes in thread cache freelists
MALLOC: +    117702656 (  112.2 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =  18258534400 (17412.7 MiB) Actual memory used (physical + swap)
MALLOC: +   1563828224 ( 1491.4 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =  19822362624 (18904.1 MiB) Virtual address space used
MALLOC:
MALLOC:        1864499              Spans in use
MALLOC:             22              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.

`heap release` does not free up that memory.
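
To put a number on how much of the footprint sits in the central cache freelist, the heap stats above can be parsed with a short script. This is only a sketch: the regular expression and the helper name `parse_heap_stats` are my own and assume the `MALLOC:` line layout exactly as pasted above, not any official output format guarantee.

```python
import re

# Byte counters copied from the heap stats pasted in this report.
HEAP_STATS = """\
MALLOC:     8937379976 ( 8523.3 MiB) Bytes in use by application
MALLOC: +        32768 (    0.0 MiB) Bytes in page heap freelist
MALLOC: +   9169491864 ( 8744.7 MiB) Bytes in central cache freelist
MALLOC: +       146176 (    0.1 MiB) Bytes in transfer cache freelist
MALLOC: +     33780960 (   32.2 MiB) Bytes in thread cache freelists
MALLOC: +    117702656 (  112.2 MiB) Bytes in malloc metadata
MALLOC: =  18258534400 (17412.7 MiB) Actual memory used (physical + swap)
"""

# Assumed layout: "MALLOC: [+|=] <bytes> ( <MiB> MiB) <label>"
LINE_RE = re.compile(r"^MALLOC:\s*([+=]?)\s*(\d+)\s+\(\s*[\d.]+ MiB\)\s+(.+)$")

TOTAL_KEY = "Actual memory used (physical + swap)"


def parse_heap_stats(text):
    """Return {label: bytes} for every byte-count line in the stats text."""
    stats = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stats[match.group(3).strip()] = int(match.group(2))
    return stats


stats = parse_heap_stats(HEAP_STATS)
used = stats[TOTAL_KEY]
central = stats["Bytes in central cache freelist"]
central_share = central / used

# The components listed before the "=" line should add up to the total.
component_sum = sum(v for k, v in stats.items() if k != TOTAL_KEY)

print(f"central cache freelist: {central / 2**20:.1f} MiB "
      f"({central_share:.1%} of actual memory used)")
```

With these numbers, roughly half of the memory counted as used is free from the application's point of view but still held by the allocator, which lines up with the application usage being close to the 8GB limit.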

How do we release the memory in the `central cache freelist`? More generally, is there something we can do to avoid this?
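
For anyone wanting to compare the three `Memory usage` log lines from steps 1, 3 and 4 side by side, here is a minimal parsing sketch. The regular expression and the helper name `parse_usage` are hypothetical and match only the line format quoted in this report. I am also assuming that `total` and `rss` are reported in KiB, which is consistent with the roughly 16GB figure mentioned in step 1 but is not stated in the log itself.

```python
import re

# The three log lines quoted in the description, in chronological order.
LOG_LINES = [
    "2020-09-29 09:00:22.993 7f26c8054700  2 mds.0.cache Memory usage:  total 17912352, rss 17019028, heap 332036, baseline 332036, 3185001 / 3273427 inodes have caps, 3185107 caps, 0.973019 caps per inode",
    "2020-09-29 09:44:33.556 7f26c8054700  2 mds.0.cache Memory usage:  total 19835760, rss 18776188, heap 332036, baseline 332036, 59011 / 4527518 inodes have caps, 59022 caps, 0.0130363 caps per inode",
    "2020-09-29 09:46:23.953 7f26c8054700  2 mds.0.cache Memory usage:  total 19835760, rss 17850188, heap 332036, baseline 332036, 48453 / 2150423 inodes have caps, 48464 caps, 0.022537 caps per inode",
]

USAGE_RE = re.compile(
    r"Memory usage:\s+total (?P<total>\d+), rss (?P<rss>\d+), "
    r"heap (?P<heap>\d+), baseline (?P<baseline>\d+), "
    r"(?P<with_caps>\d+) / (?P<inodes>\d+) inodes have caps, "
    r"(?P<caps>\d+) caps"
)


def parse_usage(line):
    """Extract the integer counters from one 'Memory usage' log line."""
    match = USAGE_RE.search(line)
    if match is None:
        raise ValueError("not a Memory usage line: " + line)
    fields = {name: int(value) for name, value in match.groupdict().items()}
    fields["time"] = line.split()[1]
    # Assumption: rss is in KiB, so dividing by 2**20 gives GiB.
    fields["rss_gib"] = fields["rss"] / 2**20
    # Recompute the ratio the MDS prints at the end of the line.
    fields["caps_per_inode"] = fields["caps"] / fields["inodes"]
    return fields


snapshots = [parse_usage(line) for line in LOG_LINES]

for snap in snapshots:
    print(f"{snap['time']}  rss={snap['rss_gib']:.2f} GiB  "
          f"inodes={snap['inodes']}  caps={snap['caps']}  "
          f"caps/inode={snap['caps_per_inode']:.6f}")

# Relative RSS change between the last two snapshots (after lowering the limit).
rss_drop = (snapshots[1]["rss"] - snapshots[2]["rss"]) / snapshots[1]["rss"]
```

The point of the comparison is the last two snapshots: the inode count roughly halves after the limit is lowered, while RSS changes by only a few percent.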


Related issues 1 (0 open, 1 closed)

Is duplicate of CephFS - Bug #47307: mds: throttle workloads which acquire caps faster than the client can release - Resolved - Kotresh Hiremath Ravishankar
