Bug #41204 (open): CephFS pool usage 3x above expected value and sparse journal dumps

Added by Janek Bevendorff over 4 years ago. Updated over 4 years ago.

Status: New
Priority: Normal
Assignee: -
Category: Performance/Resource Usage
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I am in the process of copying about 230 million small and medium-sized files to a CephFS, and I have three active MDSs to keep up with the constant create workload induced by the copy process. Previously, I was struggling heavily with runaway MDS cache growth, which was fixable by increasing the cache trim size (see issues #41140 and #41141).
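
For reference, the relevant setup looks roughly like the following sketch. The exact trim value is illustrative rather than the one actually in use, and the filesystem name "storage" is only inferred from the pool names shown below:

    # Illustrative sketch; values and filesystem name are assumptions, not the exact configuration.
    ceph fs set storage max_mds 3                         # three active MDS ranks
    ceph config set mds mds_cache_trim_threshold 262144   # raise the cache trim size (see #41140, #41141)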

Unfortunately, another problem has emerged after a few days of copying data. ceph df detail reports:

    POOL                             ID      STORED      OBJECTS     USED        %USED     MAX AVAIL     QUOTA OBJECTS     QUOTA BYTES     DIRTY       USED COMPR     UNDER COMPR
    <...>
    cephfs.storage.data              108      44 TiB     176.13M     149 TiB      1.63       5.9 PiB     N/A               N/A             176.13M            0 B             0 B 
    cephfs.storage.meta              109     174 GiB      16.44M     178 GiB         0       2.9 PiB     N/A               N/A              16.44M            0 B             0 B

44 TiB of stored data looks about right, but 149 TiB of actual pool usage is far beyond anything I would expect. The data pool is an EC pool with k=6, m=3, so without overhead I would expect about 66 TiB of overall allocation. The metadata pool is also huge at 178 GiB (the raw, uncompressed file list in plaintext format is 23 GiB).
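
For completeness, the expected raw allocation follows directly from the EC profile (plain arithmetic, no Ceph specifics assumed):

    # Expected raw usage for an EC pool: raw = stored * (k + m) / k
    echo 'scale=1; 44 * (6 + 3) / 6' | bc   # -> 66.0 TiB expected, versus 149 TiB reported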

A CephFS journal dump prints the following warnings:

2019-08-12 14:27:56.881 7fd9b5587700 -1 NetHandler create_socket couldn't create socket (97) Address family not supported by protocol
journal is 4444529514668~391241069
wrote 391241069 bytes at offset 4444529514668 to /var/lib/ceph/journal.bin.0
NOTE: this is a _sparse_ file; you can
        $ tar cSzf /var/lib/ceph/journal.bin.0.tgz /var/lib/ceph/journal.bin.0
      to efficiently compress it while preserving sparseness.
2019-08-12 14:28:09.709 7fd9b4d86700 -1 NetHandler create_socket couldn't create socket (97) Address family not supported by protocol
journal is 2887216998866~485120245
wrote 485120245 bytes at offset 2887216998866 to /var/lib/ceph/journal.bin.1
NOTE: this is a _sparse_ file; you can
        $ tar cSzf /var/lib/ceph/journal.bin.1.tgz /var/lib/ceph/journal.bin.1
      to efficiently compress it while preserving sparseness.
2019-08-12 14:28:43.241 7fd9b5d88700 -1 NetHandler create_socket couldn't create socket (97) Address family not supported by protocol
journal is 2839942271068~2529161124
wrote 2529161124 bytes at offset 2839942271068 to /var/lib/ceph/journal.bin.2
NOTE: this is a _sparse_ file; you can
        $ tar cSzf /var/lib/ceph/journal.bin.2.tgz /var/lib/ceph/journal.bin.2
      to efficiently compress it while preserving sparseness.

It then creates three sparse files with apparent sizes of 4.1T, 2.7T, and 2.6T, respectively (actual on-disk sizes: 374M, 463M, 2.4G).
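
The dumps above were taken per MDS rank, presumably with something along these lines (the filesystem name "storage" and the rank-to-file mapping are inferred from the output, not quoted from my shell history):

    # One export per active MDS rank; assumed invocation
    cephfs-journal-tool --rank=storage:0 journal export /var/lib/ceph/journal.bin.0
    cephfs-journal-tool --rank=storage:1 journal export /var/lib/ceph/journal.bin.1
    cephfs-journal-tool --rank=storage:2 journal export /var/lib/ceph/journal.bin.2
    # Apparent vs. actual size of the sparse dumps
    du -h --apparent-size /var/lib/ceph/journal.bin.*   # 4.1T, 2.7T, 2.6T
    du -h /var/lib/ceph/journal.bin.*                   # 374M, 463M, 2.4G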

A discussion on IRC revealed that at least one other user has been struggling with this issue; in their case it resulted in a total loss of the FS and required a full recovery from the data pool.
