Bug #41204 (open)
CephFS pool usage 3x above expected value and sparse journal dumps
Description
I am in the process of copying about 230 million small and medium-sized files to a CephFS, and I have three active MDSs to keep up with the constant create workload induced by the copy process. Previously, I was struggling heavily with runaway MDS cache growth, which was fixable by increasing the cache trim size (see issues #41140 and #41141).
Unfortunately, another problem has emerged after a few days of copying data. ceph df detail reports:
POOL                 ID   STORED   OBJECTS  USED     %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES  DIRTY    USED COMPR  UNDER COMPR
<...>
cephfs.storage.data  108  44 TiB   176.13M  149 TiB  1.63   5.9 PiB    N/A            N/A          176.13M  0 B         0 B
cephfs.storage.meta  109  174 GiB  16.44M   178 GiB  0      2.9 PiB    N/A            N/A          16.44M   0 B         0 B
44 TiB of stored data looks about right, but 149 TiB of actual pool usage is way beyond anything I would expect. The data pool is an EC pool with k=6, m=3, i.e. without any additional overhead I would expect an overall allocation of 44 TiB x 9/6 = 66 TiB. The metadata pool is also huge at 178 GiB (the raw uncompressed file list in plaintext format is 23 GiB).
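The expected-allocation arithmetic above can be sketched as follows. This is only an illustration of the (k+m)/k erasure-coding multiplier applied to the figures from the ceph df detail output; the function and variable names are made up for this example, and it makes no assumption about where the extra usage comes from.

```python
def expected_raw_usage(stored: float, k: int, m: int) -> float:
    """Raw allocation of an EC pool if the only overhead is the parity chunks."""
    return stored * (k + m) / k


# Figures taken from the `ceph df detail` output in this report (in TiB).
stored_tib = 44.0
used_tib = 149.0

expected_tib = expected_raw_usage(stored_tib, k=6, m=3)
excess_factor = used_tib / expected_tib

print(f"expected raw usage: {expected_tib:.0f} TiB")
print(f"reported raw usage: {used_tib:.0f} TiB "
      f"({excess_factor:.2f}x the expected value, "
      f"{used_tib / stored_tib:.2f}x the stored data)")
```

With these numbers, the reported usage is a bit more than twice the expected raw allocation and roughly 3.4 times the stored data.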
A CephFS journal dump prints the following warnings:
2019-08-12 14:27:56.881 7fd9b5587700 -1 NetHandler create_socket couldn't create socket (97) Address family not supported by protocol
journal is 4444529514668~391241069
wrote 391241069 bytes at offset 4444529514668 to /var/lib/ceph/journal.bin.0
NOTE: this is a _sparse_ file; you can
    $ tar cSzf /var/lib/ceph/journal.bin.0.tgz /var/lib/ceph/journal.bin.0
to efficiently compress it while preserving sparseness.
2019-08-12 14:28:09.709 7fd9b4d86700 -1 NetHandler create_socket couldn't create socket (97) Address family not supported by protocol
journal is 2887216998866~485120245
wrote 485120245 bytes at offset 2887216998866 to /var/lib/ceph/journal.bin.1
NOTE: this is a _sparse_ file; you can
    $ tar cSzf /var/lib/ceph/journal.bin.1.tgz /var/lib/ceph/journal.bin.1
to efficiently compress it while preserving sparseness.
2019-08-12 14:28:43.241 7fd9b5d88700 -1 NetHandler create_socket couldn't create socket (97) Address family not supported by protocol
journal is 2839942271068~2529161124
wrote 2529161124 bytes at offset 2839942271068 to /var/lib/ceph/journal.bin.2
NOTE: this is a _sparse_ file; you can
    $ tar cSzf /var/lib/ceph/journal.bin.2.tgz /var/lib/ceph/journal.bin.2
to efficiently compress it while preserving sparseness.
and then creates three sparse files with apparent sizes of 4.1T, 2.7T, and 2.6T, respectively (actual on-disk sizes: 374M, 463M, 2.4G).
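The relationship between the dump output and the file sizes can be checked with a short sketch. It assumes that the "journal is A~B" notation means start offset A and length B, which is consistent with the "wrote B bytes at offset A" lines in the output; the helper names are hypothetical.

```python
import re

# Matches the "journal is <offset>~<length>" line of the dump output.
EXTENT_RE = re.compile(r"journal is (\d+)~(\d+)")


def journal_sizes(line: str) -> tuple[int, int]:
    """Return (apparent_size, written_bytes) for one 'journal is' line.

    The dump writes `length` bytes at `offset`, so the sparse file ends at
    offset + length while only `length` bytes are actually stored.
    """
    match = EXTENT_RE.search(line)
    if match is None:
        raise ValueError(f"not a journal extent line: {line!r}")
    offset, length = int(match.group(1)), int(match.group(2))
    return offset + length, length


# The three extent lines from the dump output in this report.
dump_lines = [
    "journal is 4444529514668~391241069",
    "journal is 2887216998866~485120245",
    "journal is 2839942271068~2529161124",
]

results = [journal_sizes(line) for line in dump_lines]
for rank, (apparent, written) in enumerate(results):
    print(f"journal.bin.{rank}: apparent {apparent / 2**40:.2f} TiB, "
          f"written {written / 2**20:.1f} MiB")
```

The apparent sizes are dominated by the journal offsets, while the data actually written is only a few hundred MiB to a few GiB per rank.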
A discussion on IRC revealed that at least one other user has been struggling with this issue, which in their case resulted in a total loss of their FS requiring a full recovery from the data pool.