Bug #44359

Raw usage reported by 'ceph osd df' incorrect when using WAL/DB on another drive

Added by Bryan Stillwell about 4 years ago. Updated almost 4 years ago.

Status: New
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

While converting a Nautilus (14.2.6) cluster from FileStore to BlueStore, I've noticed that the RAW USE reported by 'ceph osd df' jumped by the size of the WAL/DB logical volume. This is most noticeable on mostly empty clusters, where the extra usage looks more dramatic:

ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 hdd 4.00000 1.00000 3.7 TiB 39 GiB 851 MiB 0 B 1 GiB 3.6 TiB 1.03 1.00 116 up
1 hdd 4.00000 1.00000 3.7 TiB 39 GiB 896 MiB 16 KiB 1024 MiB 3.6 TiB 1.03 1.00 117 up
2 hdd 4.00000 1.00000 3.7 TiB 39 GiB 951 MiB 12 KiB 1024 MiB 3.6 TiB 1.04 1.00 125 up
3 hdd 4.00000 1.00000 3.7 TiB 39 GiB 919 MiB 0 B 1 GiB 3.6 TiB 1.03 1.00 123 up
4 hdd 4.00000 1.00000 3.7 TiB 39 GiB 883 MiB 4 KiB 1024 MiB 3.6 TiB 1.03 1.00 111 up

As you can see, DATA/OMAP/META don't come anywhere close to adding up to 39 GiB. This server has 5x 4 TB drives and 1x 200 GB SSD, so doing the math, each WAL/DB volume is ~37 GiB:

>>> ((200.0 * 10**9) / 1024**3) / 5
37.25290298461914

The total size of the OSD also increased by ~37 GiB, but I feel that RAW USE should accurately report how much is actually used instead of counting the entire WAL/DB volume.
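
For reference, the 39 GiB can be reconstructed roughly as DATA + META plus the entire WAL/DB logical volume (a back-of-the-envelope sketch, assuming the remaining ~1 GiB is the META column value and rounding the byte sizes):

# Approximate figures for OSD 0 from the 'ceph osd df' output above
GiB = 1024**3
db_lv = (200.0 * 10**9 / GiB) / 5   # ~37.25 GiB: one WAL/DB LV carved out of the 200 GB SSD
data  = 851 / 1024.0                # DATA column: 851 MiB expressed in GiB
meta  = 1.0                         # META column: 1 GiB
print(round(db_lv + data + meta, 1))  # 39.1 -> matches the reported ~39 GiB RAW USE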

History

#1 Updated by Greg Farnum about 4 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (ceph cli)

#2 Updated by Tobias Fischer almost 4 years ago

Same here. Fresh cluster, completely empty. "RAW USE" corresponds to the size of the DB (or DB+WAL) partition located on a separate SSD. Tested with LVM and with a raw block device, on 14.2.9 and 15.2.2, with DB only and with DB+WAL: the result is the same in every case. When creating an OSD without a separate WAL/DB, "RAW USE" is 1.0 GiB. Is this meant to be like that or is this a bug? If it isn't a bug then, IMHO, I find it really confusing!

ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 hdd 3.74609 1.00000 3.7 TiB 111 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.89 1.52 0 up
1 hdd 3.74609 1.00000 3.7 TiB 111 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.89 1.52 0 up
2 hdd 3.74609 1.00000 3.7 TiB 111 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.89 1.52 0 up
3 hdd 3.74609 1.00000 3.7 TiB 111 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.89 1.52 0 up
4 hdd 3.74609 1.00000 3.7 TiB 111 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.89 1.52 0 up
5 hdd 3.74609 1.00000 3.7 TiB 111 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.89 1.52 0 up
6 hdd 3.74609 1.00000 3.7 TiB 111 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.89 1.52 0 up
7 hdd 3.74609 1.00000 3.7 TiB 111 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.89 1.52 0 up
8 hdd 3.73630 1.00000 3.7 TiB 101 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.64 1.39 0 up
9 hdd 3.73630 1.00000 3.7 TiB 101 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.64 1.39 0 up
10 hdd 3.73630 1.00000 3.7 TiB 101 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.64 1.39 0 up
11 hdd 3.73630 1.00000 3.7 TiB 101 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.64 1.39 0 up
12 hdd 3.68750 1.00000 3.7 TiB 51 GiB 22 MiB 0 B 1 GiB 3.6 TiB 1.35 0.71 0 up
13 hdd 3.68750 1.00000 3.7 TiB 51 GiB 22 MiB 0 B 1 GiB 3.6 TiB 1.35 0.71 0 up
14 hdd 3.68750 1.00000 3.7 TiB 51 GiB 22 MiB 0 B 1 GiB 3.6 TiB 1.35 0.71 0 up
15 hdd 3.68750 1.00000 3.7 TiB 51 GiB 22 MiB 0 B 1 GiB 3.6 TiB 1.35 0.71 0 up
16 hdd 3.73630 1.00000 3.7 TiB 101 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.64 1.39 0 up
17 hdd 3.73630 1.00000 3.7 TiB 101 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.64 1.39 0 up
18 hdd 3.73630 1.00000 3.7 TiB 101 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.64 1.39 0 up
19 hdd 3.73630 1.00000 3.7 TiB 101 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.64 1.39 0 up
20 hdd 3.73630 1.00000 3.7 TiB 101 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.64 1.39 0 up
21 hdd 3.73630 1.00000 3.7 TiB 101 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.64 1.39 0 up
22 hdd 3.73630 1.00000 3.7 TiB 101 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.64 1.39 0 up
23 hdd 3.73630 1.00000 3.7 TiB 101 GiB 22 MiB 0 B 1 GiB 3.6 TiB 2.64 1.39 0 up
24 hdd 3.63869 1.00000 3.6 TiB 1.0 GiB 22 MiB 0 B 1 GiB 3.6 TiB 0.03 0.01 0 up
25 hdd 3.63869 1.00000 3.6 TiB 1.0 GiB 22 MiB 0 B 1 GiB 3.6 TiB 0.03 0.01 0 up
26 hdd 3.63869 1.00000 3.6 TiB 1.0 GiB 22 MiB 0 B 1 GiB 3.6 TiB 0.03 0.01 0 up
27 hdd 3.63869 1.00000 3.6 TiB 1.0 GiB 22 MiB 0 B 1 GiB 3.6 TiB 0.03 0.01 0 up
28 hdd 3.63869 1.00000 3.6 TiB 1.0 GiB 22 MiB 0 B 1 GiB 3.6 TiB 0.03 0.01 0 up
29 hdd 3.63869 1.00000 3.6 TiB 1.0 GiB 22 MiB 0 B 1 GiB 3.6 TiB 0.03 0.01 0 up
30 hdd 3.63869 1.00000 3.6 TiB 1.0 GiB 22 MiB 0 B 1 GiB 3.6 TiB 0.03 0.01 0 up
31 hdd 3.63869 1.00000 3.6 TiB 1.0 GiB 22 MiB 0 B 1 GiB 3.6 TiB 0.03 0.01 0 up
TOTAL 119 TiB 2.3 TiB 696 MiB 0 B 32 GiB 116 TiB 1.90
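
The RAW USE values above appear to track the full size of each group's DB logical volume (see the lvdisplay/lsblk output below) plus roughly 1 GiB, regardless of the 22 MiB of actual data. A back-of-the-envelope cross-check (the group-to-LV-size mapping is my reading of the output below; for OSDs 16-23 the leftover 1 GiB could be either the 1 GiB WAL LV or the usual metadata):

# GiB size of the separate DB LV per OSD group (from lvdisplay/lsblk below)
db_lv_gib   = {"osd 0-7": 110, "osd 8-11": 100, "osd 12-15": 50,
               "osd 16-23": 100, "osd 24-31": 0}
# RAW USE per group as reported by 'ceph osd df' above
raw_use_gib = {"osd 0-7": 111, "osd 8-11": 101, "osd 12-15": 51,
               "osd 16-23": 101, "osd 24-31": 1.0}
for group, db in db_lv_gib.items():
    print(group, raw_use_gib[group] - db)   # ~1 GiB left over in every case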

lvdisplay /dev/ceph-db-1/db-1 (DB of OSD 0-7)
--- Logical volume ---
LV Path /dev/ceph-db-1/db-1
LV Name db-1
VG Name ceph-db-1
LV UUID ghXWLw-3w6a-DUin-gtcl-gYg8-Hd3K-iBqoUH
LV Write Access read/write
LV Creation host, time node1, 2020-05-29 21:34:02 +0200
LV Status available
# open 12
LV Size 110.00 GiB
Current LE 28160
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 16384
Block device 253:13

lvdisplay /dev/ceph-db-1/db-1 (DB of OSD 8-11)
--- Logical volume ---
LV Path /dev/ceph-db-1/db-1
LV Name db-1
VG Name ceph-db-1
LV UUID jMO6uc-l2TF-Dxpp-mt1V-5eN5-A5jB-F88H4t
LV Write Access read/write
LV Creation host, time node2, 2020-05-30 16:41:21 +0200
LV Status available
# open 12
LV Size 100.00 GiB
Current LE 25600
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 16384
Block device 253:13

lvdisplay /dev/ceph-db-2/db-1 (DB of OSD 12-15)
--- Logical volume ---
LV Path /dev/ceph-db-2/db-1
LV Name db-1
VG Name ceph-db-2
LV UUID 4OKYNC-XcwW-vLLQ-Qrc9-73HL-6FBW-98TJ4c
LV Write Access read/write
LV Creation host, time node2, 2020-05-30 16:41:28 +0200
LV Status available
# open 12
LV Size 50.00 GiB
Current LE 12800
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 16384
Block device 253:17

ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 119 TiB 116 TiB 2.2 TiB 2.3 TiB 1.90
TOTAL 119 TiB 116 TiB 2.2 TiB 2.3 TiB 1.90

--- POOLS ---
POOL ID STORED OBJECTS USED %USED MAX AVAIL

ceph -s
[..]
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0 B
usage: 2.3 TiB used, 116 TiB / 119 TiB avail
pgs:

OSDs 0-15:
ceph-volume lvm create --bluestore --data ceph-block-$i/block-$i --block.db ceph-db-1/db-$i
sdc 8:32 0 3.7T 0 disk
└─ceph--block--1-block--1 253:5 0 3.7T 0 lvm
sdd 8:48 0 3.7T 0 disk
└─ceph--block--2-block--2 253:6 0 3.7T 0 lvm
sde 8:64 0 3.7T 0 disk
└─ceph--block--3-block--3 253:7 0 3.7T 0 lvm
sdf 8:80 0 3.7T 0 disk
└─ceph--block--4-block--4 253:8 0 3.7T 0 lvm
sdg 8:96 0 3.7T 0 disk
└─ceph--block--5-block--5 253:9 0 3.7T 0 lvm
sdh 8:112 0 3.7T 0 disk
└─ceph--block--6-block--6 253:10 0 3.7T 0 lvm
sdi 8:128 0 3.7T 0 disk
└─ceph--block--7-block--7 253:11 0 3.7T 0 lvm
sdj 8:144 0 3.7T 0 disk
└─ceph--block--8-block--8 253:12 0 3.7T 0 lvm
sdk 8:160 0 447.1G 0 disk
├─ceph--db--1-db--1 253:13 0 110G 0 lvm
├─ceph--db--1-db--2 253:14 0 110G 0 lvm
├─ceph--db--1-db--3 253:15 0 110G 0 lvm
└─ceph--db--1-db--4 253:16 0 110G 0 lvm
sdl 8:176 0 447.1G 0 disk
├─ceph--db--2-db--1 253:17 0 110G 0 lvm
├─ceph--db--2-db--2 253:18 0 110G 0 lvm
├─ceph--db--2-db--3 253:19 0 110G 0 lvm
└─ceph--db--2-db--4 253:20 0 110G 0 lvm
---
sdc 8:32 0 3.7T 0 disk
└─ceph--block--1-block--1 253:5 0 3.7T 0 lvm
sdd 8:48 0 3.7T 0 disk
└─ceph--block--2-block--2 253:6 0 3.7T 0 lvm
sde 8:64 0 3.7T 0 disk
└─ceph--block--3-block--3 253:7 0 3.7T 0 lvm
sdf 8:80 0 3.7T 0 disk
└─ceph--block--4-block--4 253:8 0 3.7T 0 lvm
sdg 8:96 0 3.7T 0 disk
└─ceph--block--5-block--5 253:9 0 3.7T 0 lvm
sdh 8:112 0 3.7T 0 disk
└─ceph--block--6-block--6 253:10 0 3.7T 0 lvm
sdi 8:128 0 3.7T 0 disk
└─ceph--block--7-block--7 253:11 0 3.7T 0 lvm
sdj 8:144 0 3.7T 0 disk
└─ceph--block--8-block--8 253:12 0 3.7T 0 lvm
sdk 8:160 0 447.1G 0 disk
├─ceph--db--1-db--1 253:13 0 100G 0 lvm
├─ceph--db--1-db--2 253:14 0 100G 0 lvm
├─ceph--db--1-db--3 253:15 0 100G 0 lvm
└─ceph--db--1-db--4 253:16 0 100G 0 lvm
sdl 8:176 0 447.1G 0 disk
├─ceph--db--2-db--1 253:17 0 50G 0 lvm
├─ceph--db--2-db--2 253:18 0 50G 0 lvm
├─ceph--db--2-db--3 253:19 0 50G 0 lvm
└─ceph--db--2-db--4 253:20 0 50G 0 lvm

OSDs 16-23:
ceph-volume lvm create --bluestore --data ceph-block-$i/block-$i --block.db ceph-db-1/db-$i --block.wal ceph-db-1/wal-$i
lsblk:
sdc 8:32 0 3.7T 0 disk
└─ceph--block--1-block--1 253:5 0 3.7T 0 lvm
sdd 8:48 0 3.7T 0 disk
└─ceph--block--2-block--2 253:6 0 3.7T 0 lvm
sde 8:64 0 3.7T 0 disk
└─ceph--block--3-block--3 253:7 0 3.7T 0 lvm
sdf 8:80 0 3.7T 0 disk
└─ceph--block--4-block--4 253:8 0 3.7T 0 lvm
sdg 8:96 0 3.7T 0 disk
└─ceph--block--5-block--5 253:9 0 3.7T 0 lvm
sdh 8:112 0 3.7T 0 disk
└─ceph--block--6-block--6 253:10 0 3.7T 0 lvm
sdi 8:128 0 3.7T 0 disk
└─ceph--block--7-block--7 253:11 0 3.7T 0 lvm
sdj 8:144 0 3.7T 0 disk
└─ceph--block--8-block--8 253:12 0 3.7T 0 lvm
sdk 8:160 0 447.1G 0 disk
├─ceph--db--1-wal--1 253:13 0 1G 0 lvm
├─ceph--db--1-wal--2 253:14 0 1G 0 lvm
├─ceph--db--1-wal--3 253:15 0 1G 0 lvm
├─ceph--db--1-wal--4 253:16 0 1G 0 lvm
├─ceph--db--1-db--1 253:17 0 100G 0 lvm
├─ceph--db--1-db--2 253:18 0 100G 0 lvm
├─ceph--db--1-db--3 253:19 0 100G 0 lvm
└─ceph--db--1-db--4 253:20 0 100G 0 lvm
sdl 8:176 0 447.1G 0 disk
├─ceph--db--2-wal--1 253:21 0 1G 0 lvm
├─ceph--db--2-wal--2 253:22 0 1G 0 lvm
├─ceph--db--2-wal--3 253:23 0 1G 0 lvm
├─ceph--db--2-wal--4 253:24 0 1G 0 lvm
├─ceph--db--2-db--1 253:25 0 100G 0 lvm
├─ceph--db--2-db--2 253:26 0 100G 0 lvm
├─ceph--db--2-db--3 253:27 0 100G 0 lvm
└─ceph--db--2-db--4 253:28 0 100G 0 lvm

OSDs 24-31:
for i in {c..j}; do ceph-volume lvm create --bluestore --data /dev/sd$i; done
lsblk:
sdc 8:32 0 3.7T 0 disk
└─ceph--fff38d99--2f8d--42e8--a65c--d056d4702719-osd--block--5dcc1c33--d6bc--41fd--b4e8--0970e2e7ea0b 253:5 0 3.7T 0 lvm
sdd 8:48 0 3.7T 0 disk
└─ceph--d6513a8c--3fe7--4a70--9be1--f5cc95916499-osd--block--e25539cb--8072--46db--8431--f515fdb775a8 253:6 0 3.7T 0 lvm
sde 8:64 0 3.7T 0 disk
└─ceph--0e0ad585--7c7c--473b--9ad8--9c30d32c5c8e-osd--block--714a7f30--8a63--4851--88ac--c576949417b0 253:7 0 3.7T 0 lvm
sdf 8:80 0 3.7T 0 disk
└─ceph--d332a35b--4858--4b6b--acfc--31c99b495ba0-osd--block--2f662a75--1553--45cc--b2ee--1bccf095627f 253:8 0 3.7T 0 lvm
sdg 8:96 0 3.7T 0 disk
└─ceph--cea996c8--48f9--4446--b747--3b195b298cd1-osd--block--f070cee5--e0b4--47b7--9409--6ebaaeac1575 253:9 0 3.7T 0 lvm
sdh 8:112 0 3.7T 0 disk
└─ceph--dc48f797--f4c3--4ab3--9a35--8a1a28cfd604-osd--block--a3704850--43a1--4e78--8cbf--0c3884b47586 253:10 0 3.7T 0 lvm
sdi 8:128 0 3.7T 0 disk
└─ceph--d47b167b--72b0--46d9--925a--903070b6f531-osd--block--a489f34c--3de3--474d--91c7--a39409a7a544 253:11 0 3.7T 0 lvm
sdj 8:144 0 3.7T 0 disk
└─ceph--82eee9b8--57b8--4bed--b0df--9aaf6aa2aa0a-osd--block--f14dc06b--3be7--4809--945f--adff12f7f7d5 253:12 0 3.7T 0 lvm

#3 Updated by Igor Fedotov almost 4 years ago

  • Project changed from RADOS to bluestore
