Bug #53758
fsync so slow
Status: Closed
Description
Docker Image: quay.io/cephcsi/cephcsi:v3.2.2
ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)
fsync takes several seconds on an RBD volume.
Is there a way to determine the cause?
...
root@localhost# time dd if=/dev/urandom of=temp01.bin bs=512 count=1000 status=progress conv=fsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 3.7859 s, 135 kB/s
root@localhost# time dd if=/dev/urandom of=temp01.bin bs=512 count=10000 status=progress conv=fsync
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB, 4.9 MiB) copied, 6.10873 s, 838 kB/s
root@localhost# fio -ioengine=libaio -name=test -filename=aaaaa.bin -fsync=1 -direct=1 -bs=4k -iodepth=32 -rw=write -size=1M
test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
fio-3.16
Starting 1 process
Jobs: 1 (f=1): [W(1)][10.9%][w=4KiB/s][w=1 IOPS][eta 00m:57s]
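The dd and fio runs above both show per-fsync latency in the hundreds of milliseconds. The same measurement can be reproduced without either tool; this is a minimal Python sketch (the filename temp_fsync.bin and the block/count values are arbitrary choices, not from the report) that times each fsync individually, the pattern fio's fsync=1 uses:

```python
import os
import time

def fsync_latencies(path, block=4096, count=100):
    """Write `count` blocks, fsync after each write, return per-fsync latencies in seconds."""
    lats = []
    data = os.urandom(block)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for _ in range(count):
            os.write(fd, data)
            t0 = time.perf_counter()
            os.fsync(fd)              # force the block down to the backing store
            lats.append(time.perf_counter() - t0)
    finally:
        os.close(fd)
        os.unlink(path)               # clean up the scratch file
    return lats

lats = fsync_latencies("temp_fsync.bin")
print(f"avg {sum(lats)/len(lats)*1000:.2f} ms, max {max(lats)*1000:.2f} ms")
```

On a healthy RBD volume the average should be in the low milliseconds; values approaching a second per fsync point at the backing cluster rather than the client.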
Updated by jichan lee over 2 years ago
- rados bench -p k8s-default-class-c-blockpool 10 write -b 4M -t 16
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_rook-ceph-tools-599d59ffcc-x9_28741
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 16 0 0 0 - 0
2 16 26 10 19.9987 20 1.83676 1.52928
3 16 35 19 25.3313 36 2.80031 1.70759
4 16 44 28 27.9975 36 2.4809 1.76429
5 16 54 38 30.3969 40 1.25239 1.71289
6 16 62 46 30.6633 32 1.43621 1.72136
7 16 74 58 33.1392 48 0.87739 1.64062
8 16 80 64 31.9897 24 1.93673 1.63696
9 16 88 72 31.9903 32 1.24941 1.69071
10 16 93 77 30.7911 20 2.31739 1.77738
11 16 93 77 27.9923 0 - 1.77738
12 15 93 78 25.9931 2 3.33216 1.79732
13 5 93 88 27.0699 40 3.21164 2.02457
14 2 93 91 25.9935 12 5.75782 2.13017
Total time run: 14.034
Total writes made: 93
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 26.5071
Stddev Bandwidth: 15.9649
Max bandwidth (MB/sec): 48
Min bandwidth (MB/sec): 0
Average IOPS: 6
Stddev IOPS: 4.04711
Max IOPS: 12
Min IOPS: 0
Average Latency(s): 2.20728
Stddev Latency(s): 1.28793
Max latency(s): 6.23165
Min latency(s): 0.760782
Cleaning up (deleting benchmark objects)
Removed 93 objects
Clean up completed and total clean up time :14.6214
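The bench numbers above are internally consistent with one stalled device: with 16 in-flight 4 MiB writes and the reported 2.2 s average latency, Little's law predicts roughly the measured throughput, so the low bandwidth is a latency problem, not a client-side one. A quick sanity check using the figures from the output:

```python
# Little's-law sanity check on the rados bench output above:
# expected throughput = in-flight writes x write size / average latency.
concurrency = 16          # -t 16
write_mb = 4.0            # -b 4M
avg_latency_s = 2.20728   # reported Average Latency(s)

expected_mb_s = concurrency * write_mb / avg_latency_s
print(f"{expected_mb_s:.1f} MB/s")  # close to the reported 26.5 MB/s bandwidth
```

The min-bandwidth-0 intervals and the 6.2 s max latency are the typical signature of writes queued behind a single slow OSD.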
Updated by jichan lee over 2 years ago
Fixed: one OSD was behaving abnormally. It was never flagged with SLOW_OPS, so it was hard to spot on the dashboard.
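For future readers hitting the same symptom: an OSD that stalls without tripping SLOW_OPS can still stand out in per-OSD latency stats (`ceph osd perf`). Below is a sketch that flags outliers from that command's JSON output; the embedded sample and the field names (`osd_perf_infos`, `commit_latency_ms`) are assumptions based on the Octopus-era format, so verify them against `ceph osd perf -f json` on your own cluster:

```python
import json
import statistics

# Stand-in for `ceph osd perf -f json` output (field names assumed, not verified
# against this cluster). OSD 2 plays the abnormal device from the report.
sample = json.dumps({"osd_perf_infos": [
    {"id": 0, "perf_stats": {"commit_latency_ms": 12, "apply_latency_ms": 12}},
    {"id": 1, "perf_stats": {"commit_latency_ms": 9,  "apply_latency_ms": 10}},
    {"id": 2, "perf_stats": {"commit_latency_ms": 950, "apply_latency_ms": 940}},
]})

def slow_osds(perf_json, factor=3.0):
    """Return ids of OSDs whose commit latency exceeds factor * cluster median."""
    infos = json.loads(perf_json)["osd_perf_infos"]
    commits = [i["perf_stats"]["commit_latency_ms"] for i in infos]
    med = statistics.median(commits)
    # max(med, 1) avoids flagging everything when the median is 0 ms
    return [i["id"] for i, c in zip(infos, commits) if c > factor * max(med, 1)]

print(slow_osds(sample))  # the outlier OSD shows up even without a SLOW_OPS flag
```

Comparing each OSD against the cluster median is a simple heuristic; the threshold factor is arbitrary and may need tuning per cluster.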
Updated by jichan lee over 2 years ago
dd if=/dev/urandom of=target_disk oflag=direct bs=4096 count=1000 status=progress
3944448 bytes (3.9 MB, 3.8 MiB) copied, 14 s, 282 kB/s
1000+0 records in
1000+0 records out
4096000 bytes (4.1 MB, 3.9 MiB) copied, 14.4186 s, 284 kB/s