Support #51985

Help-seeking for rbd persistent cache usage problems

Added by chunsong feng over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assignee: CONGMIN YIN
Target version:
% Done: 100%
Tags: persistent cache
Reviewed:
Affected Versions:
Pull request ID:

Description

I am testing the performance of the rbd persistent cache. fio is used to test the Ceph persistent write-back cache in three modes: ssd, rwl, and disabled. The performance is roughly the same in all three modes, and no file is generated in the rbd_persistent_cache_path directory.
The parameters are set as follows:
rbd_cache = true
rbd_cache_writethrough_until_flush = false
rbd_cache_size = 268435456
rbd_cache_max_dirty = 134217728
rbd_cache_target_dirty = 33554432
rbd_cache_max_dirty_age = 5
debug_rbd = 5/5
rbd_parent_cache_enabled = true
rbd_plugins = parent_cache

rbd_cache_policy = writeback
rbd_cache_block_writes_upfront = true
rbd_plugins = pwl_cache

rbd_persistent_cache_size = 322122547200
rbd_persistent_cache_log_periodic_stats = true

The ssd mode is configured as follows:
rbd_persistent_cache_mode = ssd
rbd_persistent_cache_path = /mnt/rbd0
Mount options:
  /dev/nvme1n1p1 on /mnt/rbd0 type xfs (rw,relatime,attr2,inode64,noquota)
The rwl mode is configured as follows:
rbd_persistent_cache_mode = rwl
rbd_persistent_cache_path = /mnt/ramcache
Mount options:
  xfs on /mnt/ramcache type ramfs (rw,relatime)
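For clarity, the pwl-related settings above boil down to roughly the following [client] section for the ssd test (a condensed sketch of my configuration; the rwl test differs only in mode and path):

    [client]
        # persistent write-back cache settings used for the ssd test
        rbd_cache = true
        rbd_cache_policy = writeback
        rbd_plugins = pwl_cache
        rbd_persistent_cache_mode = ssd
        rbd_persistent_cache_path = /mnt/rbd0
        rbd_persistent_cache_size = 322122547200
        rbd_persistent_cache_log_periodic_stats = true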

The logs are as follows:
2021-07-30T12:03:38.208+0800 7f110e30d700 5 librbd::PluginRegistry: 0x5627e0f7f7a0 init: attempting to load plugin: pwl_cache
2021-07-30T12:03:38.211+0800 7f110e30d700 5 librbd::plugin::WriteLogImageCache: 0x7f11000b2000 init:
You can see that WriteLogImageCache is initialized.

When the disabled mode is configured, the log is:
2021-07-30T12:55:51.547+0800 7faf9f7fe700 5 librbd::PluginRegistry: 0x565082e8c670 init: attempting to load plugin: pwl_cache
WriteLogImageCache is not initialized.
How do I enable the rbd persistent cache function?

rwl.txt (310 KB) chunsong feng, 08/05/2021 03:17 AM

ssd.txt (298 KB) chunsong feng, 08/05/2021 03:17 AM

History

#2 Updated by CONGMIN YIN over 2 years ago

  • Assignee set to CONGMIN YIN

1. Please check that the Ceph you installed contains the PWL plug-in:

ls /usr/local/lib/ceph/librbd/libceph_librbd_pwl_cache.so*

Also, please tell me how you installed Ceph: from a deb package or by compiling the source code?
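For example, something like this should find the plug-in shared library (a sketch; adjust the path to your install prefix):

    # look for the PWL plug-in under the common install prefixes
    ls /usr/local/lib/ceph/librbd/libceph_librbd_pwl_cache.so* \
       /usr/lib64/ceph/librbd/libceph_librbd_pwl_cache.so* 2>/dev/null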

2. Under normal conditions, when using the PWL cache, the RBD cache should be set to off: rbd_cache = false.
But that is not why you can't enable the pwl cache.

#3 Updated by chunsong feng over 2 years ago

1. The PWL plug-in is present:
[root@ceph1 ~]# ps axu | grep fio
root 732406 0.0 0.0 394468 24112 pts/4 S 11:08 0:00 fio -S
root 733975 254 0.2 2404188 1163572 pts/4 Sl+ 11:10 0:05 fio -ioengine=rbd -clientname=admin -pool=rbdtest -size=20G -direct=1 -bs=4k -numjobs=1 -rbdname=image11 -rw=write -iodepth=64 -name=job
root 734086 0.0 0.0 12108 988 pts/9 S+ 11:11 0:00 grep --color=auto fio
[root@ceph1 ~]# pmap -p 733975| grep rbd
733975: fio -ioengine=rbd -clientname=admin -pool=rbdtest -size=20G -direct=1 -bs=4k -numjobs=1 -rbdname=image11 -rw=write -iodepth=64 -name=job
00007fc064085000 2712K r-x-- /usr/lib64/ceph/librbd/libceph_librbd_pwl_cache.so.1.0.0
00007fc06432b000 2048K ----- /usr/lib64/ceph/librbd/libceph_librbd_pwl_cache.so.1.0.0
00007fc06452b000 60K r---- /usr/lib64/ceph/librbd/libceph_librbd_pwl_cache.so.1.0.0
00007fc06453a000 4K rw--- /usr/lib64/ceph/librbd/libceph_librbd_pwl_cache.so.1.0.0
00007fc0c06a7000 8768K r-x-- /usr/lib64/librbd.so.1.16.0
00007fc0c0f37000 2044K ----- /usr/lib64/librbd.so.1.16.0
00007fc0c1136000 156K r---- /usr/lib64/librbd.so.1.16.0
00007fc0c115d000 36K rw--- /usr/lib64/librbd.so.1.16.0

#4 Updated by chunsong feng over 2 years ago

I checked with pmap that the libceph_librbd_pwl_cache plug-in was loaded.
I used perf to sample the rwl and ssd tests. Can you tell from the hot functions whether the persistent cache function takes effect?

#5 Updated by chunsong feng over 2 years ago

Previously, we tested the rbd persistent cache based on Ceph 16.2.5 and found that the performance was the same.
Today, based on Ceph 16.2.5, I replaced src/librbd/cache with the version from master and then sampled it with perf.

#6 Updated by CONGMIN YIN over 2 years ago

The log shows that your PWL plug-in did not load successfully. Please describe how you installed Ceph, and what kind of benchmark you use.

My configuration:

[client]
    #admin_socket = /mnt/pmem1/cache/cache.asok
    rbd_cache = false
    #debug rbd_pwl = 1
    #log_file = /var/log/ceph/rbd.log
    rbd_persistent_cache_mode = rwl
    rbd_plugins = pwl_cache
    rbd_persistent_cache_size = 10737418240
    rbd_persistent_cache_path = /mnt/pmem1/cache

/dev/pmem1 on /mnt/pmem1 type xfs (rw,relatime,attr2,dax,inode64,noquota)
(dax is just for pmem, not the reason you can't enable pwl)
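For reference, a pmem device is typically mounted with the dax option roughly like this (a sketch; not required for enabling pwl):

    # mount the pmem namespace with DAX so the rwl cache can access it directly
    mount -o dax /dev/pmem1 /mnt/pmem1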

The image feature exclusive-lock is enabled by default, but please check it:
root@ssp-ceph105:~# rbd info ceph105/image1
rbd image 'image1':
size 1 GiB in 256 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 211df614c39b
block_name_prefix: rbd_data.211df614c39b
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
op_features:
flags:
create_timestamp: Fri Jul 16 06:54:31 2021
access_timestamp: Fri Jul 16 06:54:31 2021
modify_timestamp: Tue Jul 27 02:35:26 2021

Use a benchmark like fio or rbd bench to test; the cache file will be in rbd_persistent_cache_path. See the sketch below.
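For example (a sketch; the pool and image names are placeholders):

    # rbd built-in benchmark
    rbd bench --io-type write --io-size 4096 --io-total 1G --io-pattern rand ceph105/image1

    # or fio with the rbd ioengine
    fio --ioengine=rbd --clientname=admin --pool=ceph105 --rbdname=image1 \
        --rw=write --bs=4k --iodepth=32 --numjobs=1 --direct=1 --size=1G --name=pwl-test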

#7 Updated by chunsong feng over 2 years ago

I tested it with fio.

The command for creating an image is as follows. The exclusive-lock feature is not enabled.
rbd create image11 --size 20480 --pool rbdtest --image-format 2 --image-feature layering
rbd image 'image11':
size 20 GiB in 5120 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 16393835822c15
block_name_prefix: rbd_data.16393835822c15
format: 2
features: layering
op_features:
flags:
create_timestamp: Thu Jul 29 20:13:51 2021
access_timestamp: Mon Aug 2 17:00:41 2021
modify_timestamp: Thu Aug 5 11:35:01 2021
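(I believe the exclusive-lock feature could also have been enabled on the existing image with something like the following sketch, but I created a new image instead:)

    # enable exclusive-lock on the existing image (alternative to creating a new one)
    rbd feature enable rbdtest/image11 exclusive-lock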

I will test an image with the exclusive-lock feature:
rbd create image100 --size 20480 --pool rbdtest --image-format 2 --image-feature layering,exclusive-lock,object-map,fast-diff,deep-flatten
rbd image 'image100':
size 20 GiB in 5120 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 18abbd3085dd13
block_name_prefix: rbd_data.18abbd3085dd13
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
op_features:
flags:
create_timestamp: Thu Aug 5 11:41:42 2021
access_timestamp: Thu Aug 5 11:41:42 2021
modify_timestamp: Thu Aug 5 11:41:42 2021

#8 Updated by chunsong feng over 2 years ago

I took the Ceph v16.2.5 source code, compiled it with rpmbuild --with=rbd_rwl_cache --with=rbd_ssd_cache -bb ceph.spec, and then installed the resulting packages.
In the build environment, I made sure the following are defined in include/acconfig.h:
/* Define if RWL is enabled */
#define WITH_RBD_RWL

/* Define if PWL-SSD is enabled */
#define WITH_RBD_SSD_CACHE
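For example, I verified them roughly like this (a sketch; the path to the generated acconfig.h depends on where the build tree lives):

    # confirm that both PWL build flags were enabled at compile time
    grep -E 'WITH_RBD_(RWL|SSD_CACHE)' build/include/acconfig.h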

#9 Updated by CONGMIN YIN over 2 years ago

After testing the image with the exclusive-lock feature, can you enable the rwl cache?

#10 Updated by chunsong feng over 2 years ago

I have tested the image with the exclusive-lock feature, and the rwl cache is enabled.

Many cache files have been generated in the rbd_persistent_cache_path.
rwl:
[root@ceph1 ceph]# ll -h /mnt/ramcache
total 370G
---------- 1 root root 8.8G Aug 5 13:04 rbd-pwl.rbdtest.1887296bc71ad8.pool
---------- 1 root root 9.7G Aug 5 13:02 rbd-pwl.rbdtest.188738295fdff5.pool
---------- 1 root root 7.1G Aug 5 13:02 rbd-pwl.rbdtest.1888466bcad1c.pool
---------- 1 root root 7.6G Aug 5 13:04 rbd-pwl.rbdtest.1888889b74aa5a.pool
---------- 1 root root 8.3G Aug 5 13:03 rbd-pwl.rbdtest.1888e54ae62afe.pool

ssd:
[root@ceph1 ceph]# ll -h /mnt/rbd0
total 8.0K
-rw-r--r-- 1 root root 300G Aug 5 14:27 rbd-pwl.rbdtest.1887296bc71ad8.pool
-rw-r--r-- 1 root root 300G Aug 5 14:27 rbd-pwl.rbdtest.188738295fdff5.pool
-rw-r--r-- 1 root root 300G Aug 5 14:27 rbd-pwl.rbdtest.1888466bcad1c.pool
-rw-r--r-- 1 root root 300G Aug 5 14:27 rbd-pwl.rbdtest.1888889b74aa5a.pool

Thanks a lot.
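As a side note, I believe the cache state can also be checked per image from the client (a sketch; commands as I understand them on Pacific):

    # show image status, including the persistent cache state when it is enabled
    rbd status rbdtest/image100

    # flush or drop the write-back cache explicitly if needed
    rbd persistent-cache flush rbdtest/image100
    rbd persistent-cache invalidate rbdtest/image100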

#11 Updated by CONGMIN YIN over 2 years ago

  • % Done changed from 0 to 100

OK, if you have other questions, please let me know.

#12 Updated by CONGMIN YIN over 2 years ago

  • Status changed from New to Resolved
