Bug #55534

closed

Persistent write back cache - Error message needs improvement for corrupted cache, with an appropriate message instead of "No space left on device"

Added by Preethi Nataraj almost 2 years ago. Updated almost 2 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Description of problem: Persistent write back cache - the error message for a corrupted cache needs improvement; it should be an appropriate message instead of "No space left on device".

Version-Release number of selected component (if applicable):
ceph version 16.2.7-106.el8cp (83a8e200569d52a42ad69374c2d4cfd39921b24d) pacific (stable)
[root@intel-purley-lr-02 pmem]#

How reproducible:

Pre-req:
1. Working ceph cluster
2. Client node with pmem
3. # ceph config set client rbd_persistent_cache_mode rwl
4. # ceph config set client rbd_plugins pwl_cache

Steps to enable DAX
List the namespaces with ndctl (the output must include the pmem device, as below):
[root@intel-purley-02 tmp]# ndctl list
{
  "dev":"namespace0.0",
  "mode":"fsdax",
  "map":"dev",
  "size":12681478144,
  "uuid":"c5dbfb44-fe3a-42ac-8331-8df3187e7d74",
  "sector_size":512,
  "align":2097152,
  "blockdev":"pmem0"
}
mkfs.ext4 /dev/pmem0
mount -o dax=always /dev/pmem0 <mountpoint>
And then set rbd_persistent_cache_path to the mountpoint:
  1. rbd config global set global rbd_persistent_cache_path <path>
After mounting, make sure that DAX is indeed enabled: check for something like "EXT4-fs (pmem0): DAX enabled ..." in dmesg.
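Taken together, the DAX setup above can be sketched as a single script. The device name `pmem0` and mountpoint `/mnt/pmem` follow the example output in this report and are assumptions, not fixed requirements:

```shell
# Sketch of the DAX setup steps above; pmem0 and /mnt/pmem are the
# example names from this report, not fixed requirements.
ndctl list                      # confirm an fsdax-mode pmem namespace exists
mkfs.ext4 /dev/pmem0            # format the pmem block device
mkdir -p /mnt/pmem
mount -o dax=always /dev/pmem0 /mnt/pmem
dmesg | grep -i "DAX enabled"   # verify DAX actually took effect
rbd config global set global rbd_persistent_cache_path /mnt/pmem
```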

Steps to Reproduce:
1) Write data using RBD bench to pmem/image; after a few minutes, abort. The cache file is present in the path and not flushed to the OSDs.
2) Start an FIO write with a different pool/image name, i.e. pmem1/image, and then observe the errors.
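The two reproduce steps above might look like the following. The pool/image names follow the report; the fio invocation is one plausible way to drive the second image through fio's rbd engine, not the exact command the reporter used:

```shell
# Step 1: write to pmem/image with rbd bench, then abort with Ctrl-C after
# a few minutes, leaving an unflushed cache file under rbd_persistent_cache_path.
rbd bench --io-type write --io-size 32K --io-total 10G pmem/image

# Step 2: write to a different pool/image (pmem1/image) with fio and
# observe the ENOSPC errors.
fio --name=test-1 --ioengine=rbd --pool=pmem1 --rbdname=image \
    --rw=write --bs=4k --iodepth=8 --numjobs=2 --runtime=600 --time_based
```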

output snippet:

^C[root@intel-purley-lr-02 pmem]# Jobs: 1 (f=0): [/(1),X(1)][-.-%][eta 09m:56s]
fio: io_u error on file test-1.0.0: No space left on device: write offset=4096, buflen=4096
fio: pid=96033, err=28/file:io_u.c:1803, func=io_u error, error=No space left on device
Jobs: 1 (f=1): [f(1),X(1)][-.-%][eta 00m:00s]
test-1: (groupid=0, jobs=2): err=28 (file:io_u.c:1803, func=io_u error, error=No space left on device): pid=96033: Fri Apr 29 06:57:59 2022
cpu : usr=0.00%, sys=0.00%, ctx=10, majf=0, minf=30
IO depths : 1=12.5%, 2=25.0%, 4=50.0%, 8=12.5%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,16,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=8
We are seeing an IO error and "No space left on device".
Clearing this requires a manual flush or invalidate cache command.
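The manual flush or invalidate mentioned above corresponds to the `rbd persistent-cache` commands; the image spec here follows the report's pmem/image example:

```shell
# Flush the persistent write-back cache contents back to the OSDs...
rbd persistent-cache flush pmem/image

# ...or, if the cache is corrupted or unwanted, discard it outright.
rbd persistent-cache invalidate pmem/image
```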

Expected Results:
This is expected: if the corrupted cache is not cleared, IO will error out. However, the error message should be more helpful instead of showing the user "No space left on device", which is incorrect.

Actions #1

Updated by Deepika Upadhyay almost 2 years ago

  • Description updated (diff)
Actions #2

Updated by Ilya Dryomov almost 2 years ago

  • Status changed from New to Need More Info
  • Assignee set to Ilya Dryomov

Hi Preethi,

How big is the pmem device? A separate cache file gets created for each image, so if the configured cache file size is bigger than roughly half of the pmem device, ENOSPC error is expected.

Actions #3

Updated by jianpeng ma almost 2 years ago

From the Steps to Reproduce:
1) wite data using RBD bench to pmem/image after few minutes abort, cache file present in path and not flushed to OSDs
2) start FIO write with different pool/image name i.e pmem1/image and then observe the errors

This means the cache file from the previous write still exists, because aborting rbd bench cannot clean up the cache file, so the pmem device runs out of space.
This problem is completely unavoidable, regardless of device size, and I think we can only pass ENOSPC to the caller.

Actions #4

Updated by Preethi Nataraj almost 2 years ago

Ilya Dryomov wrote:

Hi Preethi,

How big is the pmem device? A separate cache file gets created for each image, so if the configured cache file size is bigger than roughly half of the pmem device, ENOSPC error is expected.

The pmem device size is 500 GB and the cache size created is 400 GB. The cache file size was not big; even with a 1 GB+ cache file we could see this issue.

[root@intel-purley-lr-02 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 189G 0 189G 0% /dev
tmpfs 189G 0 189G 0% /dev/shm
tmpfs 189G 21M 189G 1% /run
tmpfs 189G 0 189G 0% /sys/fs/cgroup
/dev/mapper/rhel_intel--purley--lr--02-root 70G 5.7G 65G 9% /
/dev/sda2 1014M 254M 761M 26% /boot
/dev/mapper/rhel_intel--purley--lr--02-home 1.1T 7.3G 1.1T 1% /home
/dev/sda1 599M 5.8M 594M 1% /boot/efi
tmpfs 38G 4.0K 38G 1% /run/user/1000
/dev/pmem1.1 484G 73M 459G 1% /mnt/pmem
tmpfs 38G 0 38G 0% /run/user/0

Actions #5

Updated by Ilya Dryomov almost 2 years ago

Pmem device size is 500 GB, cache size created is 400GB. cache file size was not big.

400G + 400G is bigger than 500G -- hence ENOSPC error.
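The arithmetic behind the ENOSPC can be checked directly: with one cache file created per opened image, two images at the reported 400G cache size need 800G, which exceeds the ~500G pmem filesystem:

```shell
device_gib=500   # pmem filesystem size from the report (~500G)
cache_gib=400    # configured cache size per image, from the report
images=2         # one cache file is created per opened image
needed=$((images * cache_gib))
if [ "$needed" -gt "$device_gib" ]; then
  echo "ENOSPC expected: ${needed}G of cache files > ${device_gib}G device"
fi
```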

even with 1GB+ of cache file we could see this issue.

Please paste all steps (all commands and their output) for 1G cache size test, starting with formatting a fresh filesystem on /dev/pmem1.1.

Actions #6

Updated by Preethi Nataraj almost 2 years ago

Ilya Dryomov wrote:

Pmem device size is 500 GB, cache size created is 400GB. cache file size was not big.

400G + 400G is bigger than 500G -- hence ENOSPC error.

even with 1GB+ of cache file we could see this issue.

Please paste all steps (all commands and their output) for 1G cache size test, starting with formatting a fresh filesystem on /dev/pmem1.1.

***********************************************
Hi,

I have repeated the steps with a 1 GB cache size and a pmem device of 50 GB. No issue seen - looks like ENOSPC is expected, as per the above explanation.

1) Ran RBD bench at 32k for some time and aborted; used pool and image, i.e. pmem and image1.
- The cache file is present and not flushed to the OSDs.
2) Ran FIO to a different image and pool, i.e. pmem2/image.
We see two different cache files present in the cache path, and IO is going fine without any errors.
[root@intel-purley-lr-02 pmem1]# ls
rbd-pwl.pmem2.9c696190dcc8b.pool  rbd-pwl.pmem.9b404814d3011.pool
[root@intel-purley-lr-02 pmem1]#

Actions #7

Updated by Ilya Dryomov almost 2 years ago

  • Status changed from Need More Info to Rejected