Bug #49597
mds: MDS gets stuck in 'replay' state after setting 'osd_failsafe_full_ratio' below the fraction of data already written.
Status:
New
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Development
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Steps to reproduce on a vstart cluster:
1. Set the following in ../src/vstart.sh:
   a. Disable client_check_pool_perm:
      [global]
      client check pool perm = false
   b. Set the bluestore block size to 10G:
      bluestore block size = 10737418240
2. Start vstart cluster as below
#env MDS=1 OSD=1 MON=1 ../src/vstart.sh -d -b -n --without-dashboard
3. Create a subvolume and write a file of around 5G:
#bin/ceph fs subvolume create a sub_0
#subvol_path=$(bin/ceph fs subvolume getpath a sub_0 2>/dev/null)
#bin/ceph-fuse -c ./ceph.conf /mnt
#dd if=/dev/urandom of=/mnt$subvol_path/5GB_file-1 status=progress bs=1M count=5000
4. Set osd ratios as below:
#bin/ceph osd set-full-ratio 0.2
#bin/ceph osd set-nearfull-ratio 0.16
#bin/ceph osd set-backfillfull-ratio 0.18
5. Removing the subvolume should return ENOSPC:
#bin/ceph fs subvolume rm a sub_0
DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH
2021-03-04T15:24:02.674+0530 7fa9ec672700 -1 WARNING: all dangerous and experimental features are enabled.
2021-03-04T15:24:02.698+0530 7fa9ec672700 -1 WARNING: all dangerous and experimental features are enabled.
Error ENOSPC: error in setxattr
6. Output of 'osd df'
#bin/ceph osd df
DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH
2021-03-04T15:25:43.136+0530 7f619257c700 -1 WARNING: all dangerous and experimental features are enabled.
2021-03-04T15:25:43.152+0530 7f619257c700 -1 WARNING: all dangerous and experimental features are enabled.
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 ssd 0.01070 1.00000 11 GiB 5.9 GiB 4.9 GiB 0 B 27 MiB 5.1 GiB 53.48 1.00 192 up
TOTAL 11 GiB 5.9 GiB 4.9 GiB 0 B 27 MiB 5.1 GiB 53.48
MIN/MAX VAR: 1.00/1.00 STDDEV: 0
7. Edit ./ceph.conf to set the osd failsafe full ratio to 0.5 (this can also be set at runtime, but the value in ceph.conf takes precedence):
osd failsafe full ratio = .5
8. Stop and restart the same cluster:
#../src/stop.sh
#env MDS=1 OSD=1 MON=1 ../src/vstart.sh -d -b --without-dashboard
9. Check that the MDS is stuck in the `replay` state:
#bin/ceph fs status
DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH
2021-03-04T15:30:27.122+0530 7f2515951700 -1 WARNING: all dangerous and experimental features are enabled.
2021-03-04T15:30:27.138+0530 7f2515951700 -1 WARNING: all dangerous and experimental features are enabled.
a - 0 clients
RANK STATE MDS ACTIVITY DNS INOS DIRS CAPS
0 replay a 0 0 0 0
POOL TYPE USED AVAIL
cephfs.a.meta metadata 188k 0
cephfs.a.data data 5000M 0
MDS version: ceph version Development (no_version) quincy (dev)
10. At this point all mgr commands hang, waiting on the CephFS mount.
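The arithmetic behind the hang can be sketched as a small fullness classifier (a hypothetical illustration only, not the actual OSD code): with the ratios set in steps 4 and 7 and the ~53.48% raw usage shown by 'osd df', the OSD is past the failsafe cutoff once it is lowered to 0.5, so it refuses the writes the MDS needs to finish journal replay.

```python
# Hypothetical sketch of the fullness thresholds involved in this report;
# the real enforcement happens inside the OSD, this only shows the arithmetic.

def fullness_state(used_ratio, nearfull=0.16, backfillfull=0.18,
                   full=0.2, failsafe=0.5):
    """Classify an OSD's raw usage against the configured ratios.
    Defaults match the values set in the reproduction steps above."""
    if used_ratio >= failsafe:
        return "failsafe"      # OSD refuses all further writes
    if used_ratio >= full:
        return "full"
    if used_ratio >= backfillfull:
        return "backfillfull"
    if used_ratio >= nearfull:
        return "nearfull"
    return "ok"

# 5.9 GiB raw used of 11 GiB total, i.e. the 53.48% from 'osd df'
used = 0.5348
print(fullness_state(used))                 # -> failsafe: replay cannot write
print(fullness_state(used, failsafe=0.97))  # -> full (with the stock failsafe)
```

With the stock failsafe ratio (0.97 in current releases, to the best of my knowledge) the same usage only trips the ordinary full threshold; lowering the failsafe below the already-written fraction is what wedges the MDS in replay.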
Files
Updated by Kotresh Hiremath Ravishankar about 3 years ago
- File logs_except_osd.tar.gz logs_except_osd.tar.gz added
Updated by Kotresh Hiremath Ravishankar about 3 years ago
- File osd_last10k.tar.gz osd_last10k.tar.gz added
Updated by Patrick Donnelly about 3 years ago
- Priority changed from Normal to High
- Target version set to v17.0.0
- Source set to Development