Bug #57539 (open): crimson osd not showing correct object count

Added by Aravind Ramesh over 1 year ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I was running some tests on a 1 TB regular SSD with crimson-osd/seastore (using BlockSegmentManager; I had to modify crimson.yaml to change the device size and segment size to get it running on this device).
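
For reference, a sketch of the kind of override this involves; the file path and option names shown here (common/options/crimson.yaml.in, seastore_device_size, seastore_segment_size) are assumptions and may differ by release, so treat this as illustrative rather than the exact change:

@
# common/options/crimson.yaml.in -- illustrative sketch only; option names,
# schema and values are assumptions, not the exact edit that was made
- name: seastore_device_size
  type: size
  default: 900G          # raised to cover (most of) the 1 TB SSD
- name: seastore_segment_size
  type: size
  default: 64M           # segment size adjusted to suit the device
@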

I was able to start a cluster with 1 crimson-osd using vstart.sh.
Then I created a pool and ran "rados bench -p <poolname> <time-to-run> write -t 2 --no-cleanup"; below is the output for a 30-second run.

@
sudo ./bin/ceph osd pool create conv-bench 100 100;

sudo ./bin/rados bench -p conv-bench 30 write -t 2 --no-cleanup
Total time run: 30.0443
Total writes made: 7722
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 1028.08
Stddev Bandwidth: 97.3374
Max bandwidth (MB/sec): 1120
Min bandwidth (MB/sec): 700
Average IOPS: 257
Stddev IOPS: 24.3344
Max IOPS: 280
Min IOPS: 175
Average Latency(s): 0.00777462
Stddev Latency(s): 0.00337417
Max latency(s): 0.0485986
Min latency(s): 0.00530812
@

It ran successfully and reported writing 7722 objects.
But these objects are not reflected in the cluster information or the OSD information: below, only 99 objects are shown, although the amount of storage used is updated correctly (31 GiB here).
@
$ sudo ./bin/ceph -s
cluster:
id: d5cb02b7-c9eb-49df-b7cf-be99d892e513
health: HEALTH_WARN
1 pool(s) do not have an application enabled
4 pool(s) have no replicas configured
1 pool(s) have non-power-of-two pg_num
services:
mon: 1 daemons, quorum a (age 8m)
mgr: x(active, since 8m)
mds: 1/1 daemons up
osd: 1 osds: 1 up (since 8m), 1 in (since 8m)
data:
volumes: 1/1 healthy
pools: 4 pools, 165 pgs
objects: 99 objects, 301 MiB
usage: 31 GiB used, 928 GiB / 959 GiB avail
pgs: 165 active+clean

$ sudo ./bin/rados df -p conv-bench
POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR USED COMPR UNDER COMPR
conv-bench 300 MiB 75 0 75 0 0 0 0 0 B 75 300 MiB 0 B 0 B

total_objects 99
total_used 31 GiB
total_avail 928 GiB
total_space 959 GiB
@
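
One way to narrow down where the counts diverge (standard rados/ceph commands, nothing crimson-specific) is to list the pool's objects directly and compare against the per-PG stats, since the object counts in ceph -s and rados df are aggregated from the PG stats reported by the OSD:

@
# count what is actually stored in the pool, independent of PG stats
sudo ./bin/rados -p conv-bench ls | wc -l

# per-PG object counts as reported by the (crimson) OSD
sudo ./bin/ceph pg ls-by-pool conv-bench
@

If the rados ls count is close to 7722 while the PG stats still sum to 99, the data is present and it is the stats reported by the crimson OSD that are stale.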

But if I do a purge, it shows that 7807 objects were deleted:

@
$ sudo ./bin/rados purge conv-bench --yes-i-really-really-mean-it
Warning: using slow linear search
Removed 7807 objects
successfully purged pool conv-bench
@

But rados df still shows that 31 GiB is being used (and the same 75 objects for the pool):

@
$ sudo ./bin/rados df -p conv-bench
POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR USED COMPR UNDER COMPR
conv-bench 300 MiB 75 0 75 0 0 0 0 0 B 75 300 MiB 0 B 0 B
total_objects 99
total_used 31 GiB
total_avail 928 GiB
total_space 959 GiB

@
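
For what it is worth, the same numbers can be cross-checked against the PG map directly; if these also keep reporting 75 objects / 31 GiB after the purge, the problem is in the stats the crimson OSD reports rather than in how rados df presents them:

@
# per-pool and cluster usage as aggregated from PG stats
sudo ./bin/ceph df detail
sudo ./bin/ceph pg dump pools
@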

I see the same behavior with ZNS drives as well.

However, this is not the case with the legacy OSD using BlueStore. I tested with the legacy OSD as well, and there the cluster information showed the correct object count, and purge also reflected the correct object and storage information.
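
For comparison, the two vstart.sh setups were along these lines; the flag names here (--crimson, --bluestore) are assumptions and may differ by branch, so check ../src/vstart.sh --help:

@
# crimson-osd cluster (seastore); flags are illustrative
MON=1 OSD=1 MDS=1 ../src/vstart.sh -n --crimson

# classic ceph-osd cluster with bluestore, same rados bench workload
MON=1 OSD=1 MDS=1 ../src/vstart.sh -n --bluestore
@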
