Bug #19512

Sparse file info in filestore not propagated to other OSDs

Added by Piotr Dalek almost 7 years ago. Updated over 4 years ago.

Status: Won't Fix
Priority: High
Assignee: -
Category: Performance/Resource Usage
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We recently hit an interesting issue with RBD images and filestore on Jewel 10.2.5:
We have a pool with RBD images, most of them largely untouched (large areas of those images were never written). Once we added 3 new OSDs to the cluster, the objects representing these images grew substantially on the new OSDs: objects backing unused areas of the images remained small on the original OSDs (~8K of space actually used out of 4M allocated), but on the new OSDs they were fully allocated (4M allocated and actually used). After investigating, we concluded that Ceph does not propagate sparse file information during cluster rebalance; the data contents are correct on all OSDs, but the holes are not preserved on the new OSDs, hence the increase in disk space usage there.
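For reference, the sketch below shows what preserving sparseness during an object copy looks like at the filesystem level. It is not Ceph code, just a minimal illustration (assuming Linux and a filesystem with SEEK_DATA/SEEK_HOLE support) of copying only the data extents so that holes in the source stay holes in the destination; without this kind of hole detection (see comment #2 below about enabling FIEMAP/SEEK_HOLE), a plain sequential read/write materialises the holes as allocated zeroes, which matches what we observed on the new OSDs.

/* Minimal sparse-aware copy: copies only the data extents of src to dst,
 * leaving the holes unallocated.  Illustrative only -- not Ceph code. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int copy_sparse(int in, int out)
{
    off_t end  = lseek(in, 0, SEEK_END);
    off_t data = lseek(in, 0, SEEK_DATA);        /* start of first data extent */
    char buf[1 << 16];

    while (data >= 0 && data < end) {
        off_t hole = lseek(in, data, SEEK_HOLE); /* end of this data extent */
        for (off_t pos = data; pos < hole; ) {
            size_t want = (size_t)(hole - pos) < sizeof(buf)
                              ? (size_t)(hole - pos) : sizeof(buf);
            ssize_t n = pread(in, buf, want, pos);
            /* writing at the same offset keeps the preceding hole a hole */
            if (n <= 0 || pwrite(out, buf, n, pos) != n)
                return -1;
            pos += n;
        }
        data = lseek(in, hole, SEEK_DATA);       /* next extent, or -1 (ENXIO) */
    }
    return ftruncate(out, end);                  /* preserve the logical size */
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s <src> <dst>\n", argv[0]);
        return 1;
    }
    int in  = open(argv[1], O_RDONLY);
    int out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (in < 0 || out < 0 || copy_sparse(in, out) != 0) {
        perror("copy_sparse");
        return 1;
    }
    return 0;
}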

Example on test cluster, before growing it by one OSD:

ls:

osd-01-cluster: -rw-r--r-- 1 root root 4194304 Apr 6 09:18 /var/lib/ceph/osd-01-cluster/current/0.27_head/rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0
osd-02-cluster: -rw-r--r-- 1 root root 4194304 Apr 6 09:18 /var/lib/ceph/osd-02-cluster/current/0.27_head/rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0
osd-03-cluster: -rw-r--r-- 1 root root 4194304 Apr 6 09:18 /var/lib/ceph/osd-03-cluster/current/0.27_head/rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0

du:

osd-01-cluster: 12 /var/lib/ceph/osd-01-cluster/current/0.27_head/rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0
osd-02-cluster: 12 /var/lib/ceph/osd-02-cluster/current/0.27_head/rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0
osd-03-cluster: 12 /var/lib/ceph/osd-03-cluster/current/0.27_head/rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0

mon-01-cluster:~ # rbd diff test
Offset Length Type
8388608 4194304 data
16777216 4096 data
33554432 4194304 data
37748736 2048 data

And after growing it:

ls:

clush> find /var/lib/ceph/osd-*/current/0.*head/ -type f -name '*data*' -exec ls -l {} \+
osd-02-cluster: -rw-r--r-- 1 root root 4194304 Apr 6 09:18 /var/lib/ceph/osd-02-cluster/current/0.27_head/rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0
osd-03-cluster: -rw-r--r-- 1 root root 4194304 Apr 6 09:18 /var/lib/ceph/osd-03-cluster/current/0.27_head/rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0
osd-04-cluster: -rw-r--r-- 1 root root 4194304 Apr 6 09:25 /var/lib/ceph/osd-04-cluster/current/0.27_head/rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0

du:

clush> find /var/lib/ceph/osd-*/current/0.*head/ -type f -name '*data*' -exec du -k {} \+
osd-02-cluster: 12 /var/lib/ceph/osd-02-cluster/current/0.27_head/rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0
osd-03-cluster: 12 /var/lib/ceph/osd-03-cluster/current/0.27_head/rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0
osd-04-cluster: 4100 /var/lib/ceph/osd-04-cluster/current/0.27_head/rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0

Note that "rbd\udata.12a474b0dc51.0000000000000008__head_2DD64767__0" grew from 12 KB to 4100 KB when it was copied from the other OSDs to osd-04.
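A quick way to spot such de-sparsified objects from a script is to compare each file's apparent size against its allocated blocks, i.e. the programmatic equivalent of the ls/du comparison above. This is a hypothetical helper for illustration, not part of Ceph:

/* Print apparent vs. allocated size for each path given on the command line.
 * Illustrative helper only -- equivalent to comparing ls -l with du. */
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        struct stat st;
        if (stat(argv[i], &st) != 0) {
            perror(argv[i]);
            continue;
        }
        /* st_blocks is counted in 512-byte units on Linux */
        printf("%s: apparent=%lld allocated=%lld\n", argv[i],
               (long long)st.st_size, (long long)st.st_blocks * 512LL);
    }
    return 0;
}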

History

#1 Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category changed from OSD to Performance/Resource Usage
  • Priority changed from Urgent to High
  • Component(RADOS) OSD added

#2 Updated by Piotr Dalek over 6 years ago

Enabled FIEMAP/SEEK_HOLE in QA here: https://github.com/ceph/ceph/pull/15939

#3 Updated by Josh Durgin over 4 years ago

  • Status changed from New to Won't Fix

If this is still an issue in bluestore, let's fix it there.
