Feature #8230

open

mds: new requests are not throttled by backend rados progress

Added by Alexandre Oliva about 10 years ago. Updated almost 8 years ago.

Status: New
Priority: High
Assignee: -
Category: Introspection/Control
Target version: -
% Done: 0%
Source: Community (dev)
Tags:
Backport:
Reviewed:
Affected Versions:
Component(FS):
Labels (FS):
Pull request ID:

Description

The use of the ceph.parent attribute as actual cephfs metadata, including its use for recovery and hard links, has turned what was just a pain point into a reason for real concern.

The pain point is that the batchy use of ceph.parent by the mds, both when creating lots of files and when recovering the filesystem, slows the cluster down significantly. It is very common for me, when exploding a tarball with lots of small files or after an mds start, to see my data osds reporting long lists of slow requests. When file creation is sustained, it is very common for such osds to end up hitting a suicide timeout and dying, slowing things down further. At first I thought this had to do with inefficiencies in btrfs's xattr handling, but stopping ceph from using xattrs, so that ceph.parent was stored in leveldb instead, didn't fare any better.

The reason for concern is that, when such failures occur, the replication count of the ceph.parent metadata attributes stored in data pools drops below the replication level I set for metadata. I keep metadata on smaller, faster disks at higher replication counts, but holding ceph.parent in data pools defeats both intents: higher reliability and faster access.
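For context, the backtrace at issue can be inspected directly on the data pool with the rados and ceph-dencoder tools. A minimal sketch, assuming a data pool named "data" and a hypothetical inode number 10000000000 (the pool and object names here are placeholders, not taken from this report):

```shell
# CephFS stores the backtrace as a "parent" xattr on the file's first
# data object, named <inode-hex>.00000000 in the data pool.
rados -p data listxattr 10000000000.00000000

# Fetch the raw backtrace and decode it to JSON to see the ancestry
# (and which pool is actually holding this metadata):
rados -p data getxattr 10000000000.00000000 parent > backtrace.bin
ceph-dencoder type inode_backtrace_t import backtrace.bin decode dump_json
```

These commands require a live cluster and the matching object name, so the inode shown is purely illustrative.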

Have you considered storing the ceph.parent attributes of data objects in the metadata pool instead, perhaps under the same object id used for the data, either as an attribute or as the object data proper?

Considering that xattrs are metadata, it might make sense for all other xattrs to be held there, too.
