Bug #14511

cache-flush-evict-all causes OSD stuck ops on unevictable objects

Added by Hector Martin about 8 years ago. Updated about 8 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Cluster is running Ceph 0.94.4.

Start with a pair of empty pools, one erasure-coded, one replicated:

pool 3 'rbd-backup' erasure size 9 min_size 7 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 203 lfor 202 flags hashpspool stripe_width 4256
pool 5 'rbd-backup-cache' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 205 flags hashpspool stripe_width 0

Set up the tier:

$ ceph osd tier add rbd-backup rbd-backup-cache
$ ceph osd tier cache-mode rbd-backup-cache writeback
$ ceph osd tier set-overlay rbd-backup rbd-backup-cache

Create an empty object and give it an omap entry, so it can't be flushed and evicted to the erasure-coded pool (erasure-coded pools do not support omap):

$ rados -p rbd-backup put test /dev/null
$ rados -p rbd-backup setomapval test test test

Try to flush the pool:

$ rados -p rbd-backup-cache cache-flush-evict-all
        test
failed to flush /test: (16) Device or resource busy
error from cache-flush-evict-all: (1) Operation not permitted

Accessing the object now hangs:

$ rados -p rbd-backup getomapval test test

Mon reports blocked requests. Trying cache-flush-evict-all again hangs too. Restarting the OSD holding the object fixes the hang.
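
When reproducing this, it can help to bound the hanging command so the shell is not left stuck. The following is a minimal sketch, not part of the original report: check_hang is a hypothetical helper name, and it only relies on the coreutils timeout command, which exits with status 124 when the time limit is hit.

```shell
# Hypothetical helper (not from the report): run a command with a time
# limit and say whether it hung or returned.
# Usage: check_hang SECONDS COMMAND [ARGS...]
check_hang() {
    secs=$1
    shift
    # coreutils timeout exits with 124 if the command had to be killed
    timeout "$secs" "$@" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "HUNG"
    else
        echo "returned $rc"
    fi
}
```

For example, after the failed cache-flush-evict-all above, "check_hang 30 rados -p rbd-backup getomapval test test" would be expected to print HUNG on an affected cluster. The 30-second limit is an arbitrary assumption.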

I would expect cache-flush-evict-all to evict all evictable objects and complain about those that aren't, but not leave them in limbo. Presumably something is left locked waiting for the objects to be evicted, even though that is impossible.
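
The behaviour I would expect can be sketched from the client side as a loop that attempts each object in turn, reports the ones that fail, and carries on. This is only an illustration of the expected semantics under stated assumptions, not how cache-flush-evict-all is implemented; try_each is a hypothetical helper name.

```shell
# Hypothetical helper (illustration only): read object names on stdin,
# run the given command once per object with the name appended, report
# failures on stderr, and keep going. Prints the failure count on stdout.
try_each() {
    failed=0
    while IFS= read -r obj; do
        if ! "$@" "$obj" >/dev/null 2>&1; then
            echo "could not process: $obj" >&2
            failed=$((failed + 1))
        fi
    done
    echo "$failed"
}
```

It could be driven with something like "rados -p rbd-backup-cache ls | try_each timeout 30 rados -p rbd-backup-cache cache-flush", followed by a second pass using cache-evict. On an affected OSD, the per-object commands may well hit the same stuck state, hence the timeout in that example.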

History

#1 Updated by Sage Weil about 8 years ago

  • Priority changed from Normal to Urgent
  • Source changed from other to Community (user)

Sounds like we didn't remove the object from one of the blocked maps.

#3 Updated by Sage Weil about 8 years ago

  • Status changed from New to 7

#4 Updated by Sage Weil about 8 years ago

  • Status changed from 7 to Resolved
