Bug #14511

cache-flush-evict-all causes OSD stuck ops on unevictable objects

Added by Hector Martin over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
Start date: 01/26/2016
Due date:
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

Cluster is running Ceph 0.94.4.

Start with a pair of empty pools, one erasure, one replicated:

pool 3 'rbd-backup' erasure size 9 min_size 7 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 203 lfor 202 flags hashpspool stripe_width 4256
pool 5 'rbd-backup-cache' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 205 flags hashpspool stripe_width 0

Set up the tier:

$ ceph osd tier add rbd-backup rbd-backup-cache
$ ceph osd tier cache-mode rbd-backup-cache writeback
$ ceph osd tier set-overlay rbd-backup rbd-backup-cache

Create an empty object and give it an omap entry, so it can't be evicted to the erasure pool:

$ rados -p rbd-backup put test /dev/null
$ rados -p rbd-backup setomapval test test test

Try to flush the pool:

$ rados -p rbd-backup-cache cache-flush-evict-all
        test
failed to flush /test: (16) Device or resource busy
error from cache-flush-evict-all: (1) Operation not permitted

Accessing the object now hangs:

$ rados -p rbd-backup getomapval test test

The monitor reports blocked requests. Running cache-flush-evict-all again hangs too. Restarting the OSD that holds the object clears the hang.
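When reproducing this, it can help to detect the hang without tying up a terminal. The following is a minimal sketch, not part of the original report: the helper name probe is hypothetical, and it assumes the coreutils timeout command is available (it exits with status 124 when the time limit is reached).

```shell
#!/bin/sh
# Hypothetical helper (not from the report): run a command under a time
# limit and classify the outcome.
# Usage: probe SECONDS COMMAND [ARGS...]
probe() {
    limit=$1
    shift
    # Discard the command's own output; only the classification is printed.
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 124 ]; then
        # coreutils timeout uses 124 to signal that the limit was hit
        echo "hung"
    elif [ "$rc" -eq 0 ]; then
        echo "ok"
    else
        echo "failed($rc)"
    fi
}
```

For example, probe 30 rados -p rbd-backup getomapval test test should print "hung" on a cluster in the state described above, assuming the request stays blocked for longer than 30 seconds.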

I would expect cache-flush-evict-all to evict all evictable objects and complain about those that aren't, but not leave them in limbo. Presumably something is left locked waiting for the objects to be evicted, even though that is impossible.
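The expectation above (evict what can be evicted, complain about the rest, keep going) can be sketched as a per-object loop over the rados cache-flush and cache-evict subcommands. This is only an illustration of the desired reporting behaviour: the function name flush_evict_each is hypothetical, and since the stuck ops originate on the OSD side, it should not be read as a workaround for the hang.

```shell
#!/bin/sh
# Hypothetical sketch (not from the report): flush and evict each object in
# a cache pool individually, report the ones that fail, and keep going.
# Usage: flush_evict_each CACHE_POOL
flush_evict_each() {
    pool=$1
    failed=0
    # List the objects first; give up early if the listing itself fails.
    list=$(rados -p "$pool" ls) || return 2
    while IFS= read -r obj; do
        # An empty listing still yields one empty line; skip it.
        [ -n "$obj" ] || continue
        if rados -p "$pool" cache-flush "$obj" </dev/null &&
           rados -p "$pool" cache-evict "$obj" </dev/null; then
            echo "evicted: $obj"
        else
            echo "skipped: $obj"
            failed=$((failed + 1))
        fi
    done <<EOF
$list
EOF
    echo "unevictable objects: $failed"
    # Succeed only if every object was flushed and evicted.
    [ "$failed" -eq 0 ]
}
```

Called as flush_evict_each rbd-backup-cache, it returns non-zero if any object could not be flushed or evicted, and the "skipped" lines name the objects involved.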

History

#1 Updated by Sage Weil over 3 years ago

  • Priority changed from Normal to Urgent
  • Source changed from other to Community (user)

Sounds like we didn't remove the object from one of the blocked maps.

#3 Updated by Sage Weil over 3 years ago

  • Status changed from New to Testing

#4 Updated by Sage Weil over 3 years ago

  • Status changed from Testing to Resolved
