Bug #8935
operations not idempotent when enabling cache
Status: Closed
Description
consider:
- no cache
- send delete to base tier
- base does delete, replies
- mon set overlay
- client resends delete to cache
- (client discards original delete reply)
- cache promotes whiteout (object doesn't exist any more)
- cache replies with -ENOENT
I'm not quite sure how this should work. Perhaps this is part of the more general problem of using the pg log to enforce idempotency. Perhaps, instead, it should be a shorter per-object history that gets promoted. Even that, though, doesn't handle the deletion case...
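The race above can be sketched as a minimal model (hypothetical Python, not Ceph code; class and method names are illustrative). With no per-object request history carried across the tier change, the cache tier cannot recognize the resent delete as a duplicate, so it answers -ENOENT instead of replaying the original success:

```python
# Minimal model of the race: client's first delete succeeds on the base
# tier but the reply is lost; after the overlay is set, the resend hits
# the cache tier, which promotes a whiteout and wrongly returns -ENOENT.

ENOENT = -2

class BaseTier:
    def __init__(self):
        self.objects = {"foo"}

    def delete(self, obj):
        if obj in self.objects:
            self.objects.discard(obj)
            return 0          # first delete succeeds; reply is then lost
        return ENOENT

class CacheTier:
    def __init__(self, base):
        self.base = base
        self.whiteouts = set()

    def delete(self, obj):
        # Promote: the object no longer exists in the base tier, so the
        # cache records a whiteout. It has no reqid history for the
        # object, so it cannot tell this is a resend of a completed op.
        if obj not in self.base.objects:
            self.whiteouts.add(obj)
            return ENOENT
        self.base.objects.discard(obj)
        return 0

base = BaseTier()
assert base.delete("foo") == 0        # reply discarded by the client
cache = CacheTier(base)               # mon sets the overlay
assert cache.delete("foo") == ENOENT  # resend not recognized as a dup
```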
Here is the job that reproduces this (hit on just 1 out of 10 attempts):
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - slow request
- exec:
    client.0:
    - ceph osd pool create base 4
    - ceph osd pool create cache 4
    - ceph osd tier add base cache
    - ceph osd tier cache-mode cache writeback
    - ceph osd tier set-overlay base cache
    - ceph osd pool set cache hit_set_type bloom
    - ceph osd pool set cache hit_set_count 8
    - ceph osd pool set cache hit_set_period 60
    - ceph osd pool set cache target_max_objects 500
- background_exec:
    mon.a:
    - while true
    - do sleep 30
    - echo forward
    - ceph osd tier cache-mode cache forward
    - sleep 10
    - ceph osd pool set cache cache_target_full_ratio .001
    - echo cache-try-flush-evict-all
    - rados -p cache cache-try-flush-evict-all
    - sleep 5
    - echo cache-flush-evict-all
    - rados -p cache cache-flush-evict-all
    - sleep 5
    - echo remove overlay
    - ceph osd tier remove-overlay base
    - sleep 20
    - echo add writeback overlay
    - ceph osd tier cache-mode cache writeback
    - ceph osd pool set cache cache_target_full_ratio .8
    - ceph osd tier set-overlay base cache
    - done
- rados:
    clients:
    - client.0
    max_seconds: 600
    objects: 10000
    op_weights:
      copy_from: 50
      delete: 50
      read: 100
      write: 100
    ops: 400000
    pools:
    - base
    size: 1024
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 500
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
        osd op thread timeout: 60
        osd sloppy crc: true
    fs: btrfs
    log-whitelist:
    - slow request
  install:
    ceph:
      branch: wip-8931
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - osd.3
  - osd.4
  - osd.5
  - client.0
Updated by Sage Weil almost 10 years ago
If we have both the pg log and an op list in object_info_t, then we can have a rados op that returns the 'recent reqids' for an object based on the combination of both. We could do that during promote, so that the cache layer is 'primed' with the same history for that object (with the pg log deletes moved into the object_info_t reqids for the whiteout). That op could be combined with the copy-get op used during promote, or just shoved into the same rados request.
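A rough sketch of that proposal (hypothetical Python, not Ceph's actual types or RADOS ops; names like ObjectInfo and deleted are illustrative): the base tier folds the pg-log delete into a short per-object reqid list, the promote pulls that list along with the whiteout, and the cache tier can then recognize the resent delete as a duplicate and replay the original success:

```python
# Sketch: carry a per-object reqid history across promote so a resent,
# already-completed delete is detected as a dup instead of failing.

ENOENT = -2

class ObjectInfo:
    """Stands in for object_info_t with a short per-object reqid list."""
    def __init__(self):
        self.reqids = {}      # reqid -> result code of the completed op

class BaseTier:
    def __init__(self):
        self.objects = {"foo": ObjectInfo()}
        self.deleted = {}     # history retained after delete (from pg log)

    def delete(self, obj, reqid):
        if obj in self.objects:
            oi = self.objects.pop(obj)
            oi.reqids[reqid] = 0
            self.deleted[obj] = oi   # pg-log delete folded into reqids
            return 0
        return ENOENT

class CacheTier:
    def __init__(self, base):
        self.base = base
        self.whiteouts = {}

    def delete(self, obj, reqid):
        if obj not in self.whiteouts:
            # Promote: fetch the reqid history along with the whiteout,
            # as the comment above suggests doing alongside copy-get.
            self.whiteouts[obj] = self.base.deleted.get(obj, ObjectInfo())
        oi = self.whiteouts[obj]
        if reqid in oi.reqids:
            return oi.reqids[reqid]  # duplicate: replay original reply
        oi.reqids[reqid] = ENOENT
        return ENOENT

base = BaseTier()
assert base.delete("foo", reqid=1) == 0   # reply lost by the client
cache = CacheTier(base)                   # mon sets the overlay
assert cache.delete("foo", reqid=1) == 0  # resend detected, success replayed
```

The key difference from the failing case is that the whiteout is "primed" with the delete's reqid, so the resend returns the original result rather than -ENOENT.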
Updated by Greg Farnum almost 10 years ago
I think you're right that a per-object log would be needed to solve this problem — and I think that means we shouldn't even try. Right now it would explode the OSD's memory use, and (besides us knowing about this sort of cache-change inconsistency issue when we set it up) users should not be cycling cache modes.
Updated by Sage Weil almost 10 years ago
sage-2014-08-09_14:13:44-rados-next-testing-basic-multi/410527 and 410528
Updated by Samuel Just over 9 years ago
ubuntu@teuthology:/a/samuelj-2014-12-05_23:56:18-rados-wip-sam-firefly-testing-wip-testing-vanilla-fixes-basic-multi/639049/remote