Bug #8935
operations not idempotent when enabling cache
Status: Closed
Description
consider:
- no cache
- send delete to base tier
- base does delete, replies
- mon set overlay
- client resends delete to cache
- (client discards original delete reply)
- cache promotes whiteout (object doesn't exist any more)
- cache replies with -ENOENT
I'm not quite sure how this should work. Perhaps this is part of the more general problem of using the pg log to enforce idempotency. Perhaps, instead, it should be a shorter per-object history that gets promoted. Even that, though, doesn't handle the deletion case...
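The race above can be sketched as a minimal model (hypothetical Python, not Ceph code; class and method names are illustrative). With no per-object request history carried across the tier change, the cache tier cannot recognize the resent delete as a duplicate, so it answers -ENOENT instead of replaying the original success:

```python
# Minimal model of the race: client's first delete succeeds on the base
# tier but the reply is lost; after the overlay is set, the resend hits
# the cache tier, which promotes a whiteout and wrongly returns -ENOENT.

ENOENT = -2

class BaseTier:
    def __init__(self):
        self.objects = {"foo"}

    def delete(self, obj):
        if obj in self.objects:
            self.objects.discard(obj)
            return 0          # first delete succeeds; reply is then lost
        return ENOENT

class CacheTier:
    def __init__(self, base):
        self.base = base
        self.whiteouts = set()

    def delete(self, obj):
        # Promote: the object no longer exists in the base tier, so the
        # cache records a whiteout. It has no reqid history for the
        # object, so it cannot tell this is a resend of a completed op.
        if obj not in self.base.objects:
            self.whiteouts.add(obj)
            return ENOENT
        self.base.objects.discard(obj)
        return 0

base = BaseTier()
assert base.delete("foo") == 0        # reply discarded by the client
cache = CacheTier(base)               # mon sets the overlay
assert cache.delete("foo") == ENOENT  # resend not recognized as a dup
```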
Here is the job that reproduces this (hit on just 1 out of 10 attempts):
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - slow request
- exec:
    client.0:
    - ceph osd pool create base 4
    - ceph osd pool create cache 4
    - ceph osd tier add base cache
    - ceph osd tier cache-mode cache writeback
    - ceph osd tier set-overlay base cache
    - ceph osd pool set cache hit_set_type bloom
    - ceph osd pool set cache hit_set_count 8
    - ceph osd pool set cache hit_set_period 60
    - ceph osd pool set cache target_max_objects 500
- background_exec:
    mon.a:
    - while true
    - do sleep 30
    - echo forward
    - ceph osd tier cache-mode cache forward
    - sleep 10
    - ceph osd pool set cache cache_target_full_ratio .001
    - echo cache-try-flush-evict-all
    - rados -p cache cache-try-flush-evict-all
    - sleep 5
    - echo cache-flush-evict-all
    - rados -p cache cache-flush-evict-all
    - sleep 5
    - echo remove overlay
    - ceph osd tier remove-overlay base
    - sleep 20
    - echo add writeback overlay
    - ceph osd tier cache-mode cache writeback
    - ceph osd pool set cache cache_target_full_ratio .8
    - ceph osd tier set-overlay base cache
    - done
- rados:
    clients:
    - client.0
    max_seconds: 600
    objects: 10000
    op_weights:
      copy_from: 50
      delete: 50
      read: 100
      write: 100
    ops: 400000
    pools:
    - base
    size: 1024
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 500
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
        osd op thread timeout: 60
        osd sloppy crc: true
    fs: btrfs
    log-whitelist:
    - slow request
  install:
    ceph:
      branch: wip-8931
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - osd.3
  - osd.4
  - osd.5
  - client.0
Updated by Sage Weil almost 10 years ago
If we have both the pg log and an op list in object_info_t, then we can have a rados op that returns the 'recent reqids' for an object based on the combination of both. We could do that during promote, so that the cache layer is 'primed' with the same history for that object (with the pg log deletes moved into the object_info_t reqids for the whiteout). That op could be combined with the copy-get op used during promote, or just shoved into the same rados request.
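A rough sketch of that proposal (hypothetical Python, not Ceph's actual types or RADOS ops; names like ObjectInfo and deleted are illustrative): the base tier folds the pg-log delete into a short per-object reqid list, the promote pulls that list along with the whiteout, and the cache tier can then recognize the resent delete as a duplicate and replay the original success:

```python
# Sketch: carry a per-object reqid history across promote so a resent,
# already-completed delete is detected as a dup instead of failing.

ENOENT = -2

class ObjectInfo:
    """Stands in for object_info_t with a short per-object reqid list."""
    def __init__(self):
        self.reqids = {}      # reqid -> result code of the completed op

class BaseTier:
    def __init__(self):
        self.objects = {"foo": ObjectInfo()}
        self.deleted = {}     # history retained after delete (from pg log)

    def delete(self, obj, reqid):
        if obj in self.objects:
            oi = self.objects.pop(obj)
            oi.reqids[reqid] = 0
            self.deleted[obj] = oi   # pg-log delete folded into reqids
            return 0
        return ENOENT

class CacheTier:
    def __init__(self, base):
        self.base = base
        self.whiteouts = {}

    def delete(self, obj, reqid):
        if obj not in self.whiteouts:
            # Promote: fetch the reqid history along with the whiteout,
            # as the comment above suggests doing alongside copy-get.
            self.whiteouts[obj] = self.base.deleted.get(obj, ObjectInfo())
        oi = self.whiteouts[obj]
        if reqid in oi.reqids:
            return oi.reqids[reqid]  # duplicate: replay original reply
        oi.reqids[reqid] = ENOENT
        return ENOENT

base = BaseTier()
assert base.delete("foo", reqid=1) == 0   # reply lost by the client
cache = CacheTier(base)                   # mon sets the overlay
assert cache.delete("foo", reqid=1) == 0  # resend detected, success replayed
```

The key difference from the failing case is that the whiteout is "primed" with the delete's reqid, so the resend returns the original result rather than -ENOENT.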
Updated by Greg Farnum almost 10 years ago
I think you're right that a per-object log would be needed to solve this problem — and I think that means we shouldn't even try. Right now it would explode the OSD's memory use, and (besides us knowing about this sort of cache-change inconsistency issue when we set it up) users should not be cycling cache modes.
Updated by Sage Weil almost 10 years ago
sage-2014-08-09_14:13:44-rados-next-testing-basic-multi/410527 and 410528
Updated by Samuel Just over 9 years ago
ubuntu@teuthology:/a/samuelj-2014-12-05_23:56:18-rados-wip-sam-firefly-testing-wip-testing-vanilla-fixes-basic-multi/639049/remote