Bug #8182

closed

After rados bench on tiered pool can't remove objects

Added by David Zafman about 10 years ago. Updated about 10 years ago.

Status:
Rejected
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
other
Severity:
3 - minor

Description

I'm working from the firefly branch, with changes to the tiering agent code that shouldn't affect this test.

$ ./rados -p cache bench 10 write --no-cleanup
...
$ ./rados -p cache rm benchmark_data_ubuntu-dzvm_40166_object0
error removing cache/benchmark_data_ubuntu-dzvm_40166_object0: (2) No such file or directory

inline int ReplicatedPG::_delete_oid(OpContext *ctx, bool no_whiteout) {
  ....
  if (!obs.exists || (obs.oi.is_whiteout() && !no_whiteout))
    return -ENOENT;

Here obs.exists is true, obs.oi.is_whiteout() is true, and no_whiteout is false, so the function returned -ENOENT.

The file exists in the cache tier which is pool 3:

dev/osd0/current/3.3_head/benchmark\udata\uubuntu-dzvm\u40166\uobject0__head_51EAEA03__3

Details:


Breakpoint 1, ReplicatedPG::_delete_oid (this=0x33cc000, ctx=0x60d4600, no_whiteout=false) at osd/ReplicatedPG.cc:4550
4550    {
(gdb) n
4557      if (!obs.exists || (obs.oi.is_whiteout() && !no_whiteout))
(gdb) n
4550    {
(gdb) n
4555      PGBackend::PGTransaction* t = ctx->op_t;
(gdb) n
4558        return -ENOENT;
(gdb) p obs
$1 = (ObjectState &) @0x60d4648: {oi = {soid = {oid = {name = "benchmark_data_ubuntu-dzvm_40166_object0"}, snap = {val = 18446744073709551614}, hash = 1374349827, max = false, static POOL_IS_TEMP = -1, pool = 3, nspace =
    "", key = ""}, category = "", version = {version = 263, epoch = 28, __pad = 0}, prior_version = {version = 0, epoch = 0, __pad = 0}, user_version = 0, last_reqid = {name = {_type = 4 '\004', _num = 0,
        static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static NEW = -1}, tid = 1188, inc = 0}, size = 0, mtime = {tv = {tv_sec = 1398217037, tv_nsec = 827721000}},
    flags = object_info_t::FLAG_WHITEOUT, wrlock_by = {name = {_type = 0 '\000', _num = 0, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static NEW = -1}, tid = 0, inc = 0},
    snaps = std::vector of length 0, capacity 0, truncate_seq = 0, truncate_size = 0, watchers = std::map with 0 elements}, exists = true}
(gdb) p no_whiteout
$2 = false

Actions #1

Updated by David Zafman about 10 years ago

My trace was captured on a subsequent removal attempt, which isn't clear from the description. Now that I look at the code, the first pass through _delete_oid() may have gotten further and done this:

  // cache: writeback: set whiteout on delete?
  if (pool.info.cache_mode == pg_pool_t::CACHEMODE_WRITEBACK && !no_whiteout) {
    dout(20) << __func__ << " setting whiteout on " << soid << dendl;
    oi.set_flag(object_info_t::FLAG_WHITEOUT);
    ctx->delta_stats.num_whiteouts++;
    t->touch(soid);
    osd->logger->inc(l_osd_tier_whiteout);
    return 0;
  }
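
The interaction between the two excerpts can be sketched with a small standalone model. This is only an illustration, not the Ceph implementation: ObjectState here is a reduced stand-in, and CacheMode and delete_oid are hypothetical names that model just the guard and the writeback whiteout branch quoted in this report. Everything else _delete_oid() does is assumed away.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical, simplified stand-ins for the Ceph types named in this report.
enum class CacheMode { None, Writeback };

struct ObjectState {
  bool exists = false;
  bool whiteout = false;  // models object_info_t::FLAG_WHITEOUT
};

// Models only the two branches of ReplicatedPG::_delete_oid() quoted above.
inline int delete_oid(ObjectState &obs, CacheMode mode, bool no_whiteout) {
  // Guard: a missing object, or a whiteout the caller did not ask to
  // bypass, is reported as absent.
  if (!obs.exists || (obs.whiteout && !no_whiteout))
    return -ENOENT;

  // Writeback cache tier: record the delete as a whiteout instead of
  // removing the object.
  if (mode == CacheMode::Writeback && !no_whiteout) {
    obs.whiteout = true;
    return 0;
  }

  // Assumed simplification: otherwise the object is simply gone.
  obs.exists = false;
  obs.whiteout = false;
  return 0;
}
```

Under this model, the first delete of an existing object in a writeback cache returns 0 and leaves a whiteout with exists still true. A second delete with no_whiteout == false then hits the guard and returns -ENOENT, which would be consistent with the "(2) No such file or directory" error from rados rm and with the object file still being present on disk.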
Actions #2

Updated by David Zafman about 10 years ago

  • Status changed from New to Rejected