Bug #12717
closed
pool's statistics not updated after a cache evict operation
Added by huang jun over 8 years ago.
Updated over 8 years ago.
Description
We set up a cache tier and set target_max_bytes to 10240.
After putting 2 objects (each 40MB) into the cache pool with 'rados put',
which triggers the cache pool's agent_evict operation,
'ceph df' and 'rados df' show the cache pool still has 2 objects,
but 'rados ls' shows there are no objects.
This is because in agent_maybe_evict() we create a repop without an op; in eval_repop(), repop->all_committed is 0, so publish_stats_to_osd() is never executed:
// ondisk?
if (repop->all_committed) {
  if (repop->ctx->op && !repop->log_op_stat) {
    log_op_stats(repop->ctx);
    repop->log_op_stat = true;
  }
  publish_stats_to_osd();
- Description updated (diff)
- Status changed from New to Fix Under Review
- Source changed from other to Community (user)
huangjun,
i am trying to reproduce your test:
$ ceph osd pool create slow 1 1
$ ceph osd pool set fast target_max_objects 1
$ ceph osd pool set fast target_max_bytes 20
$ ceph osd pool set fast hit_set_count 1
$ ls -lh /tmp/doc.tgz
-rw-r--r-- 1 kefu kefu 2.8M Aug 19 16:41 /tmp/doc.tgz
$ rados -p slow put obj3 /tmp/doc.tgz
$ rados -p slow put obj4 /tmp/doc.tgz
$ ./rados df
pool name        KB         objects  clones  degraded  unfound  rd  rd KB  wr   wr KB
cephfs_data      0          0        0       0         0        0   0      0    0
cephfs_metadata  5          54       0       0         0        0   0      117  17
fast             1          1        0       0         0        0   0      4    5578
rbd              0          0        0       0         0        0   0      0    0
slow             5577       2        0       0         0        0   0      2    5578
total used       137587004  57
total avail      409409908
total space      576342804
$ ./ceph df
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    549G      390G      131G         23.87
POOLS:
    NAME                ID     USED      %USED     MAX AVAIL     OBJECTS
    rbd                 0      0         0         130G          0
    cephfs_data         1      0         0         130G          0
    cephfs_metadata     2      4196      0         130G          54
    slow                3      5576k     0         130G          2
    fast                4      84        0         130G          1
$ ./rados -p fast ls
$ ./rados -p fast ls -N .ceph-internal
hit_set_4.0_archive_2015-08-19 11:09:38.600516Z_2015-08-19 11:09:48.705156Z
$ ./rados -p slow ls
obj4
obj3
so the only object left in the cache tier is the hitset archive object.
> It's because that in agent_maybe_evict(), we create a repop without op

i doubt it: _delete_oid(ctx, true) is called in this function, and it does create a stash/remove op. am i missing anything?
hi, kefu
My ceph version is 0.80.7, maybe a bit old;
I will try it on the newest ceph master branch.
I think you should put objects into pool 'fast', not 'slow', because 'fast' is a cache pool.
huangjun,
i think if we are using the "fast" pool as the cache tier of the "slow" pool, it is supposed to be a transparent layer over the "slow" one, hence we should put objects via the "slow" pool.
hmm, but agent_cache_evict and agent_cache_flush are only used in the cache pool; the base pool will not do this.
hi, kefu
I think https://github.com/ceph/ceph/commit/78b1fb5e9440dcd4ef99301c3ac857385e870cf3 has resolved this.
In version 0.80.7, if there is no op in the repop, eval_repop() does nothing, so there is no chance to call publish_stats_to_osd() to update the pg stats.
In the current master branch, once the repops are all_committed, publish_stats_to_osd() is called, so the pg stats will be updated in OSD::tick().
So this may be a duplicate bug.
What's your opinion?
- Assignee set to Kefu Chai
in the newest master branch, it works as expected:
the pool status is updated after the evict operation,
so i think you can close this issue.
sorry for the duplicate issue report.
- Status changed from Fix Under Review to Duplicate