Bug #8641
Cache tiering agent cannot flush or evict objects during the benchmark
Status: Closed
Description
I set target_max_objects to 1000, but the cache tier does not evict objects while the workload is being created. Eviction does not even start after the benchmark is finished. It only starts when I execute "ceph osd pool set hot-storage target_max_objects 1000" again, either during the test (which causes an OSD to go down, so I have to restart it) or after the benchmark.
Updated by Samuel Just almost 10 years ago
- Status changed from New to Need More Info
How did you initially set target_max_objects? Please provide more detail.
Updated by Sherry Shahbazi almost 10 years ago
Samuel Just wrote:
How did you initially set target_max_objects? Please provide more detail.
The attachment is my CRUSH map. ceph osd lspools shows:
0 metadata, 1 tier1-cache, 2 tier1
Then I set the ruleset for each pool as follows:
ceph osd pool set metadata crush_ruleset 0
ceph osd pool set tier1-cache crush_ruleset 1
ceph osd pool set tier1 crush_ruleset 2
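(As a sanity check, the assignments can be read back afterwards; a sketch assuming the pool names above:)
ceph osd pool get metadata crush_ruleset
ceph osd pool get tier1-cache crush_ruleset
ceph osd pool get tier1 crush_ruleset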
After that, I did the following steps to add tiering:
1) ceph osd tier add tier1 tier1-cache
2) ceph osd tier cache-mode tier1-cache writeback
3) ceph osd tier set-overlay tier1 tier1-cache
4) ceph osd pool set tier1-cache target_max_objects 1000
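For reference, the resulting tier relationship and target can be checked with something like:
ceph osd dump | grep tier
ceph osd pool get tier1-cache target_max_objects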
Then I set up CephFS with the kernel client:
ceph mds newfs 0 2 --yes-i-really-mean-it
sudo mkdir /mnt/oruafs
sudo mount -t ceph ceph-mon1:6789,ceph-mon2:6789,ceph-mon3:6789:/ /mnt/oruafs -o name=admin
sudo mkdir /mnt/oruafs/tier1
cephfs /mnt/oruafs/tier1 set_layout -p 2
sudo mount -t ceph ceph-mon1:6789,ceph-mon2:6789,ceph-mon3:6789:/tier1 /mnt/oruafs/tier1 -o name=admin
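(The layout assignment can be double-checked with the same cephfs tool; a sketch assuming the mount above:)
cephfs /mnt/oruafs/tier1 show_layout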
Once I started the test, I noticed that all the objects stayed in tier1-cache and were not evicted. tier1-cache should start evicting objects once it reaches 1000 objects!
Updated by David Zafman almost 10 years ago
When I was experimenting with tiering during development, I ran into this issue when the value of target_max_objects was smaller than the total number of PGs in the pool. The algorithm won't work in that case. So say you have 1024 PGs in the pool; I wouldn't set target_max_objects much lower than 10 times that, i.e. 10240. With target_max_objects of 10240, each PG attempts to hold 10 objects. This gives you reasonable granularity, in 10% increments, in terms of osd_agent_min_evict_effort.
In a production environment this shouldn't be a problem since the maximum objects wouldn't be set that low.
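To make that rule of thumb concrete, a sketch using the example numbers above and a hypothetical pool name:
ceph osd pool get cachepool pg_num                       # suppose this reports pg_num: 1024
ceph osd pool set cachepool target_max_objects 10240     # at least ~10 objects per PG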
Updated by Sherry Shahbazi almost 10 years ago
David Zafman wrote:
When I was experimenting with tiering during development, I ran into this issue when the value of target_max_objects was smaller than the total number of PGs in the pool. The algorithm won't work in that case. So say you have 1024 PGs in the pool; I wouldn't set target_max_objects much lower than 10 times that, i.e. 10240. With target_max_objects of 10240, each PG attempts to hold 10 objects. This gives you reasonable granularity, in 10% increments, in terms of osd_agent_min_evict_effort.
In a production environment this shouldn't be a problem since the maximum objects wouldn't be set that low.
I have only 128 PGs in the tier1-cache pool. Based on what you are saying, setting target_max_objects to 10 times 128, i.e. 1280 or higher, should work. I should add that I first set target_max_objects to 100,000 and then to 10,000, and neither worked. I had also already set the other cache-tiering parameters, e.g. cache_target_full_ratio to 0.8. But eventually my OSDs in the cache tier pool became full and went down, as the tiering agent could not flush objects.
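For reference, the settings described above correspond to commands of the form:
ceph osd pool set tier1-cache target_max_objects 1280
ceph osd pool set tier1-cache cache_target_full_ratio 0.8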
Updated by Samuel Just almost 10 years ago
I think you need add-cache rather than set-overlay.
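(For reference, add-cache combines the tier-add and sizing steps; a sketch with the pool names from this report and a hypothetical cache size in bytes:)
ceph osd tier add-cache tier1 tier1-cache 1099511627776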
Updated by Samuel Just almost 10 years ago
Where in the docs did you see that bit?
Updated by Sherry Shahbazi almost 10 years ago
Samuel Just wrote:
I think you need add-cache rather than set-overlay.
According to the following link, I need to set-overlay when the cache mode is writeback:
http://ceph.com/docs/master/rados/operations/cache-tiering/#creating-a-cache-tier
Updated by Sherry Shahbazi almost 10 years ago
Samuel Just wrote:
Where in the docs did you see that bit?
I also followed what Greg told me in his reply to my email related to CephFS:
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg10764.html
But I think that when the objects come from CephFS, RADOS is not able to handle them.
Updated by Sherry Shahbazi almost 10 years ago
Samuel Just wrote:
What kernel version are you using?
It's 3.14, as Yan Zheng suggested, since I couldn't mount CephFS with kernel 3.12.
Updated by Szymon Zacher over 9 years ago
In my opinion, the problem also affects cache_min_evict_age, cache_min_flush_age, and other parameters. It's impossible to force the Ceph cache to flush or evict objects regularly.
ceph version 0.80.4 (7c241cfaa6c8c068bc9da8578ca00b9f4fc7567f)
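(For reference, those age thresholds are set the same way as the other tiering parameters; a sketch with hypothetical values in seconds:)
ceph osd pool set tier1-cache cache_min_flush_age 600
ceph osd pool set tier1-cache cache_min_evict_age 1800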
Updated by Sage Weil over 9 years ago
- Status changed from Need More Info to Can't reproduce