Bug #12449
ceph-osd core dumped when writing data to the backing storage pool which has a quota set on its cache pool
Status: Duplicate
Priority: Normal
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Description
We have set up our cache tiering under the guidance of this link, with the only exception that we have also set a quota on the cache pool.
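For reference, here is a minimal sketch of how a setup like the one in the listing below could be created. The reporter's exact commands are not in the ticket; the pool names (base, cache) and the parameter values are taken from the pool listing that follows:

# Create the backing pool and the cache pool (64 PGs each, as in the listing below).
ceph osd pool create base 64 64
ceph osd pool create cache 64 64

# Attach 'cache' as a writeback tier in front of 'base'.
ceph osd tier add base cache
ceph osd tier cache-mode cache writeback
ceph osd tier set-overlay base cache

# Cache-tier sizing and hit-set parameters matching the listing
# (target_bytes 100000000, hit_set bloom 3600s x1).
ceph osd pool set cache target_max_bytes 100000000
ceph osd pool set cache hit_set_type bloom
ceph osd pool set cache hit_set_period 3600
ceph osd pool set cache hit_set_count 1

# The deviation from the documented setup: a quota on the cache pool itself.
ceph osd pool set-quota cache max_bytes 100000000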
[root@hust17 /home/runsisi]# ceph osd pool ls detail
pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 52 flags hashpspool max_bytes 20000000 stripe_width 0
pool 1 'cache' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 125 flags hashpspool,incomplete_clones max_bytes 100000000 tier_of 2 cache_mode writeback target_bytes 100000000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x1 stripe_width 0
pool 2 'base' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 63 lfor 63 flags hashpspool tiers 1 read_tier 1 write_tier 1 stripe_width 0
As you can see, the cache pool, named "cache", has a 100 MB quota (max_bytes 100000000) set on it. When we use the rados utility to put a 200 MB object into the backing pool, one of the three OSDs crashes with a core dump. The coredump does not happen if we clear the quota on the cache pool.
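For completeness, clearing a pool quota is done by setting max_bytes back to 0; the reporter's exact command is not shown in the ticket, but this is the standard way:

# Setting max_bytes to 0 removes the quota from the cache pool.
ceph osd pool set-quota cache max_bytes 0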
This can be reproduced easily as follows; the attached file is the log of the OSD that core dumped.
[root@hust17 /home/runsisi]# ceph -s
    cluster cbc99ef9-fbc3-41ad-a726-47359f8d84b3
     health HEALTH_OK
     monmap e3: 3 mons at {ceph0=192.168.133.10:6789/0,ceph1=192.168.133.11:6789/0,ceph2=192.168.133.12:6789/0}
            election epoch 6, quorum 0,1,2 ceph0,ceph1,ceph2
     osdmap e132: 3 osds: 3 up, 3 in
      pgmap v413: 192 pgs, 3 pools, 4096 kB data, 1 objects
            323 MB used, 131 GB / 131 GB avail
                 192 active+clean
[root@hust17 /home/runsisi]#
[root@hust17 /home/runsisi]# ceph osd pool ls detail
pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 52 flags hashpspool max_bytes 20000000 stripe_width 0
pool 1 'cache' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 125 flags hashpspool,incomplete_clones max_bytes 100000000 tier_of 2 cache_mode writeback target_bytes 100000000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x1 stripe_width 0
pool 2 'base' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 63 lfor 63 flags hashpspool tiers 1 read_tier 1 write_tier 1 stripe_width 0
[root@hust17 /home/runsisi]# rados -p base ls
[root@hust17 /home/runsisi]# rados -p cache ls
[root@hust17 /home/runsisi]# ll -h 200m.dat
-rw-r--r-- 1 root root 200M Jul 23 10:53 200m.dat
[root@hust17 /home/runsisi]#
[root@hust17 /home/runsisi]# rados -p base put x1 200m.dat
[root@hust17 /home/runsisi]# rados -p base put x1 200m.dat
error putting base/x1: (28) No space left on device
[root@hust17 /home/runsisi]# rados -p base put x1 200m.dat
2015-07-23 20:59:03.262865 7f937a0d7700 0 -- 192.168.133.1:0/1019253 >> 192.168.133.11:6800/20555 pipe(0x3009c90 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x3003170).fault
[root@hust17 /home/runsisi]# ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.11998 root default
-2 0.03999     host ceph2
 0 0.03999         osd.0       up  1.00000          1.00000
-3 0.03999     host ceph1
 1 0.03999         osd.1     down  1.00000          1.00000
-4 0.03999     host ceph0
 2 0.03999         osd.2       up  1.00000          1.00000
[root@hust17 /home/runsisi]# ceph --version
ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
Updated by runsisi hust over 8 years ago
Sorry for the inconvenience; how can I edit the issue description?
Updated by Kefu Chai over 8 years ago
> Sorry for the inconvenience; how can I edit the issue description?
runsisi,
- click "Update" at the right side of the top banner.
- click "Description" (the small pencil icon) in the "Change properties" form.
Updated by Alexey Sheplyakov over 8 years ago
Updated by Loïc Dachary over 8 years ago
- Is duplicate of Bug #13098: OSD crashed when reached pool's max_bytes quota added