Bug #9765
CachePool flush -> OSD Failed
Status: Closed
Description
Hi, all.
I encountered a problem while flushing data before deleting a cache pool.
My crushmap:
[root@ct01 ceph]# ceph osd tree
# id    weight  type name       up/down reweight
-8      3       root cache.bank-hlynov.ru
-5      1               host cache1
13      1                       osd.13  up      1
-6      1               host cache2
31      1                       osd.31  up      1
-7      1               host cache3
30      1                       osd.30  up      1
-1      109.2   root bank-hlynov.ru
-2      36.4            host ct01
2       3.64                    osd.2   up      1
4       3.64                    osd.4   up      1
5       3.64                    osd.5   up      1
6       3.64                    osd.6   up      1
7       3.64                    osd.7   up      1
8       3.64                    osd.8   up      1
9       3.64                    osd.9   up      1
10      3.64                    osd.10  up      1
11      3.64                    osd.11  up      1
0       3.64                    osd.0   up      1
-3      36.4            host ct2
12      3.64                    osd.12  up      1
14      3.64                    osd.14  up      1
15      3.64                    osd.15  up      1
16      3.64                    osd.16  up      1
17      3.64                    osd.17  up      1
18      3.64                    osd.18  up      1
19      3.64                    osd.19  up      1
20      3.64                    osd.20  up      1
21      3.64                    osd.21  up      1
32      3.64                    osd.32  up      1
-4      36.4            host ct3
1       3.64                    osd.1   up      1
3       3.64                    osd.3   up      1
22      3.64                    osd.22  up      1
23      3.64                    osd.23  up      1
24      3.64                    osd.24  up      1
25      3.64                    osd.25  up      1
26      3.64                    osd.26  up      1
27      3.64                    osd.27  up      1
28      3.64                    osd.28  up      1
29      3.64                    osd.29  up      1
My actions:
1. ceph osd tier add rbd cache
2. ceph osd tier cache-mode cache writeback
3. ceph osd tier set-overlay rbd cache
4. Set hit_set_count, hit_set_period, etc. (a sketch of these commands follows the output below).
5. Write data from a virtual machine:
rados -p cache ls | wc -l
28685
6. ceph osd tier cache-mode cache forward
7.
rados -p cache cache-flush-evict-all
rbd_data.53b752ae8944a.000000000000bebe
rbd_data.53b752ae8944a.000000000000bd11
rbd_data.53b752ae8944a.000000000000c286
rbd_data.53b752ae8944a.000000000000c3f7
rbd_data.45132ae8944a.0000000000001a2c
rbd_data.c1362ae8944a.0000000000017a4c
2014-10-14 11:39:18.183180 7fac98c34700 0 -- 192.168.50.1:0/1032885 >> 192.168.50.1:6807/32153 pipe(0x7fac90012f30 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7fac900131c0).fault
[root@ct01 ceph]# ceph osd stat
osdmap e25538: 33 osds: 32 up, 33 in
rados -p cache ls | wc -l
28679
8. An OSD is dead :(
[root@ct01 ceph]# ceph osd tree
# id    weight  type name       up/down reweight
-8      3       root cache.bank-hlynov.ru
-5      1               host cache1
*13     1                       osd.13  down    0*
-6      1               host cache2
31      1                       osd.31  up      1
-7      1               host cache3
30      1                       osd.30  up      1
-1      109.2   root bank-hlynov.ru
-2      36.4            host ct01
2       3.64                    osd.2   up      1
4       3.64                    osd.4   up      1
5       3.64                    osd.5   up      1
6       3.64                    osd.6   up      1
7       3.64                    osd.7   up      1
8       3.64                    osd.8   up      1
9       3.64                    osd.9   up      1
10      3.64                    osd.10  up      1
11      3.64                    osd.11  up      1
0       3.64                    osd.0   up      1
-3      36.4            host ct2
12      3.64                    osd.12  up      1
14      3.64                    osd.14  up      1
15      3.64                    osd.15  up      1
16      3.64                    osd.16  up      1
17      3.64                    osd.17  up      1
18      3.64                    osd.18  up      1
19      3.64                    osd.19  up      1
20      3.64                    osd.20  up      1
21      3.64                    osd.21  up      1
32      3.64                    osd.32  up      1
-4      36.4            host ct3
1       3.64                    osd.1   up      1
3       3.64                    osd.3   up      1
22      3.64                    osd.22  up      1
23      3.64                    osd.23  up      1
24      3.64                    osd.24  up      1
25      3.64                    osd.25  up      1
26      3.64                    osd.26  up      1
27      3.64                    osd.27  up      1
28      3.64                    osd.28  up      1
29      3.64                    osd.29  up      1
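(For reference, step 4 above probably corresponds to commands along these lines. This is only a sketch; the values are inferred from the pool dump further down in this ticket, not copied from the original session.)

# Sketch of the cache-tier tuning referred to in step 4.
# hit_set type/count/period and target_max_bytes are taken from the
# 'ceph osd dump | grep pool' output below (bloom, 7200s x1, 200000000000).
ceph osd pool set cache hit_set_type bloom
ceph osd pool set cache hit_set_count 1
ceph osd pool set cache hit_set_period 7200
ceph osd pool set cache target_max_bytes 200000000000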
OS: CentOS 6.5
Kernel: 2.6.32-431.el6.x86_64
Ceph --version: ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
Updated by Irek Fasikhov over 9 years ago
In addition:
[root@ct01 ceph]# ceph osd dump | grep pool
pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 stripe_width 0
pool 47 'rbd' replicated size 2 min_size 1 crush_ruleset 2 object_hash rjenkins pg_num 1000 pgp_num 1000 last_change 21328 lfor 21328 tiers 51 read_tier 51 write_tier 51 stripe_width 0
pool 51 'cache' replicated size 2 min_size 1 crush_ruleset 3 object_hash rjenkins pg_num 128 pgp_num 128 last_change 25537 flags hashpspool,incomplete_clones tier_of 47 cache_mode forward target_bytes 200000000000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 7200s x1 stripe_width 0
Updated by Irek Fasikhov over 9 years ago
I'm sorry!
[root@ct3 ~]# ceph --version
ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)
Updated by Loïc Dachary over 9 years ago
-24> 2014-10-14 11:39:17.294107 7fec40982700 10 osd.13 pg_epoch: 25537 pg[51.0( v 25537'4261760 (25420'4258758,25537'4261760] local-les=25536 n=219 ec=21305 les/c 25536/25536 25535/25535/25535) [13,31] r=0 lpr=25535 crt=25537'4261756 lcod 25537'4261758 mlcod 25537'4261757 active+clean] repop_all_applied: repop tid 757 all applied
-23> 2014-10-14 11:39:17.294113 7fec341b2700 10 osd.13 pg_epoch: 25537 pg[51.65( v 25536'3008473 (25420'3005472,25536'3008473] local-les=25536 n=226 ec=21305 les/c 25536/25536 25535/25535/25535) [13,30] r=0 lpr=25535 crt=25536'3008470 lcod 25536'3008472 mlcod 25536'3008472 active+clean] agent_choose_mode flush_mode: idle evict_mode: idle num_objects: 226 num_bytes: 939794778 num_objects_dirty: 224 num_objects_omap: 0 num_dirty: 224 num_user_objects: 225 pool.info.target_max_bytes: 200000000000 pool.info.target_max_objects: 0
-22> 2014-10-14 11:39:17.294134 7fec341b2700 20 osd.13 pg_epoch: 25537 pg[51.65( v 25536'3008473 (25420'3005472,25536'3008473] local-les=25536 n=226 ec=21305 les/c 25536/25536 25535/25535/25535) [13,30] r=0 lpr=25535 crt=25536'3008470 lcod 25536'3008472 mlcod 25536'3008472 active+clean] agent_choose_mode dirty 0.596145 full 0.598807
-21> 2014-10-14 11:39:17.294123 7fec40982700 10 osd.13 pg_epoch: 25537 pg[51.0( v 25537'4261760 (25420'4258758,25537'4261760] local-les=25536 n=219 ec=21305 les/c 25536/25536 25535/25535/25535) [13,31] r=0 lpr=25535 crt=25537'4261756 lcod 25537'4261758 mlcod 25537'4261757 active+clean] eval_repop repgather(0x13749680 25537'4261760 rep_tid=757 committed?=1 applied?=1 op=osd_op(client.353320.0:11 rbd_data.45132ae8944a.0000000000001a2c [cache-evict] 51.cdeef500 ack+read+ignore_cache+skiprwlocks+ignore_overlay e25537) v4) wants=a
-20> 2014-10-14 11:39:17.294166 7fec341b2700 10 osd.13 pg_epoch: 25537 pg[51.65( v 25536'3008473 (25420'3005472,25536'3008473] local-les=25536 n=226 ec=21305 les/c 25536/25536 25535/25535/25535) [13,30] r=0 lpr=25535 crt=25536'3008470 lcod 25536'3008472 mlcod 25536'3008472 active+clean] sending redirect to pool 47 for op 0x3bd2940
-19> 2014-10-14 11:39:17.294180 7fec40982700 15 osd.13 pg_epoch: 25537 pg[51.0( v 25537'4261760 (25420'4258758,25537'4261760] local-les=25536 n=219 ec=21305 les/c 25536/25536 25535/25535/25535) [13,31] r=0 lpr=25535 crt=25537'4261756 lcod 25537'4261758 mlcod 25537'4261757 active+clean] log_op_stats osd_op(client.353320.0:11 rbd_data.45132ae8944a.0000000000001a2c [cache-evict] 51.cdeef500 ack+read+ignore_cache+skiprwlocks+ignore_overlay e25537) v4 inb 826 outb 0 rlat 0.000000 lat 0.005433
-18> 2014-10-14 11:39:17.294184 7fec341b2700 1 -- 192.168.50.1:6807/32153 --> 192.168.50.2:0/1013740 -- osd_op_reply(62454 rbd_data.53b752ae8944a.000000000000dbda [set-alloc-hint object_size 4194304 write_size 4194304,write 0~4194304] v0'0 uv0 ondisk = -2 ((2) No such file or directory) redirect: { object , locator{@47} }) v6 -- ?+0 0x11c82580 con 0x10cdb5a0
-17> 2014-10-14 11:39:17.294208 7fec40982700 15 osd.13 pg_epoch: 25537 pg[51.0( v 25537'4261760 (25420'4258758,25537'4261760] local-les=25536 n=219 ec=21305 les/c 25536/25536 25535/25535/25535) [13,31] r=0 lpr=25535 crt=25537'4261756 lcod 25537'4261758 mlcod 25537'4261757 active+clean] publish_stats_to_osd 25537:8292246
-16> 2014-10-14 11:39:17.294239 7fec341b2700 10 osd.13 25537 dequeue_op 0x3bd2940 finish
-15> 2014-10-14 11:39:17.294225 7fec40982700 10 osd.13 pg_epoch: 25537 pg[51.0( v 25537'4261760 (25420'4258758,25537'4261760] local-les=25536 n=219 ec=21305 les/c 25536/25536 25535/25535/25535) [13,31] r=0 lpr=25535 crt=25537'4261756 lcod 25537'4261758 mlcod 25537'4261757 active+clean] sending ack on repgather(0x13749680 25537'4261760 rep_tid=757 committed?=1 applied?=1 op=osd_op(client.353320.0:11 rbd_data.45132ae8944a.0000000000001a2c [cache-evict] 51.cdeef500 ack+read+ignore_cache+skiprwlocks+ignore_overlay e25537) v4) 0x11c85280
-14> 2014-10-14 11:39:17.294248 7fec40982700 1 -- 192.168.50.1:6807/32153 --> 192.168.50.1:0/1032885 -- osd_op_reply(11 rbd_data.45132ae8944a.0000000000001a2c [cache-evict] v25537'4261760 uv4664758 ack = 0) v6 -- ?+0 0x11c85280 con 0x110f1600
-13> 2014-10-14 11:39:17.295021 7fec3c3bf700 1 -- 192.168.50.1:6807/32153 <== client.353320 192.168.50.1:0/1032885 12 ==== osd_op(client.353320.0:12 rbd_data.c1362ae8944a.0000000000017a4c [cache-flush] 51.a2a52800 ack+read+ignore_cache+ignore_overlay e25537) v4 ==== 189+0+0 (3027131648 0 0) 0x112baf40 con 0x110f1600
-12> 2014-10-14 11:39:17.295071 7fec3c3bf700 10 osd.13 25537 do_waiters -- start
-11> 2014-10-14 11:39:17.295077 7fec3c3bf700 10 osd.13 25537 do_waiters -- finish
-10> 2014-10-14 11:39:17.295080 7fec3c3bf700 20 osd.13 25537 _dispatch 0x112baf40 osd_op(client.353320.0:12 rbd_data.c1362ae8944a.0000000000017a4c [cache-flush] 51.a2a52800 ack+read+ignore_cache+ignore_overlay e25537) v4
-9> 2014-10-14 11:39:17.295101 7fec3c3bf700 15 osd.13 25537 require_same_or_newer_map 25537 (i am 25537) 0x112baf40
-8> 2014-10-14 11:39:17.295107 7fec3c3bf700 20 osd.13 25537 _share_map_incoming client.353320 192.168.50.1:0/1032885 25537
-7> 2014-10-14 11:39:17.295126 7fec3c3bf700 15 osd.13 25537 enqueue_op 0x3bd12c0 prio 63 cost 0 latency 0.000255 osd_op(client.353320.0:12 rbd_data.c1362ae8944a.0000000000017a4c [cache-flush] 51.a2a52800 ack+read+ignore_cache+ignore_overlay e25537) v4
-6> 2014-10-14 11:39:17.295152 7fec3c3bf700 10 osd.13 25537 do_waiters -- start
-5> 2014-10-14 11:39:17.295156 7fec3c3bf700 10 osd.13 25537 do_waiters -- finish
-4> 2014-10-14 11:39:17.310529 7fec48a8b700 5 osd.13 25537 tick
-3> 2014-10-14 11:39:17.310579 7fec48a8b700 20 osd.13 25537 scrub_random_backoff lost coin flip, randomly backing off
-2> 2014-10-14 11:39:17.310585 7fec48a8b700 10 osd.13 25537 do_waiters -- start
-1> 2014-10-14 11:39:17.310588 7fec48a8b700 10 osd.13 25537 do_waiters -- finish
0> 2014-10-14 11:39:17.323474 7fec40982700 -1 osd/ReplicatedPG.h: In function 'void ReplicatedPG::release_op_ctx_locks(ReplicatedPG::OpContext*)' thread 7fec40982700 time 2014-10-14 11:39:17.294292
osd/ReplicatedPG.h: 762: FAILED assert(ctx->release_snapset_obc == false)
ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)
1: (ReplicatedPG::release_op_ctx_locks(ReplicatedPG::OpContext*)+0x4eb) [0x8ca0bb]
2: (ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)+0x2f4) [0x84e664]
3: (ReplicatedPG::repop_all_applied(ReplicatedPG::RepGather*)+0xab) [0x84f91b]
4: (Context::complete(int)+0x9) [0x666639]
5: (ReplicatedBackend::op_applied(ReplicatedBackend::InProgressOp*)+0x28c) [0x983d1c]
6: (Context::complete(int)+0x9) [0x666639]
7: (ReplicatedPG::BlessedContext::finish(int)+0x4e) [0x8c810e]
8: (Context::complete(int)+0x9) [0x666639]
9: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x95) [0x6707b5]
10: (Context::complete(int)+0x9) [0x666639]
11: (Finisher::finisher_thread_entry()+0x1d8) [0x9ec248]
12: (()+0x79d1) [0x7fec4f2a69d1]
13: (clone()+0x6d) [0x7fec4e23686d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Irek Fasikhov over 9 years ago
On
[root@ct01 ~]# ceph --version
ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
the same error occurs.
Updated by Irek Fasikhov over 9 years ago
I do not understand why data is still being written to the cache pool while its cache mode is set to forward:
[root@ct3 ~]# ceph osd tier cache-mode cache forward
set cache-mode for pool 'cache' to forward
[root@ct3 ~]# sleep 60
[root@ct3 ~]# ceph osd pool stats cache
pool cache id 51
client io 79460 B/s wr, 11 op/s
[root@ct3 ~]# ceph osd pool stats cache
pool cache id 51
client io 106 kB/s wr, 14 op/s
[root@ct3 ~]# ceph osd pool stats cache
pool cache id 51
client io 101585 B/s wr, 17 op/s
Updated by Sage Weil over 9 years ago
- Priority changed from High to Urgent
- Source changed from other to Community (user)
The 'forward' mode means we will modify cached objects in place but forward any 'misses'. It is also possible that the flush/evict operations are being miscounted as writes (there is a metadata update and a deletion).
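(A sketch of how one might check this, using the pool names from this ticket; these commands are not part of the original reply.)

# Compare I/O on the cache tier against the base pool while the guest is idle.
# If the cache-pool "writes" are really flush/evict bookkeeping, matching
# write traffic should show up on the base pool 'rbd'.
ceph osd pool stats cache
ceph osd pool stats rbd
rados df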
Updated by Sage Weil over 9 years ago
- Status changed from New to Duplicate
I'm pretty sure this is because #8629 has not yet been backported to firefly. It should be in 0.80.8. I'll prepare a backport branch to make sure it happens soon.
In the meantime, you can force eviction of the cache pool indirectly by turning target_max_objects way down (say, to 1) and letting the agent do the eviction.
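(A minimal sketch of that workaround, assuming the pool name 'cache' from this ticket:)

# Shrink the object target so the cache agent flushes/evicts everything on its own.
ceph osd pool set cache target_max_objects 1
# Watch the cache pool drain.
rados -p cache ls | wc -l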