Bug #9765

CachePool flush -> OSD Failed

Added by Irek Fasikhov over 5 years ago. Updated over 5 years ago.

Status:
Duplicate
Priority:
Urgent
Assignee:
-
Category:
OSD
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

Hi, all.

I encountered a problem while flushing data before deleting a cache pool.
My CRUSH map:

[root@ct01 ceph]# ceph osd tree
# id    weight  type name       up/down reweight
-8      3       root cache.bank-hlynov.ru
-5      1               host cache1
13      1                       osd.13  up      1
-6      1               host cache2
31      1                       osd.31  up      1
-7      1               host cache3
30      1                       osd.30  up      1
-1      109.2   root bank-hlynov.ru
-2      36.4            host ct01
2       3.64                    osd.2   up      1
4       3.64                    osd.4   up      1
5       3.64                    osd.5   up      1
6       3.64                    osd.6   up      1
7       3.64                    osd.7   up      1
8       3.64                    osd.8   up      1
9       3.64                    osd.9   up      1
10      3.64                    osd.10  up      1
11      3.64                    osd.11  up      1
0       3.64                    osd.0   up      1
-3      36.4            host ct2
12      3.64                    osd.12  up      1
14      3.64                    osd.14  up      1
15      3.64                    osd.15  up      1
16      3.64                    osd.16  up      1
17      3.64                    osd.17  up      1
18      3.64                    osd.18  up      1
19      3.64                    osd.19  up      1
20      3.64                    osd.20  up      1
21      3.64                    osd.21  up      1
32      3.64                    osd.32  up      1
-4      36.4            host ct3
1       3.64                    osd.1   up      1
3       3.64                    osd.3   up      1
22      3.64                    osd.22  up      1
23      3.64                    osd.23  up      1
24      3.64                    osd.24  up      1
25      3.64                    osd.25  up      1
26      3.64                    osd.26  up      1
27      3.64                    osd.27  up      1
28      3.64                    osd.28  up      1
29      3.64                    osd.29  up      1

My actions:
1. ceph osd tier add rbd cache
2. ceph osd tier cache-mode cache writeback
3. ceph osd tier set-overlay rbd cache
4. set hit_set_count, hit_set_period, etc.
5. write data from a virtual machine:
rados -p cache ls | wc -l
28685
6. ceph osd tier cache-mode cache forward

7.

rados -p cache cache-flush-evict-all
rbd_data.53b752ae8944a.000000000000bebe
rbd_data.53b752ae8944a.000000000000bd11
rbd_data.53b752ae8944a.000000000000c286
rbd_data.53b752ae8944a.000000000000c3f7
rbd_data.45132ae8944a.0000000000001a2c
rbd_data.c1362ae8944a.0000000000017a4c
2014-10-14 11:39:18.183180 7fac98c34700  0 -- 192.168.50.1:0/1032885 >> 192.168.50.1:6807/32153 pipe(0x7fac90012f30 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7fac900131c0).fault
[root@ct01 ceph]# ceph osd stat
     osdmap e25538: 33 osds: 32 up, 33 in
rados -p cache ls | wc -l
   28679

8. The OSD died :(

[root@ct01 ceph]# ceph osd tree
# id    weight  type name       up/down reweight
-8      3       root cache.bank-hlynov.ru
-5      1               host cache1
13      1                       osd.13  down    0
-6      1               host cache2
31      1                       osd.31  up      1
-7      1               host cache3
30      1                       osd.30  up      1
-1      109.2   root bank-hlynov.ru
-2      36.4            host ct01
2       3.64                    osd.2   up      1
4       3.64                    osd.4   up      1
5       3.64                    osd.5   up      1
6       3.64                    osd.6   up      1
7       3.64                    osd.7   up      1
8       3.64                    osd.8   up      1
9       3.64                    osd.9   up      1
10      3.64                    osd.10  up      1
11      3.64                    osd.11  up      1
0       3.64                    osd.0   up      1
-3      36.4            host ct2
12      3.64                    osd.12  up      1
14      3.64                    osd.14  up      1
15      3.64                    osd.15  up      1
16      3.64                    osd.16  up      1
17      3.64                    osd.17  up      1
18      3.64                    osd.18  up      1
19      3.64                    osd.19  up      1
20      3.64                    osd.20  up      1
21      3.64                    osd.21  up      1
32      3.64                    osd.32  up      1
-4      36.4            host ct3
1       3.64                    osd.1   up      1
3       3.64                    osd.3   up      1
22      3.64                    osd.22  up      1
23      3.64                    osd.23  up      1
24      3.64                    osd.24  up      1
25      3.64                    osd.25  up      1
26      3.64                    osd.26  up      1
27      3.64                    osd.27  up      1
28      3.64                    osd.28  up      1
29      3.64                    osd.29  up      1

OS: CentOS 6.5
Kernel: 2.6.32-431.el6.x86_64
Ceph --version: ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
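For context, the teardown the reporter was attempting follows the usual firefly-era cache-tier removal sequence. A sketch is below; pool names ('rbd' as base, 'cache' as tier) match the report, and the sequence assumes the flush completes without hitting this crash:

```shell
# Stop admitting new objects: redirect cache misses to the base pool.
ceph osd tier cache-mode cache forward

# Flush dirty objects and evict everything still held by the cache pool.
rados -p cache cache-flush-evict-all

# Verify the cache pool is empty before detaching it.
rados -p cache ls | wc -l

# Detach the overlay from base pool 'rbd', then remove the tier.
ceph osd tier remove-overlay rbd
ceph osd tier remove rbd cache
```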

ceph.conf (2.2 KB) Irek Fasikhov, 10/14/2014 01:19 AM

ceph-osd.13.log.tar.gz (14.2 MB) Irek Fasikhov, 10/14/2014 01:19 AM

History

#1 Updated by Irek Fasikhov over 5 years ago

In addition:

[root@ct01 ceph]# ceph osd dump | grep pool
pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 stripe_width 0
pool 47 'rbd' replicated size 2 min_size 1 crush_ruleset 2 object_hash rjenkins pg_num 1000 pgp_num 1000 last_change 21328 lfor 21328 tiers 51 read_tier 51 write_tier 51 stripe_width 0
pool 51 'cache' replicated size 2 min_size 1 crush_ruleset 3 object_hash rjenkins pg_num 128 pgp_num 128 last_change 25537 flags hashpspool,incomplete_clones tier_of 47 cache_mode forward target_bytes 200000000000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 7200s x1 stripe_width 0

#2 Updated by Irek Fasikhov over 5 years ago

I'm sorry! The version reported above was wrong; the actual version is:

[root@ct3 ~]# ceph --version
ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)

#3 Updated by Loic Dachary over 5 years ago

   -24> 2014-10-14 11:39:17.294107 7fec40982700 10 osd.13 pg_epoch: 25537 pg[51.0( v 25537'4261760 (25420'4258758,25537'4261760] local-les=25536 n=219 ec=21305 les/c 25536/25536 25535/25535/25535) [13,31] r=0 lpr=25535 crt=25537'4261756 lcod 25537'4261758 mlcod 25537'4261757 active+clean] repop_all_applied: repop tid 757 all applied 
   -23> 2014-10-14 11:39:17.294113 7fec341b2700 10 osd.13 pg_epoch: 25537 pg[51.65( v 25536'3008473 (25420'3005472,25536'3008473] local-les=25536 n=226 ec=21305 les/c 25536/25536 25535/25535/25535) [13,30] r=0 lpr=25535 crt=25536'3008470 lcod 25536'3008472 mlcod 25536'3008472 active+clean] agent_choose_mode flush_mode: idle evict_mode: idle num_objects: 226 num_bytes: 939794778 num_objects_dirty: 224 num_objects_omap: 0 num_dirty: 224 num_user_objects: 225 pool.info.target_max_bytes: 200000000000 pool.info.target_max_objects: 0
   -22> 2014-10-14 11:39:17.294134 7fec341b2700 20 osd.13 pg_epoch: 25537 pg[51.65( v 25536'3008473 (25420'3005472,25536'3008473] local-les=25536 n=226 ec=21305 les/c 25536/25536 25535/25535/25535) [13,30] r=0 lpr=25535 crt=25536'3008470 lcod 25536'3008472 mlcod 25536'3008472 active+clean] agent_choose_mode dirty 0.596145 full 0.598807
   -21> 2014-10-14 11:39:17.294123 7fec40982700 10 osd.13 pg_epoch: 25537 pg[51.0( v 25537'4261760 (25420'4258758,25537'4261760] local-les=25536 n=219 ec=21305 les/c 25536/25536 25535/25535/25535) [13,31] r=0 lpr=25535 crt=25537'4261756 lcod 25537'4261758 mlcod 25537'4261757 active+clean] eval_repop repgather(0x13749680 25537'4261760 rep_tid=757 committed?=1 applied?=1 op=osd_op(client.353320.0:11 rbd_data.45132ae8944a.0000000000001a2c [cache-evict] 51.cdeef500 ack+read+ignore_cache+skiprwlocks+ignore_overlay e25537) v4) wants=a
   -20> 2014-10-14 11:39:17.294166 7fec341b2700 10 osd.13 pg_epoch: 25537 pg[51.65( v 25536'3008473 (25420'3005472,25536'3008473] local-les=25536 n=226 ec=21305 les/c 25536/25536 25535/25535/25535) [13,30] r=0 lpr=25535 crt=25536'3008470 lcod 25536'3008472 mlcod 25536'3008472 active+clean] sending redirect to pool 47 for op 0x3bd2940
   -19> 2014-10-14 11:39:17.294180 7fec40982700 15 osd.13 pg_epoch: 25537 pg[51.0( v 25537'4261760 (25420'4258758,25537'4261760] local-les=25536 n=219 ec=21305 les/c 25536/25536 25535/25535/25535) [13,31] r=0 lpr=25535 crt=25537'4261756 lcod 25537'4261758 mlcod 25537'4261757 active+clean] log_op_stats osd_op(client.353320.0:11 rbd_data.45132ae8944a.0000000000001a2c [cache-evict] 51.cdeef500 ack+read+ignore_cache+skiprwlocks+ignore_overlay e25537) v4 inb 826 outb 0 rlat 0.000000 lat 0.005433
   -18> 2014-10-14 11:39:17.294184 7fec341b2700  1 -- 192.168.50.1:6807/32153 --> 192.168.50.2:0/1013740 -- osd_op_reply(62454 rbd_data.53b752ae8944a.000000000000dbda [set-alloc-hint object_size 4194304 write_size 4194304,write 0~4194304] v0'0 uv0 ondisk = -2 ((2) No such file or directory) redirect: { object , locator{@47} }) v6 -- ?+0 0x11c82580 con 0x10cdb5a0
   -17> 2014-10-14 11:39:17.294208 7fec40982700 15 osd.13 pg_epoch: 25537 pg[51.0( v 25537'4261760 (25420'4258758,25537'4261760] local-les=25536 n=219 ec=21305 les/c 25536/25536 25535/25535/25535) [13,31] r=0 lpr=25535 crt=25537'4261756 lcod 25537'4261758 mlcod 25537'4261757 active+clean] publish_stats_to_osd 25537:8292246
   -16> 2014-10-14 11:39:17.294239 7fec341b2700 10 osd.13 25537 dequeue_op 0x3bd2940 finish
   -15> 2014-10-14 11:39:17.294225 7fec40982700 10 osd.13 pg_epoch: 25537 pg[51.0( v 25537'4261760 (25420'4258758,25537'4261760] local-les=25536 n=219 ec=21305 les/c 25536/25536 25535/25535/25535) [13,31] r=0 lpr=25535 crt=25537'4261756 lcod 25537'4261758 mlcod 25537'4261757 active+clean]  sending ack on repgather(0x13749680 25537'4261760 rep_tid=757 committed?=1 applied?=1 op=osd_op(client.353320.0:11 rbd_data.45132ae8944a.0000000000001a2c [cache-evict] 51.cdeef500 ack+read+ignore_cache+skiprwlocks+ignore_overlay e25537) v4) 0x11c85280
   -14> 2014-10-14 11:39:17.294248 7fec40982700  1 -- 192.168.50.1:6807/32153 --> 192.168.50.1:0/1032885 -- osd_op_reply(11 rbd_data.45132ae8944a.0000000000001a2c [cache-evict] v25537'4261760 uv4664758 ack = 0) v6 -- ?+0 0x11c85280 con 0x110f1600
   -13> 2014-10-14 11:39:17.295021 7fec3c3bf700  1 -- 192.168.50.1:6807/32153 <== client.353320 192.168.50.1:0/1032885 12 ==== osd_op(client.353320.0:12 rbd_data.c1362ae8944a.0000000000017a4c [cache-flush] 51.a2a52800 ack+read+ignore_cache+ignore_overlay e25537) v4 ==== 189+0+0 (3027131648 0 0) 0x112baf40 con 0x110f1600
   -12> 2014-10-14 11:39:17.295071 7fec3c3bf700 10 osd.13 25537 do_waiters -- start
   -11> 2014-10-14 11:39:17.295077 7fec3c3bf700 10 osd.13 25537 do_waiters -- finish
   -10> 2014-10-14 11:39:17.295080 7fec3c3bf700 20 osd.13 25537 _dispatch 0x112baf40 osd_op(client.353320.0:12 rbd_data.c1362ae8944a.0000000000017a4c [cache-flush] 51.a2a52800 ack+read+ignore_cache+ignore_overlay e25537) v4
    -9> 2014-10-14 11:39:17.295101 7fec3c3bf700 15 osd.13 25537 require_same_or_newer_map 25537 (i am 25537) 0x112baf40
    -8> 2014-10-14 11:39:17.295107 7fec3c3bf700 20 osd.13 25537 _share_map_incoming client.353320 192.168.50.1:0/1032885 25537
    -7> 2014-10-14 11:39:17.295126 7fec3c3bf700 15 osd.13 25537 enqueue_op 0x3bd12c0 prio 63 cost 0 latency 0.000255 osd_op(client.353320.0:12 rbd_data.c1362ae8944a.0000000000017a4c [cache-flush] 51.a2a52800 ack+read+ignore_cache+ignore_overlay e25537) v4
    -6> 2014-10-14 11:39:17.295152 7fec3c3bf700 10 osd.13 25537 do_waiters -- start
    -5> 2014-10-14 11:39:17.295156 7fec3c3bf700 10 osd.13 25537 do_waiters -- finish
    -4> 2014-10-14 11:39:17.310529 7fec48a8b700  5 osd.13 25537 tick
    -3> 2014-10-14 11:39:17.310579 7fec48a8b700 20 osd.13 25537 scrub_random_backoff lost coin flip, randomly backing off
    -2> 2014-10-14 11:39:17.310585 7fec48a8b700 10 osd.13 25537 do_waiters -- start
    -1> 2014-10-14 11:39:17.310588 7fec48a8b700 10 osd.13 25537 do_waiters -- finish
     0> 2014-10-14 11:39:17.323474 7fec40982700 -1 osd/ReplicatedPG.h: In function 'void ReplicatedPG::release_op_ctx_locks(ReplicatedPG::OpContext*)' thread 7fec40982700 time 2014-10-14 11:39:17.294292
osd/ReplicatedPG.h: 762: FAILED assert(ctx->release_snapset_obc == false)

 ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)
 1: (ReplicatedPG::release_op_ctx_locks(ReplicatedPG::OpContext*)+0x4eb) [0x8ca0bb]
 2: (ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)+0x2f4) [0x84e664]
 3: (ReplicatedPG::repop_all_applied(ReplicatedPG::RepGather*)+0xab) [0x84f91b]
 4: (Context::complete(int)+0x9) [0x666639]
 5: (ReplicatedBackend::op_applied(ReplicatedBackend::InProgressOp*)+0x28c) [0x983d1c]
 6: (Context::complete(int)+0x9) [0x666639]
 7: (ReplicatedPG::BlessedContext::finish(int)+0x4e) [0x8c810e]
 8: (Context::complete(int)+0x9) [0x666639]
 9: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x95) [0x6707b5]
 10: (Context::complete(int)+0x9) [0x666639]
 11: (Finisher::finisher_thread_entry()+0x1d8) [0x9ec248]
 12: (()+0x79d1) [0x7fec4f2a69d1]
 13: (clone()+0x6d) [0x7fec4e23686d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

#4 Updated by Irek Fasikhov over 5 years ago

On

[root@ct01 ~]# ceph --version
ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)

the same error occurs.

#5 Updated by Irek Fasikhov over 5 years ago

I do not understand why data is still being written to the cache pool while it is in forward mode:

[root@ct3 ~]# ceph osd tier cache-mode cache forward
set cache-mode for pool 'cache' to forward
[root@ct3 ~]# sleep 60
[root@ct3 ~]# ceph osd pool stats cache
pool cache id 51
client io 79460 B/s wr, 11 op/s
[root@ct3 ~]# ceph osd pool stats cache
pool cache id 51
client io 106 kB/s wr, 14 op/s
[root@ct3 ~]# ceph osd pool stats cache
pool cache id 51
client io 101585 B/s wr, 17 op/s

#6 Updated by Sage Weil over 5 years ago

  • Priority changed from High to Urgent
  • Source changed from other to Community (user)

The 'forward' mode means we will modify cached objects in place but forward any 'misses'. It is also possible that the flush/evict operations are being miscounted as writes (there is a metadata update and a deletion).

#7 Updated by Sage Weil over 5 years ago

  • Status changed from New to Duplicate

I'm pretty sure this is because #8629 has not yet been backported to firefly. It should be in 0.80.8. I'll prepare a backport branch to make sure it happens soon.

In the meantime, you can force eviction of the cache pool indirectly by turning target_max_objects way down (say, to 1) and letting the agent do the eviction.
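The suggested workaround can be sketched as follows (the threshold of 1 is illustrative; the tiering agent then drains the pool in the background rather than through the crashing cache-flush-evict-all path):

```shell
# Shrink the object cap so the tiering agent flushes dirty objects
# and evicts clean ones on its own.
ceph osd pool set cache target_max_objects 1

# Watch the object count drop as the agent works.
rados -p cache ls | wc -l
```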
