Bug #36408
closed[cache tier] failed guarded write + promotion results in "success" op result
0%
Description
Simple reproducer:
ceph osd pool create cache 4 ceph osd tier add rbd cache ceph osd tier cache-mode cache writeback ceph osd tier set-overlay rbd cache ceph osd pool set cache hit_set_type bloom ceph osd pool set cache hit_set_count 8 ceph osd pool set cache hit_set_period 60 ceph osd pool set cache target_max_objects 250 rbd namespace create rbd ns rbd create rbd/ns/test -s 1G rbd namespace remove rbd ns rbd: namespace contains images which must be deleted first. rados -p cache -N ns cache-flush rbd_directory rados -p cache -N ns cache-evict rbd_directory rbd namespace remove rbd ns && echo FAILURE
If an object doesn't exist on the cache tier but a guarded write operation will cause it to be proxied and promoted, there is a chance that the proxy-write operation will be canceled and restarted when the promotion completes. However, when the original op is restarted against the now promoted object, the dup-op detection incorrectly compares the object against object version zero since the copy-from reqids do not include the result code (only requester and version):
In the logs below, a guarded write from the client kicks off a proxied write and promotion:
2018-10-11 13:18:19.493 7f1363796700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] do_op: op osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8 2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] do_op osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8 may_write may_read -> write-ordered flags ondisk+write+known_if_redirected 2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] get_object_context: found obc in cache: 0xfbe1080 2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] get_object_context: 0xfbe1080 2:774f7988:ns::rbd_directory:head rwstate(none n=0 w=0) oi: 2:774f7988:ns::rbd_directory:head(0'0 unknown.0.0:0 s 0 uv 0 alloc_hint [0 0 0]) exists: 0 ssc: 0x10070f20 snapset: 0=[]:{} 2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] find_object_context 2:774f7988:ns::rbd_directory:head @head oi=2:774f7988:ns::rbd_directory:head(0'0 unknown.0.0:0 s 0 uv 0 alloc_hint [0 0 0]) 2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] agent_choose_mode flush_mode: idle evict_mode: idle num_objects: 9 num_bytes: 678 num_objects_dirty: 0 num_objects_omap: 1 num_dirty: 0 num_user_objects: 1 num_user_bytes: 300 num_overhead_bytes: 300 pool.info.target_max_bytes: 0 pool.info.target_max_objects: 250 2018-10-11 13:18:19.494 7f1363796700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] agent_choose_mode dirty 0 full 0.016129 2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] do_proxy_write Start proxy write for osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8 2018-10-11 13:18:19.494 7f1363796700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] maybe_promote missing_oid 2:774f7988:ns::rbd_directory:head in_hit_set 1 2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] start_copy 2:774f7988:ns::rbd_directory:head from 2:774f7988:ns::rbd_directory:head @1;ns v0 flags 30 mirror_snapset 2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] _copy_some obc(2:774f7988:ns::rbd_directory:head(dne) rwstate(none n=0 w=0)) 0xfc4be10
The base tier receives the proxied guarded write:
2018-10-11 13:18:19.494 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op: op osd_op(client.14323.0:1 1.6 1:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+ignore_cache+ignore_overlay+known_if_redirected e33) v8 2018-10-11 13:18:19.494 7f1361792700 10 osd.0 33 class rbd method dir_state_set flags=rw 2018-10-11 13:18:19.494 7f136579a700 20 osd.0 op_wq(6) _process 1.6 to_process <> waiting <> waiting_peering {} 2018-10-11 13:18:19.494 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] op_has_sufficient_caps session=0xfd26500 pool=1 (rbd ns) pool_app_metadata={} need_read_cap=0 need_write_cap=1 classes=[class rbd method dir_state_set rd 1 wr 1 wl 1] -> yes 2018-10-11 13:18:19.494 7f136579a700 20 osd.0 op_wq(6) _process OpQueueItem(1.6 PGOpItem(op=osd_op(osd.0.32:113 1.6 1.119ef2ee (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8) prio 63 cost 0 e33) queued 2018-10-11 13:18:19.494 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op osd_op(client.14323.0:1 1.6 1:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+ignore_cache+ignore_overlay+known_if_redirected e33) v8 may_write may_read -> write-ordered flags ondisk+write+ignore_cache+ignore_overlay+known_if_redirected 2018-10-11 13:18:19.494 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] get_object_context: found obc in cache: 0x1000adc0 2018-10-11 13:18:19.494 7f1386000700 15 osd.0 33 enqueue_op 0xfc91520 prio 63 cost 29 latency 0.000111 epoch 33 osd_op(osd.0.32:114 1.6 1.119ef2ee (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 2018-10-11 13:18:19.494 7f1386000700 20 osd.0 op_wq(6) _enqueue OpQueueItem(1.6 PGOpItem(op=osd_op(osd.0.32:114 1.6 1.119ef2ee (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8) prio 63 cost 29 e33) 2018-10-11 13:18:19.494 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] get_object_context: 0x1000adc0 1:774f7988:ns::rbd_directory:head rwstate(none n=0 w=0) oi: 1:774f7988:ns::rbd_directory:head(33'17 osd.0.32:110 dirty|omap s 0 uv 48 alloc_hint [0 0 0]) exists: 1 ssc: 0xfcde160 snapset: 0=[]:{} 2018-10-11 13:18:19.494 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] find_object_context 1:774f7988:ns::rbd_directory:head @head oi=1:774f7988:ns::rbd_directory:head(33'17 osd.0.32:110 dirty|omap s 0 uv 48 alloc_hint [0 0 0]) 2018-10-11 13:18:19.494 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op obc obc(1:774f7988:ns::rbd_directory:head rwstate(excl n=1 w=0)) 2018-10-11 13:18:19.495 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx 0x100d6000 2018-10-11 13:18:19.495 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx 1:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] ov 33'17 av 33'18 snapc 0=[] snapset 0=[]:{} 2018-10-11 13:18:19.495 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op 1:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] 2018-10-11 13:18:19.495 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op call rbd.dir_state_set 2018-10-11 13:18:19.495 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] call method rbd.dir_state_set
While the guarded write is in-flight, the base tier starts to process the promotion ops:
2018-10-11 13:20:51.848 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op 1:774f7988:ns::rbd_directory:head [stat] 2018-10-11 13:20:51.848 7f13847fd700 10 osd.0 33 ping from client.14323 2018-10-11 13:20:51.848 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op stat 2018-10-11 13:20:51.848 7f13847fd700 10 osd.0 33 ping from client.14323 2018-10-11 13:20:51.848 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] stat oi has 0 2018-10-11 13:18:01.980537 2018-10-11 13:20:51.848 7f13847fd700 10 osd.0 33 ping from client.14323 2018-10-11 13:20:51.848 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op 1:774f7988:ns::rbd_directory:head [omap-get-vals] 2018-10-11 13:20:51.848 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op omap-get-vals
... then the guarded write fails:
2018-10-11 13:20:51.849 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] Found key name_test 2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] method called response length=0 2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] calc_trim_to limit = 33'17 2018-10-11 13:20:51.849 7f1386000700 10 osd.0 33 ping from osd.0 2018-10-11 13:20:51.849 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx update_log_only -- result=-16 2018-10-11 13:20:51.849 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] record_write_error r=-16 2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] submit_log_entries 33'18 (0'0) error 1:774f7988:ns::rbd_directory:head by client.14323.0:1 0.000000 -16 2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] new_repop: repgather(0xfbc8600 33'18 rep_tid=159 committed?=0 r=-16) 2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] merge_new_log_entries 33'18 (0'0) error 1:774f7988:ns::rbd_directory:head by client.14323.0:1 0.000000 -16 2018-10-11 13:20:51.849 7f1361792700 20 update missing, append 33'18 (0'0) error 1:774f7988:ns::rbd_directory:head by client.14323.0:1 0.000000 -16 2018-10-11 13:20:51.849 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 lua=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] append_log_entries_update_missing trim_to bool = 1 trim_to = 0'0 2018-10-11 13:20:51.849 7f1361792700 10 trim proposed trim_to = 0'0 2018-10-11 13:20:51.849 7f1361792700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 33'18, trimmed: , trimmed_dups: , clear_divergent_priors: 0 2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 lua=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] op_applied version 33'18 2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] calc_trim_to limit = 33'17
... and then the base tier proceeds to process the promotion ops:
2018-10-11 13:20:51.849 7f136579a700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op: op osd_op(osd.0.32:113 1.6 1:774f7988:ns::rbd_directory:snapdir [list-snaps] snapc 0=[] ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 2018-10-11 13:20:51.849 7f13847fd700 10 osd.0 33 OSD::ms_get_authorizer type=mon 2018-10-11 13:20:51.849 7f136579a700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] op_has_sufficient_caps session=0xfd26500 pool=1 (rbd ns) pool_app_metadata={} need_read_cap=1 need_write_cap=0 classes=[] -> yes 2018-10-11 13:20:51.849 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op osd_op(osd.0.32:113 1.6 1:774f7988:ns::rbd_directory:snapdir [list-snaps] snapc 0=[] ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 may_read -> write-ordered flags ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected 2018-10-11 13:20:51.849 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] get_object_context: found obc in cache: 0x1000adc0 2018-10-11 13:20:51.849 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] get_object_context: 0x1000adc0 1:774f7988:ns::rbd_directory:head rwstate(none n=0 w=0) oi: 1:774f7988:ns::rbd_directory:head(33'17 osd.0.32:110 dirty|omap s 0 uv 48 alloc_hint [0 0 0]) exists: 1 ssc: 0xfcde160 snapset: 0=[]:{} 2018-10-11 13:20:51.849 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] find_object_context 1:774f7988:ns::rbd_directory:head @head oi=1:774f7988:ns::rbd_directory:head(33'17 osd.0.32:110 dirty|omap s 0 uv 48 alloc_hint [0 0 0]) 2018-10-11 13:20:51.849 7f136579a700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op obc obc(1:774f7988:ns::rbd_directory:head rwstate(excl n=1 w=0)) 2018-10-11 13:20:51.849 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx 0xfc39800 2018-10-11 13:20:51.849 7f1375873700 10 osd.0 33 ms_handle_connect con 0x100f4000 2018-10-11 13:20:51.850 7f1375873700 10 osd.0 33 ms_handle_connect on mon 2018-10-11 13:20:51.849 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx 1:774f7988:ns::rbd_directory:head [list-snaps] ov 33'17 2018-10-11 13:20:51.850 7f1375873700 10 osd.0 33 send_alive up_thru currently 32 want 32 2018-10-11 13:20:51.850 7f1375873700 10 osd.0 33 requeue_pg_temp 0 + 0 -> 0 2018-10-11 13:20:51.850 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op 1:774f7988:ns::rbd_directory:head [list-snaps] 2018-10-11 13:20:51.850 7f1375873700 20 osd.0 33 _send_ready_to_merge ready_to_merge_source not_ready_to_merge_source ready_to_merge_target {} not_ready_to_merge_target {} sent_ready_to_merge_source 2018-10-11 13:20:51.850 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op list-snaps 2018-10-11 13:20:51.850 7f1375873700 10 osd.0 33 requeue_failures 0 + 0 -> 0 2018-10-11 13:20:51.850 7f1375873700 20 osd.0 33 send_beacon sending 2018-10-11 13:20:51.850 7f136579a700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] snapset 0=[]:{} 2018-10-11 13:20:51.850 7f136579a700 15 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op_effects osd.0 con 0xf8c6e00 2018-10-11 13:20:51.850 7f136579a700 15 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] log_op_stats osd_op(osd.0.32:113 1.6 1:774f7988:ns::rbd_directory:snapdir [list-snaps] snapc 0=[] ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 inb 0 outb 48 lat 152.356508 2018-10-11 13:20:51.850 7f136579a700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] publish_stats_to_osd reporting purged_snaps [] 2018-10-11 13:20:51.850 7f136579a700 15 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] publish_stats_to_osd 33:92 2018-10-11 13:20:51.850 7f136579a700 10 osd.0 33 dequeue_op 0xfc90820 finish 2018-10-11 13:20:51.850 7f136579a700 1 heartbeat_map reset_timeout 'OSD::osd_op_tp thread 0x7f136579a700' had timed out after 60 2018-10-11 13:20:51.850 7f1361792700 20 osd.0 op_wq(6) _process 1.6 to_process <OpQueueItem(1.6 PGOpItem(op=osd_op(osd.0.32:114 1.6 1.119ef2ee (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8) prio 63 cost 29 e33)> waiting <> waiting_peering {} 2018-10-11 13:20:51.850 7f1361792700 20 osd.0 op_wq(6) _process OpQueueItem(1.6 PGOpItem(op=osd_op(osd.0.32:114 1.6 1.119ef2ee (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8) prio 63 cost 29 e33) pg 0xfa28000 2018-10-11 13:20:51.850 7f1361792700 10 osd.0 33 dequeue_op 0xfc91520 prio 63 cost 29 latency 152.356517 osd_op(osd.0.32:114 1.6 1.119ef2ee (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 pg pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] 2018-10-11 13:20:51.850 7f1361792700 20 osd.0 33 share_map osd.0 192.168.1.192:52558/29482 33 2018-10-11 13:20:51.850 7f1361792700 20 osd.0 33 should_share_map osd.0 192.168.1.192:52558/29482 33 2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] _handle_message: 0xfc91520 2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op: op osd_op(osd.0.32:114 1.6 1:774f7988:ns::rbd_directory:head [copy-get max 524288] snapc 0=[] ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] op_has_sufficient_caps session=0xfd26500 pool=1 (rbd ns) pool_app_metadata={} need_read_cap=1 need_write_cap=0 classes=[] -> yes 2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op osd_op(osd.0.32:114 1.6 1:774f7988:ns::rbd_directory:head [copy-get max 524288] snapc 0=[] ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 may_read -> write-ordered flags ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected 2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] get_object_context: found obc in cache: 0x1000adc0 2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] get_object_context: 0x1000adc0 1:774f7988:ns::rbd_directory:head rwstate(none n=0 w=0) oi: 1:774f7988:ns::rbd_directory:head(33'17 osd.0.32:110 dirty|omap s 0 uv 48 alloc_hint [0 0 0]) exists: 1 ssc: 0xfcde160 snapset: 0=[]:{} 2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] find_object_context 1:774f7988:ns::rbd_directory:head @head oi=1:774f7988:ns::rbd_directory:head(33'17 osd.0.32:110 dirty|omap s 0 uv 48 alloc_hint [0 0 0]) 2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op obc obc(1:774f7988:ns::rbd_directory:head rwstate(excl n=1 w=0)) 2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx 0x100d6000 2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx 1:774f7988:ns::rbd_directory:head [copy-get max 524288] ov 33'17 2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op 1:774f7988:ns::rbd_directory:head [copy-get max 524288] 2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op copy-get max 524288 2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] got attrs 2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] got data 2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] got omap 2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] got reqids 2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] cursor.is_complete=1 0 attrs 0 bytes 0 omap header bytes 82 omap data bytes in 3 keys 10 reqids 2018-10-11 13:20:51.851 7f1361792700 15 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op_effects osd.0 con 0xf8c6e00 2018-10-11 13:20:51.851 7f1361792700 15 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] log_op_stats osd_op(osd.0.32:114 1.6 1:774f7988:ns::rbd_directory:head [copy-get max 524288] snapc 0=[] ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 inb 0 outb 535 lat 152.357216 2018-10-11 13:20:51.851 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] publish_stats_to_osd reporting purged_snaps [] 2018-10-11 13:20:51.851 7f1361792700 15 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] publish_stats_to_osd 33:93
... the cache tier starts to process the promotion. Note that the still in-flight client op (client.14323.0:1) is in the reqids response w/ a version of 0 and w/o the -16 error code:
2018-10-11 13:20:51.851 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] process_copy_chunk 2:774f7988:ns::rbd_directory:head tid 114 (0) Success 2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] process_copy_chunk success; committing 2018-10-11 13:20:51.851 7f136579a700 20 osd.0 op_wq(6) _process empty q, waiting 2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] process_copy_chunk complete r = (0) Success 2018-10-11 13:20:51.851 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] finish_promote 2:774f7988:ns::rbd_directory:head r=0 uv48 2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] simple_opc_create 2:774f7988:ns::rbd_directory:head 2018-10-11 13:20:51.851 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] finish_promote setting omap flag on 2:774f7988:ns::rbd_directory:head 2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] fill_in_final_tx: writing directly to final object 2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] _write_copy_chunk 0xfc4be10 0 attrs 0 bytes 0 omap header bytes 82 omap data bytes 2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] finish_promote new_snapset 0=[]:{} 2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] finish_promote took lock on obc, rwstate(write n=1 w=0) 2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] finish_ctx 2:774f7988:ns::rbd_directory:head 0x100c3000 op promote 2018-10-11 13:20:51.851 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] set mtime to 2018-10-11 13:20:51.853152 2018-10-11 13:20:51.851 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] final snapset 0=[]:{} in 2:774f7988:ns::rbd_directory:head 2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] finish_ctx extra_reqids [client.14323.0:1,0,osd.0.32:110,48,client.14315.0:2,48,client.14310.0:1,0,osd.0.32:105,41,client.14302.0:2,41,client.14302.0:1,0,client.14300.0:2,39,client.14300.0:1,0,client.14296.0:2,36]
... and then once it kicks the pending proxied writes to apply locally, it incorrectly determines that the original request was already completed with a result of 0:
2018-10-11 13:20:51.852 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] cancel_proxy_write 2:774f7988:ns::rbd_directory:head 2018-10-11 13:20:51.852 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] kick_proxy_ops_blocked 2:774f7988:ns::rbd_directory:head requeuing 1 requests 2018-10-11 13:20:51.852 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] requeue_op 0xfc90d00 2018-10-11 13:20:51.852 7f136a7a4700 20 osd.0 op_wq(2) _enqueue_front OpQueueItem(2.2 PGOpItem(op=osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8) prio 63 cost 17 e33) 2018-10-11 13:20:51.852 7f136779e700 20 osd.0 op_wq(2) _process 2.2 to_process <> waiting <> waiting_peering {} 2018-10-11 13:20:51.852 7f136779e700 20 osd.0 op_wq(2) _process OpQueueItem(2.2 PGOpItem(op=osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8) prio 63 cost 17 e33) queued 2018-10-11 13:20:51.852 7f136779e700 20 osd.0 op_wq(2) _process 2.2 to_process <OpQueueItem(2.2 PGOpItem(op=osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8) prio 63 cost 17 e33)> waiting <> waiting_peering {} 2018-10-11 13:20:51.852 7f136779e700 20 osd.0 op_wq(2) _process OpQueueItem(2.2 PGOpItem(op=osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8) prio 63 cost 17 e33) pg 0xf8ba800 2018-10-11 13:20:51.852 7f136779e700 10 osd.0 33 dequeue_op 0xfc90d00 prio 63 cost 17 latency 152.360047 osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8 pg pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] 2018-10-11 13:20:51.852 7f136779e700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] _handle_message: 0xfc90d00 2018-10-11 13:20:51.852 7f136779e700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] do_op: op osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8 2018-10-11 13:20:51.852 7f136779e700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] op_has_sufficient_caps session=0xfd26280 pool=2 (cache ns) pool_app_metadata={} need_read_cap=0 need_write_cap=1 classes=[class rbd method dir_state_set rd 1 wr 1 wl 1] -> yes 2018-10-11 13:20:51.852 7f136779e700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] do_op osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8 may_write may_read -> write-ordered flags ondisk+write+known_if_redirected 2018-10-11 13:20:51.852 7f136779e700 3 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] do_op dup client.14323.0:1 version 33'51 2018-10-11 13:20:51.852 7f136779e700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] already_complete: 33'51 2018-10-11 13:20:51.852 7f136779e700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] already_complete: repgather(0xfbc8c00 33'51 rep_tid=160 committed?=0 r=0) 2018-10-11 13:20:51.852 7f136779e700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] already_complete: repgather(0xfbc8c00 33'51 rep_tid=160 committed?=0 r=0) not committed, returning false 2018-10-11 13:20:51.852 7f136779e700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] waiting for 33'51 to commit
Updated by Jason Dillaman over 5 years ago
- Status changed from New to In Progress
- Assignee set to Jason Dillaman
Updated by Jason Dillaman over 5 years ago
- Status changed from In Progress to Fix Under Review
Updated by Sage Weil over 5 years ago
- Status changed from Fix Under Review to Resolved