Project

General

Profile

Actions

Bug #36408

closed

[cache tier] failed guarded write + promotion results in "success" op result

Added by Jason Dillaman over 5 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
High
Assignee:
Jason Dillaman
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Simple reproducer:

ceph osd pool create cache 4
ceph osd tier add rbd cache
ceph osd tier cache-mode cache writeback
ceph osd tier set-overlay rbd cache
ceph osd pool set cache hit_set_type bloom
ceph osd pool set cache hit_set_count 8
ceph osd pool set cache hit_set_period 60
ceph osd pool set cache target_max_objects 250
rbd namespace create rbd ns  
rbd create rbd/ns/test -s 1G 
rbd namespace remove rbd ns 
rbd: namespace contains images which must be deleted first.
rados -p cache -N ns cache-flush rbd_directory 
rados -p cache -N ns cache-evict rbd_directory
rbd namespace remove rbd ns && echo FAILURE

If an object doesn't exist on the cache tier but a guarded write operation will cause it to be proxied and promoted, there is a chance that the proxy-write operation will be canceled and restarted when the promotion completes. However, when the original op is restarted against the now promoted object, the dup-op detection incorrectly compares the object against object version zero since the copy-from reqids do not include the result code (only requester and version):

In the logs below, a guarded write from the client kicks off a proxied write and promotion:

2018-10-11 13:18:19.493 7f1363796700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] do_op: op osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8
2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] do_op osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8 may_write may_read -> write-ordered flags ondisk+write+known_if_redirected
2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] get_object_context: found obc in cache: 0xfbe1080
2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] get_object_context: 0xfbe1080 2:774f7988:ns::rbd_directory:head rwstate(none n=0 w=0) oi: 2:774f7988:ns::rbd_directory:head(0'0 unknown.0.0:0 s 0 uv 0 alloc_hint [0 0 0]) exists: 0 ssc: 0x10070f20 snapset: 0=[]:{}
2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] find_object_context 2:774f7988:ns::rbd_directory:head @head oi=2:774f7988:ns::rbd_directory:head(0'0 unknown.0.0:0 s 0 uv 0 alloc_hint [0 0 0])
2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] agent_choose_mode flush_mode: idle evict_mode: idle num_objects: 9 num_bytes: 678 num_objects_dirty: 0 num_objects_omap: 1 num_dirty: 0 num_user_objects: 1 num_user_bytes: 300 num_overhead_bytes: 300 pool.info.target_max_bytes: 0 pool.info.target_max_objects: 250
2018-10-11 13:18:19.494 7f1363796700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] agent_choose_mode dirty 0 full 0.016129
2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] do_proxy_write Start proxy write for osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8
2018-10-11 13:18:19.494 7f1363796700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] maybe_promote missing_oid 2:774f7988:ns::rbd_directory:head  in_hit_set 1
2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] start_copy 2:774f7988:ns::rbd_directory:head from 2:774f7988:ns::rbd_directory:head @1;ns v0 flags 30 mirror_snapset
2018-10-11 13:18:19.494 7f1363796700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] _copy_some obc(2:774f7988:ns::rbd_directory:head(dne) rwstate(none n=0 w=0)) 0xfc4be10

The base tier receives the proxied guarded write:

2018-10-11 13:18:19.494 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op: op osd_op(client.14323.0:1 1.6 1:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+ignore_cache+ignore_overlay+known_if_redirected e33) v8
2018-10-11 13:18:19.494 7f1361792700 10 osd.0 33 class rbd method dir_state_set flags=rw
2018-10-11 13:18:19.494 7f136579a700 20 osd.0 op_wq(6) _process 1.6 to_process <> waiting <> waiting_peering {}
2018-10-11 13:18:19.494 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] op_has_sufficient_caps session=0xfd26500 pool=1 (rbd ns) pool_app_metadata={} need_read_cap=0 need_write_cap=1 classes=[class rbd method dir_state_set rd 1 wr 1 wl 1] -> yes
2018-10-11 13:18:19.494 7f136579a700 20 osd.0 op_wq(6) _process OpQueueItem(1.6 PGOpItem(op=osd_op(osd.0.32:113 1.6 1.119ef2ee (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8) prio 63 cost 0 e33) queued
2018-10-11 13:18:19.494 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op osd_op(client.14323.0:1 1.6 1:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+ignore_cache+ignore_overlay+known_if_redirected e33) v8 may_write may_read -> write-ordered flags ondisk+write+ignore_cache+ignore_overlay+known_if_redirected
2018-10-11 13:18:19.494 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] get_object_context: found obc in cache: 0x1000adc0
2018-10-11 13:18:19.494 7f1386000700 15 osd.0 33 enqueue_op 0xfc91520 prio 63 cost 29 latency 0.000111 epoch 33 osd_op(osd.0.32:114 1.6 1.119ef2ee (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8
2018-10-11 13:18:19.494 7f1386000700 20 osd.0 op_wq(6) _enqueue OpQueueItem(1.6 PGOpItem(op=osd_op(osd.0.32:114 1.6 1.119ef2ee (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8) prio 63 cost 29 e33)
2018-10-11 13:18:19.494 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] get_object_context: 0x1000adc0 1:774f7988:ns::rbd_directory:head rwstate(none n=0 w=0) oi: 1:774f7988:ns::rbd_directory:head(33'17 osd.0.32:110 dirty|omap s 0 uv 48 alloc_hint [0 0 0]) exists: 1 ssc: 0xfcde160 snapset: 0=[]:{}
2018-10-11 13:18:19.494 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] find_object_context 1:774f7988:ns::rbd_directory:head @head oi=1:774f7988:ns::rbd_directory:head(33'17 osd.0.32:110 dirty|omap s 0 uv 48 alloc_hint [0 0 0])
2018-10-11 13:18:19.494 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op obc obc(1:774f7988:ns::rbd_directory:head rwstate(excl n=1 w=0))
2018-10-11 13:18:19.495 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx 0x100d6000
2018-10-11 13:18:19.495 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx 1:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] ov 33'17 av 33'18 snapc 0=[] snapset 0=[]:{}
2018-10-11 13:18:19.495 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op 1:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete]
2018-10-11 13:18:19.495 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op  call rbd.dir_state_set
2018-10-11 13:18:19.495 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] call method rbd.dir_state_set

While the guarded write is in-flight, the base tier starts to process the promotion ops:

2018-10-11 13:20:51.848 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op 1:774f7988:ns::rbd_directory:head [stat]
2018-10-11 13:20:51.848 7f13847fd700 10 osd.0 33 ping from client.14323
2018-10-11 13:20:51.848 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op  stat
2018-10-11 13:20:51.848 7f13847fd700 10 osd.0 33 ping from client.14323
2018-10-11 13:20:51.848 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] stat oi has 0 2018-10-11 13:18:01.980537
2018-10-11 13:20:51.848 7f13847fd700 10 osd.0 33 ping from client.14323
2018-10-11 13:20:51.848 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op 1:774f7988:ns::rbd_directory:head [omap-get-vals]
2018-10-11 13:20:51.848 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op  omap-get-vals

... then the guarded write fails:

2018-10-11 13:20:51.849 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] Found key name_test
2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] method called response length=0
2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] calc_trim_to limit = 33'17
2018-10-11 13:20:51.849 7f1386000700 10 osd.0 33 ping from osd.0
2018-10-11 13:20:51.849 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx update_log_only -- result=-16
2018-10-11 13:20:51.849 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] record_write_error r=-16
2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] submit_log_entries 33'18 (0'0) error    1:774f7988:ns::rbd_directory:head by client.14323.0:1 0.000000 -16
2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] new_repop: repgather(0xfbc8600 33'18 rep_tid=159 committed?=0 r=-16)
2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'17 (0'0,33'17] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] merge_new_log_entries 33'18 (0'0) error    1:774f7988:ns::rbd_directory:head by client.14323.0:1 0.000000 -16
2018-10-11 13:20:51.849 7f1361792700 20 update missing, append 33'18 (0'0) error    1:774f7988:ns::rbd_directory:head by client.14323.0:1 0.000000 -16
2018-10-11 13:20:51.849 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 lua=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] append_log_entries_update_missing trim_to bool = 1 trim_to = 0'0
2018-10-11 13:20:51.849 7f1361792700 10 trim proposed trim_to = 0'0
2018-10-11 13:20:51.849 7f1361792700  5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 33'18, trimmed: , trimmed_dups: , clear_divergent_priors: 0
2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 lua=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] op_applied version 33'18
2018-10-11 13:20:51.849 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] calc_trim_to limit = 33'17

... and then the base tier proceeds to process the promotion ops:


2018-10-11 13:20:51.849 7f136579a700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op: op osd_op(osd.0.32:113 1.6 1:774f7988:ns::rbd_directory:snapdir [list-snaps] snapc 0=[] ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8
2018-10-11 13:20:51.849 7f13847fd700 10 osd.0 33 OSD::ms_get_authorizer type=mon
2018-10-11 13:20:51.849 7f136579a700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] op_has_sufficient_caps session=0xfd26500 pool=1 (rbd ns) pool_app_metadata={} need_read_cap=1 need_write_cap=0 classes=[] -> yes
2018-10-11 13:20:51.849 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op osd_op(osd.0.32:113 1.6 1:774f7988:ns::rbd_directory:snapdir [list-snaps] snapc 0=[] ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 may_read -> write-ordered flags ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected
2018-10-11 13:20:51.849 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] get_object_context: found obc in cache: 0x1000adc0
2018-10-11 13:20:51.849 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] get_object_context: 0x1000adc0 1:774f7988:ns::rbd_directory:head rwstate(none n=0 w=0) oi: 1:774f7988:ns::rbd_directory:head(33'17 osd.0.32:110 dirty|omap s 0 uv 48 alloc_hint [0 0 0]) exists: 1 ssc: 0xfcde160 snapset: 0=[]:{}
2018-10-11 13:20:51.849 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] find_object_context 1:774f7988:ns::rbd_directory:head @head oi=1:774f7988:ns::rbd_directory:head(33'17 osd.0.32:110 dirty|omap s 0 uv 48 alloc_hint [0 0 0])
2018-10-11 13:20:51.849 7f136579a700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op obc obc(1:774f7988:ns::rbd_directory:head rwstate(excl n=1 w=0))
2018-10-11 13:20:51.849 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx 0xfc39800
2018-10-11 13:20:51.849 7f1375873700 10 osd.0 33 ms_handle_connect con 0x100f4000
2018-10-11 13:20:51.850 7f1375873700 10 osd.0 33 ms_handle_connect on mon
2018-10-11 13:20:51.849 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx 1:774f7988:ns::rbd_directory:head [list-snaps] ov 33'17
2018-10-11 13:20:51.850 7f1375873700 10 osd.0 33 send_alive up_thru currently 32 want 32
2018-10-11 13:20:51.850 7f1375873700 10 osd.0 33 requeue_pg_temp 0 + 0 -> 0
2018-10-11 13:20:51.850 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op 1:774f7988:ns::rbd_directory:head [list-snaps]
2018-10-11 13:20:51.850 7f1375873700 20 osd.0 33 _send_ready_to_merge ready_to_merge_source  not_ready_to_merge_source  ready_to_merge_target {} not_ready_to_merge_target {} sent_ready_to_merge_source
2018-10-11 13:20:51.850 7f136579a700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op  list-snaps
2018-10-11 13:20:51.850 7f1375873700 10 osd.0 33 requeue_failures 0 + 0 -> 0
2018-10-11 13:20:51.850 7f1375873700 20 osd.0 33 send_beacon sending
2018-10-11 13:20:51.850 7f136579a700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean]  snapset 0=[]:{}
2018-10-11 13:20:51.850 7f136579a700 15 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op_effects osd.0 con 0xf8c6e00
2018-10-11 13:20:51.850 7f136579a700 15 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] log_op_stats osd_op(osd.0.32:113 1.6 1:774f7988:ns::rbd_directory:snapdir [list-snaps] snapc 0=[] ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 inb 0 outb 48 lat 152.356508
2018-10-11 13:20:51.850 7f136579a700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] publish_stats_to_osd reporting purged_snaps []
2018-10-11 13:20:51.850 7f136579a700 15 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] publish_stats_to_osd 33:92
2018-10-11 13:20:51.850 7f136579a700 10 osd.0 33 dequeue_op 0xfc90820 finish
2018-10-11 13:20:51.850 7f136579a700  1 heartbeat_map reset_timeout 'OSD::osd_op_tp thread 0x7f136579a700' had timed out after 60
2018-10-11 13:20:51.850 7f1361792700 20 osd.0 op_wq(6) _process 1.6 to_process <OpQueueItem(1.6 PGOpItem(op=osd_op(osd.0.32:114 1.6 1.119ef2ee (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8) prio 63 cost 29 e33)> waiting <> waiting_peering {}
2018-10-11 13:20:51.850 7f1361792700 20 osd.0 op_wq(6) _process OpQueueItem(1.6 PGOpItem(op=osd_op(osd.0.32:114 1.6 1.119ef2ee (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8) prio 63 cost 29 e33) pg 0xfa28000
2018-10-11 13:20:51.850 7f1361792700 10 osd.0 33 dequeue_op 0xfc91520 prio 63 cost 29 latency 152.356517 osd_op(osd.0.32:114 1.6 1.119ef2ee (undecoded) ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 pg pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean]
2018-10-11 13:20:51.850 7f1361792700 20 osd.0 33 share_map osd.0 192.168.1.192:52558/29482 33
2018-10-11 13:20:51.850 7f1361792700 20 osd.0 33 should_share_map osd.0 192.168.1.192:52558/29482 33
2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] _handle_message: 0xfc91520
2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op: op osd_op(osd.0.32:114 1.6 1:774f7988:ns::rbd_directory:head [copy-get max 524288] snapc 0=[] ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8
2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] op_has_sufficient_caps session=0xfd26500 pool=1 (rbd ns) pool_app_metadata={} need_read_cap=1 need_write_cap=0 classes=[] -> yes
2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op osd_op(osd.0.32:114 1.6 1:774f7988:ns::rbd_directory:head [copy-get max 524288] snapc 0=[] ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 may_read -> write-ordered flags ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected
2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] get_object_context: found obc in cache: 0x1000adc0
2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] get_object_context: 0x1000adc0 1:774f7988:ns::rbd_directory:head rwstate(none n=0 w=0) oi: 1:774f7988:ns::rbd_directory:head(33'17 osd.0.32:110 dirty|omap s 0 uv 48 alloc_hint [0 0 0]) exists: 1 ssc: 0xfcde160 snapset: 0=[]:{}
2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] find_object_context 1:774f7988:ns::rbd_directory:head @head oi=1:774f7988:ns::rbd_directory:head(33'17 osd.0.32:110 dirty|omap s 0 uv 48 alloc_hint [0 0 0])
2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_op obc obc(1:774f7988:ns::rbd_directory:head rwstate(excl n=1 w=0))
2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx 0x100d6000
2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] execute_ctx 1:774f7988:ns::rbd_directory:head [copy-get max 524288] ov 33'17
2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op 1:774f7988:ns::rbd_directory:head [copy-get max 524288]
2018-10-11 13:20:51.850 7f1361792700 10 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op  copy-get max 524288
2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean]  got attrs
2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean]  got data
2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean]  got omap
2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean]  got reqids
2018-10-11 13:20:51.850 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean]  cursor.is_complete=1 0 attrs 0 bytes 0 omap header bytes 82 omap data bytes in 3 keys 10 reqids
2018-10-11 13:20:51.851 7f1361792700 15 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] do_osd_op_effects osd.0 con 0xf8c6e00
2018-10-11 13:20:51.851 7f1361792700 15 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] log_op_stats osd_op(osd.0.32:114 1.6 1:774f7988:ns::rbd_directory:head [copy-get max 524288] snapc 0=[] ondisk+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e33) v8 inb 0 outb 535 lat 152.357216
2018-10-11 13:20:51.851 7f1361792700 20 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] publish_stats_to_osd reporting purged_snaps []
2018-10-11 13:20:51.851 7f1361792700 15 osd.0 pg_epoch: 33 pg[1.6( v 33'18 (0'0,33'18] local-lis/les=32/33 n=2 ec=6/6 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'17 crt=33'17 lcod 33'16 mlcod 33'16 active+clean] publish_stats_to_osd 33:93

... the cache tier starts to process the promotion. Note that the still in-flight client op (client.14323.0:1) is in the reqids response w/ a version of 0 and w/o the -16 error code:

2018-10-11 13:20:51.851 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] process_copy_chunk 2:774f7988:ns::rbd_directory:head tid 114 (0) Success
2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] process_copy_chunk success; committing
2018-10-11 13:20:51.851 7f136579a700 20 osd.0 op_wq(6) _process empty q, waiting
2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] process_copy_chunk complete r = (0) Success
2018-10-11 13:20:51.851 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] finish_promote 2:774f7988:ns::rbd_directory:head r=0 uv48
2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] simple_opc_create 2:774f7988:ns::rbd_directory:head
2018-10-11 13:20:51.851 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] finish_promote setting omap flag on 2:774f7988:ns::rbd_directory:head
2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] fill_in_final_tx: writing directly to final object
2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] _write_copy_chunk 0xfc4be10 0 attrs 0 bytes 0 omap header bytes 82 omap data bytes
2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] finish_promote new_snapset 0=[]:{}
2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] finish_promote took lock on obc, rwstate(write n=1 w=0)
2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] finish_ctx 2:774f7988:ns::rbd_directory:head 0x100c3000 op promote
2018-10-11 13:20:51.851 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean]  set mtime to 2018-10-11 13:20:51.853152
2018-10-11 13:20:51.851 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean]  final snapset 0=[]:{} in 2:774f7988:ns::rbd_directory:head
2018-10-11 13:20:51.851 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'50 (0'0,33'50] local-lis/les=32/33 n=9 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 crt=33'50 lcod 33'49 mlcod 33'49 active+clean] finish_ctx  extra_reqids [client.14323.0:1,0,osd.0.32:110,48,client.14315.0:2,48,client.14310.0:1,0,osd.0.32:105,41,client.14302.0:2,41,client.14302.0:1,0,client.14300.0:2,39,client.14300.0:1,0,client.14296.0:2,36]

... and then once it kicks the pending proxied writes to apply locally, it incorrectly determines that the original request was already completed with a result of 0:

2018-10-11 13:20:51.852 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] cancel_proxy_write 2:774f7988:ns::rbd_directory:head
2018-10-11 13:20:51.852 7f136a7a4700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] kick_proxy_ops_blocked 2:774f7988:ns::rbd_directory:head requeuing 1 requests
2018-10-11 13:20:51.852 7f136a7a4700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] requeue_op 0xfc90d00
2018-10-11 13:20:51.852 7f136a7a4700 20 osd.0 op_wq(2) _enqueue_front OpQueueItem(2.2 PGOpItem(op=osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8) prio 63 cost 17 e33)
2018-10-11 13:20:51.852 7f136779e700 20 osd.0 op_wq(2) _process 2.2 to_process <> waiting <> waiting_peering {}
2018-10-11 13:20:51.852 7f136779e700 20 osd.0 op_wq(2) _process OpQueueItem(2.2 PGOpItem(op=osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8) prio 63 cost 17 e33) queued
2018-10-11 13:20:51.852 7f136779e700 20 osd.0 op_wq(2) _process 2.2 to_process <OpQueueItem(2.2 PGOpItem(op=osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8) prio 63 cost 17 e33)> waiting <> waiting_peering {}
2018-10-11 13:20:51.852 7f136779e700 20 osd.0 op_wq(2) _process OpQueueItem(2.2 PGOpItem(op=osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8) prio 63 cost 17 e33) pg 0xf8ba800
2018-10-11 13:20:51.852 7f136779e700 10 osd.0 33 dequeue_op 0xfc90d00 prio 63 cost 17 latency 152.360047 osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8 pg pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean]
2018-10-11 13:20:51.852 7f136779e700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] _handle_message: 0xfc90d00
2018-10-11 13:20:51.852 7f136779e700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] do_op: op osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8
2018-10-11 13:20:51.852 7f136779e700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] op_has_sufficient_caps session=0xfd26280 pool=2 (cache ns) pool_app_metadata={} need_read_cap=0 need_write_cap=1 classes=[class rbd method dir_state_set rd 1 wr 1 wl 1] -> yes
2018-10-11 13:20:51.852 7f136779e700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] do_op osd_op(client.14323.0:1 2.2 2:774f7988:ns::rbd_directory:head [call rbd.dir_state_set,delete] snapc 0=[] ondisk+write+known_if_redirected e33) v8 may_write may_read -> write-ordered flags ondisk+write+known_if_redirected
2018-10-11 13:20:51.852 7f136779e700  3 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] do_op dup client.14323.0:1 version 33'51
2018-10-11 13:20:51.852 7f136779e700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] already_complete: 33'51
2018-10-11 13:20:51.852 7f136779e700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] already_complete: repgather(0xfbc8c00 33'51 rep_tid=160 committed?=0 r=0)
2018-10-11 13:20:51.852 7f136779e700 20 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean] already_complete: repgather(0xfbc8c00 33'51 rep_tid=160 committed?=0 r=0) not committed, returning false
2018-10-11 13:20:51.852 7f136779e700 10 osd.0 pg_epoch: 33 pg[2.2( v 33'51 (0'0,33'51] local-lis/les=32/33 n=10 ec=9/9 lis/c 32/32 les/c/f 33/33/0 32/32/32) [0] r=0 lpr=32 luod=33'50 crt=33'51 lcod 33'49 mlcod 33'49 active+clean]  waiting for 33'51 to commit

Actions #1

Updated by Jason Dillaman over 5 years ago

  • Description updated (diff)
Actions #2

Updated by Jason Dillaman over 5 years ago

  • Status changed from New to In Progress
  • Assignee set to Jason Dillaman
Actions #3

Updated by Jason Dillaman over 5 years ago

  • Status changed from In Progress to Fix Under Review
Actions #4

Updated by Sage Weil over 5 years ago

  • Status changed from Fix Under Review to Resolved
Actions

Also available in: Atom PDF