Bug #11057

osd: out of order write reply on hammer rc

Added by Sage Weil about 9 years ago. Updated about 9 years ago.

Status: Resolved
Priority: Urgent
Assignee: Samuel Just
Category: -
Target version: -
% Done: 0%

Source: Q/A
Regression: No
Severity: 3 - minor

Description

2015-03-06T14:40:01.693 INFO:tasks.rados.rados.0.burnupi09.stderr:Error: finished tid 1 when last_acked_tid was 6
2015-03-06T14:40:01.694 INFO:tasks.rados.rados.0.burnupi09.stderr:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7f5e9effd700 time 2015-03-06 14:40:01.693751
2015-03-06T14:40:01.694 INFO:tasks.rados.rados.0.burnupi09.stderr:./test/osd/RadosModel.h: 847: FAILED assert(0)
2015-03-06T14:40:01.694 INFO:tasks.rados.rados.0.burnupi09.stderr: ceph version 0.93-46-gdffa05f (dffa05fc4d749ec731d1500e3d3a993b3c17e449)
2015-03-06T14:40:01.694 INFO:tasks.rados.rados.0.burnupi09.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x4def6b]
2015-03-06T14:40:01.695 INFO:tasks.rados.rados.0.burnupi09.stderr: 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x46e) [0x4c70ae]
2015-03-06T14:40:01.695 INFO:tasks.rados.rados.0.burnupi09.stderr: 3: (write_callback(void*, void*)+0x19) [0x4d6639]
2015-03-06T14:40:01.696 INFO:tasks.rados.rados.0.burnupi09.stderr: 4: (()+0xb3c5d) [0x7f5eac915c5d]
2015-03-06T14:40:01.696 INFO:tasks.rados.rados.0.burnupi09.stderr: 5: (()+0x8f9c9) [0x7f5eac8f19c9]
2015-03-06T14:40:01.696 INFO:tasks.rados.rados.0.burnupi09.stderr: 6: (()+0x1529d8) [0x7f5eac9b49d8]
2015-03-06T14:40:01.697 INFO:tasks.rados.rados.0.burnupi09.stderr: 7: (()+0x8182) [0x7f5eac436182]
2015-03-06T14:40:01.697 INFO:tasks.rados.rados.0.burnupi09.stderr: 8: (clone()+0x6d) [0x7f5eaadbc38d]

ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2015-03-06_11:33:19-rados-wip-sage-testing-distro-basic-multi/793470
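
For reference, the assert fires because the test harness requires write completions to arrive in tid order: a completion for a tid at or below the last acked one means a reply was reordered. A condensed sketch of the check (simplified from WriteOp::_finish() in test/osd/RadosModel.h, not the verbatim source):

  #include <cassert>
  #include <cstdint>
  #include <iostream>

  // Tids are issued in increasing order and must be acked in that order;
  // a completion for an older tid after a newer one has already been acked
  // is an out-of-order write reply.
  struct WriteOrderChecker {
    uint64_t last_acked_tid = 0;
    void finish(uint64_t tid) {
      if (tid <= last_acked_tid) {
        std::cerr << "Error: finished tid " << tid
                  << " when last_acked_tid was " << last_acked_tid << std::endl;
        assert(0);  // the FAILED assert(0) seen in the log above
      }
      last_acked_tid = tid;
    }
  };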

History

#1 Updated by Sage Weil about 9 years ago

also ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2015-03-06_11:33:19-rados-wip-sage-testing-distro-basic-multi/793454

#2 Updated by Sage Weil about 9 years ago

The osd.5 log shows the replies going out of order: on the retry rounds the OSD acks 3285, 3286, and 3287 first, and only replies to 3284 after the RETRY=2 resend:
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:00.992916 7fb7b59c7700  1 -- 10.214.131.4:6808/6520 <== client.4154 10.214.132.21:0/1010708 592 ==== osd_op(client.4154.0:3284 plana5710708-655 [write 431932~729095] 2.9f6db71e ack+ondisk+write+known_if_redirected e40) v5 ==== 175+0+729095 (740002325 0 2989391686) 0x416aa80 con 0x56abc80
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:00.993746 7fb7d0530700  1 -- 10.214.131.4:0/6520 --> 10.214.132.21:6804/9024 -- osd_op(osd.5.3:1119 plana5710708-655@snapdir [list-snaps] 1.9f6db71e ack+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e40) v5 -- ?+0 0x4e54b00 con 0x65be1a0
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:00.993799 7fb7d0530700  1 -- 10.214.131.4:0/6520 --> 10.214.132.21:6804/9024 -- osd_op(osd.5.3:1120 plana5710708-655 [copy-get max 8388608] 1.9f6db71e ack+read+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e40) v5 -- ?+0 0x7efcb00 con 0x65be1a0
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:00.995880 7fb7b75ce700  1 -- 10.214.131.4:0/6520 <== osd.0 10.214.132.21:6804/9024 65 ==== osd_op_reply(1119 plana5710708-655 [list-snaps] v0'0 uv0 ack = -2 ((2) No such file or directory)) v6 ==== 183+0+0 (3749150846 0 0) 0x7eaaec0 con 0x65be1a0
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:00.996329 7fb7b75ce700  1 -- 10.214.131.4:0/6520 <== osd.0 10.214.132.21:6804/9024 66 ==== osd_op_reply(1120 plana5710708-655 [copy-get max 8388608] v0'0 uv0 ack = -2 ((2) No such file or directory)) v6 ==== 183+0+0 (2591800368 0 0) 0x6bce5c0 con 0x65be1a0
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:01.019621 7fb7b59c7700  1 -- 10.214.131.4:6808/6520 <== client.4154 10.214.132.21:0/1010708 593 ==== osd_op(client.4154.0:3285 plana5710708-655 [write 1886960~504945] 2.9f6db71e ack+ondisk+write+known_if_redirected e40) v5 ==== 175+0+504945 (4153636227 0 62118341) 0x4eed000 con 0x56abc80
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:01.019840 7fb7b59c7700  1 -- 10.214.131.4:6808/6520 <== client.4154 10.214.132.21:0/1010708 594 ==== osd_op(client.4154.0:3286 plana5710708-655 [write 2431931~1] 2.9f6db71e ack+ondisk+write+known_if_redirected e40) v5 ==== 175+0+1 (3012948971 0 3570033899) 0x4ee9b80 con 0x56abc80
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:01.020038 7fb7b59c7700  1 -- 10.214.131.4:6808/6520 <== client.4154 10.214.132.21:0/1010708 595 ==== osd_op(client.4154.0:3287 plana5710708-655 [setxattr _header (55),truncate 2431932] 2.9f6db71e ack+ondisk+write+known_if_redirected e40) v5 ==== 213+0+62 (1888623161 0 2691797762) 0x4eead00 con 0x56abc80
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:01.020230 7fb7b59c7700  1 -- 10.214.131.4:6808/6520 <== client.4154 10.214.132.21:0/1010708 596 ==== osd_op(client.4154.0:3288 plana5710708-655 [read 0~1] 2.9f6db71e ack+read+rwordered+known_if_redirected e40) v5 ==== 175+0+0 (1677540283 0 0) 0x4eea580 con 0x56abc80
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:03.728747 7fb7beaef700  1 -- 10.214.131.4:6808/6520 <== client.4154 10.214.132.21:0/1010708 1 ==== osd_op(client.4154.0:3284 plana5710708-655 [write 431932~729095] 2.9f6db71e RETRY=1 ack+ondisk+retry+write+known_if_redirected e42) v5 ==== 175+0+729095 (440595024 0 2989391686) 0x7ef8f00 con 0x56a9600
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:03.733000 7fb7beaef700  1 -- 10.214.131.4:6808/6520 <== client.4154 10.214.132.21:0/1010708 2 ==== osd_op(client.4154.0:3285 plana5710708-655 [write 1886960~504945] 2.9f6db71e RETRY=1 ack+ondisk+retry+write+known_if_redirected e42) v5 ==== 175+0+504945 (3251244486 0 62118341) 0x7ef8780 con 0x56a9600
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:03.733275 7fb7beaef700  1 -- 10.214.131.4:6808/6520 <== client.4154 10.214.132.21:0/1010708 3 ==== osd_op(client.4154.0:3286 plana5710708-655 [write 2431931~1] 2.9f6db71e RETRY=1 ack+ondisk+retry+write+known_if_redirected e42) v5 ==== 175+0+1 (2244774830 0 3570033899) 0x7ef9180 con 0x56a9600
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:03.733417 7fb7cdd2b700  1 -- 10.214.131.4:6808/6520 --> 10.214.132.21:0/1010708 -- osd_op_reply(3285 plana5710708-655 [write 1886960~504945] v40'763 uv763 ondisk = 0) v6 -- ?+0 0x4ff3f40 con 0x56a9600
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:03.733560 7fb7beaef700  1 -- 10.214.131.4:6808/6520 <== client.4154 10.214.132.21:0/1010708 4 ==== osd_op(client.4154.0:3287 plana5710708-655 [setxattr _header (55),truncate 2431932] 2.9f6db71e RETRY=1 ack+ondisk+retry+write+known_if_redirected e42) v5 ==== 213+0+62 (1376648407 0 2691797762) 0x7eff800 con 0x56a9600
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:03.733728 7fb7cdd2b700  1 -- 10.214.131.4:6808/6520 --> 10.214.132.21:0/1010708 -- osd_op_reply(3286 plana5710708-655 [write 2431931~1] v40'764 uv764 ondisk = 0) v6 -- ?+0 0x558b180 con 0x56a9600
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:03.733859 7fb7beaef700  1 -- 10.214.131.4:6808/6520 <== client.4154 10.214.132.21:0/1010708 5 ==== osd_op(client.4154.0:3288 plana5710708-655 [read 0~1] 2.9f6db71e RETRY=1 ack+retry+read+rwordered+known_if_redirected e42) v5 ==== 175+0+0 (1436828670 0 0) 0x7efad00 con 0x56a9600
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:03.734011 7fb7d0530700  1 -- 10.214.131.4:6808/6520 --> 10.214.132.21:0/1010708 -- osd_op_reply(3287 plana5710708-655 [setxattr _header (55),truncate 2431932] v40'765 uv765 ondisk = 0) v6 -- ?+0 0x65dcd00 con 0x56a9600
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:04.000616 7fb7beaef700  1 -- 10.214.131.4:6808/6520 <== client.4154 10.214.132.21:0/1010708 1 ==== osd_op(client.4154.0:3284 plana5710708-655 [write 431932~729095] 2.9f6db71e RETRY=2 ack+ondisk+retry+write+known_if_redirected e42) v5 ==== 175+0+729095 (1238404608 0 2989391686) 0x4581400 con 0x40d7380
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:04.000892 7fb7beaef700  1 -- 10.214.131.4:6808/6520 <== client.4154 10.214.132.21:0/1010708 2 ==== osd_op(client.4154.0:3288 plana5710708-655 [read 0~1] 2.9f6db71e RETRY=2 ack+retry+read+rwordered+known_if_redirected e42) v5 ==== 175+0+0 (104218542 0 0) 0x4583e80 con 0x40d7380
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:04.001015 7fb7d0530700  1 -- 10.214.131.4:6808/6520 --> 10.214.132.21:0/1010708 -- osd_op_reply(3284 plana5710708-655 [write 431932~729095] v40'762 uv762 ondisk = 0) v6 -- ?+0 0x608c780 con 0x40d7380
remote/plana36/log/ceph-osd.5.log.gz:2015-03-06 16:03:04.149179 7fb7cdd2b700  1 -- 10.214.131.4:6808/6520 --> 10.214.132.21:0/1010708 -- osd_op_reply(3288 plana5710708-655 [read 0~1] v0'0 uv765 ondisk = 0) v6 -- ?+0 0x7dbd800 con 0x40d7380

#3 Updated by Sage Weil about 9 years ago

I think this is related to the degraded-writes path. For client.4154.0:3284, on the RETRY=1 round we see

2015-03-06 16:03:03.729241 7fb7cdd2b700  7 osd.5 pg_epoch: 42 pg[2.2( v 40'765 (30'500,40'765] local-les=42 n=151 ec=9 les/c 42/10 41/41/9) [5,4,3]/[5,0,1] r=0 lpr=41 pi=9-40/7 rops=1 bft=3,4 crt=40'746 lcod 40'746 mlcod 0'0 active+recovery_wait+degraded+remapped] degraded 9f6db71e/plana5710708-655/head//2, already recovering

then we get

2015-03-06 16:03:03.731219 7fb7bb3e2700  1 -- 10.214.131.4:6809/6520 <== osd.0 10.214.132.21:6805/9024 2045 ==== MOSDPGPushReply(2.2 42 [PushReplyOp(9f6db71e/plana5710708-655/head//2)]) v2 ==== 106+0+0 (3981500165 0 0) 0x457ea00 con 0x42e6880
2015-03-06 16:03:03.731301 7fb7bb3e2700 10 osd.5 42 handle_replica_op MOSDPGPushReply(2.2 42 [PushReplyOp(9f6db71e/plana5710708-655/head//2)]) v2 epoch 42
2015-03-06 16:03:03.731312 7fb7bb3e2700 20 osd.5 42 should_share_map osd.0 10.214.132.21:6805/9024 42
2015-03-06 16:03:03.731324 7fb7bb3e2700 15 osd.5 42 enqueue_op 0x556f300 prio 63 cost 1049576 latency 0.000232 MOSDPGPushReply(2.2 42 [PushReplyOp(9f6db71e/plana5710708-655/head//2)]) v2
2015-03-06 16:03:03.731405 7fb7d0530700 10 osd.5 42 dequeue_op 0x556f300 prio 63 cost 1049576 latency 0.000313 MOSDPGPushReply(2.2 42 [PushReplyOp(9f6db71e/plana5710708-655/head//2)]) v2 pg pg[2.2( v 40'765 (30'500,40'765] local-les=42 n=151 ec=9 les/c 42/10 41/41/9) [5,4,3]/[5,0,1] r=0 lpr=41 pi=9-40/7 rops=1 bft=3,4 crt=40'746 lcod 40'746 mlcod 0'0 active+recovery_wait+degraded+remapped]
2015-03-06 16:03:03.731455 7fb7d0530700 10 osd.5 pg_epoch: 42 pg[2.2( v 40'765 (30'500,40'765] local-les=42 n=151 ec=9 les/c 42/10 41/41/9) [5,4,3]/[5,0,1] r=0 lpr=41 pi=9-40/7 rops=1 bft=3,4 crt=40'746 lcod 40'746 mlcod 0'0 active+recovery_wait+degraded+remapped] handle_message: 0x556f300
2015-03-06 16:03:03.731487 7fb7d0530700 20 osd.5 pg_epoch: 42 pg[2.2( v 40'765 (30'500,40'765] local-les=42 n=151 ec=9 les/c 42/10 41/41/9) [5,4,3]/[5,0,1] r=0 lpr=41 pi=9-40/7 rops=1 bft=3,4 crt=40'746 lcod 40'746 mlcod 0'0 active+recovery_wait+degraded+remapped] _update_calc_stats osd.4 misplaced 151 but partially backfilled 67
2015-03-06 16:03:03.731502 7fb7d0530700 20 osd.5 pg_epoch: 42 pg[2.2( v 40'765 (30'500,40'765] local-les=42 n=151 ec=9 les/c 42/10 41/41/9) [5,4,3]/[5,0,1] r=0 lpr=41 pi=9-40/7 rops=1 bft=3,4 crt=40'746 lcod 40'746 mlcod 0'0 active+recovery_wait+degraded+remapped] _update_calc_stats osd.3 misplaced 151 but partially backfilled 5
2015-03-06 16:03:03.731516 7fb7d0530700 15 osd.5 pg_epoch: 42 pg[2.2( v 40'765 (30'500,40'765] local-les=42 n=151 ec=9 les/c 42/10 41/41/9) [5,4,3]/[5,0,1] r=0 lpr=41 pi=9-40/7 rops=1 bft=3,4 crt=40'746 lcod 40'746 mlcod 0'0 active+recovery_wait+degraded+remapped] publish_stats_to_osd 42:327
2015-03-06 16:03:03.731547 7fb7d0530700 10 osd.5 pg_epoch: 42 pg[2.2( v 40'765 (30'500,40'765] local-les=42 n=151 ec=9 les/c 42/10 41/41/9) [5,4,3]/[5,0,1] r=0 lpr=41 pi=9-40/7 rops=1 bft=3,4 crt=40'746 lcod 40'746 mlcod 0'0 active+recovery_wait+degraded+remapped] pushed 9f6db71e/plana5710708-655/head//2, still waiting for push ack from 1 others

and then the next request, client.4154.0:3285, comes in and is recognized as a dup of an op that already committed, so it is acked immediately:

2015-03-06 16:03:03.733400 7fb7cdd2b700  3 osd.5 pg_epoch: 42 pg[2.2( v 40'765 (30'500,40'765] local-les=42 n=151 ec=9 les/c 42/10 41/41/9) [5,4,3]/[5,0,1] r=0 lpr=41 pi=9-40/7 rops=1 bft=3,4 crt=40'746 lcod 40'746 mlcod 0'0 active+recovery_wait+degraded+remapped] do_op dup client.4154.0:3285 was 40'763

So 3284 stays parked waiting for recovery while its successors are answered as dups, which is exactly the out-of-order reply the client asserts on.

#4 Updated by Sage Weil about 9 years ago

As the peer is recovered, we go from below min_size to at or above min_size, which changes our decision here:

  // If this is a write and the head object is still degraded, park the op
  // until recovery, unless a degraded write is safe and allowed: enough
  // valid copies (>= min_size), a replicated (non-EC) pool, the option
  // enabled, and every peer supporting the feature.
  int valid_copies = 0;
  if (write_ordered &&
      is_degraded_object(head, &valid_copies) &&
      (valid_copies < pool.info.min_size ||
       pool.info.ec_pool() ||
       !cct->_conf->osd_enable_degraded_writes ||
       !(get_min_peer_features() & CEPH_FEATURE_OSD_DEGRADED_WRITES))) {
    wait_for_degraded_object(head, op);
    return;
  }

I think the condition needs to be factored into a helper, and ReplicatedBackend::handle_push_reply() (and maybe an equivalent method on the EC side?) needs to watch for that transition and requeue_ops() ... (a sketch of that shape follows below).
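
To illustrate, a rough sketch only (both method names here are hypothetical; the actual fix is whatever landed in wip-11057): the gate above becomes one predicate shared by the op path and the recovery path, and the push-reply handling rechecks it and requeues anything parked on the object:

  // Hypothetical helper: the same predicate do_op() would use above, so the
  // op path and the recovery path agree on when writes must wait.
  bool ReplicatedPG::write_blocked_by_degraded_object(const hobject_t &soid)
  {
    int valid_copies = 0;
    if (!is_degraded_object(soid, &valid_copies))
      return false;
    return valid_copies < pool.info.min_size ||
           pool.info.ec_pool() ||
           !cct->_conf->osd_enable_degraded_writes ||
           !(get_min_peer_features() & CEPH_FEATURE_OSD_DEGRADED_WRITES);
  }

  // Hypothetical hook, called when a push reply (or the EC equivalent)
  // makes the object less degraded: once writes are no longer blocked,
  // hand any parked ops back to the queue in their original order.
  void ReplicatedPG::recheck_degraded_waiters(const hobject_t &soid)
  {
    map<hobject_t, list<OpRequestRef> >::iterator p =
      waiting_for_degraded_object.find(soid);
    if (p == waiting_for_degraded_object.end())
      return;
    if (!write_blocked_by_degraded_object(soid)) {
      requeue_ops(p->second);
      waiting_for_degraded_object.erase(p);
    }
  }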

#5 Updated by Sage Weil about 9 years ago

  • Assignee set to Samuel Just

#6 Updated by Samuel Just about 9 years ago

I think https://github.com/athanatos/ceph/tree/wip-11057 should do it, adding to wip-sam-testing.

#7 Updated by Sage Weil about 9 years ago

  • Status changed from New to 7

#8 Updated by Sage Weil about 9 years ago

  • Status changed from 7 to Resolved
