Project

General

Profile

Actions

Bug #6582

closed

osd: copy_from vs read ordering broken

Added by Sage Weil over 10 years ago. Updated over 10 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
OSD
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

note uv 419 vs uv 499 on replies

remote/ubuntu@plana88.front.sepia.ceph.com/log/ceph-osd.0.log.gz:2013-10-17 14:27:15.393126 7fab6ac69700  1 -- 10.214.133.28:6800/7288 <== client.4124 10.214.133.28:0/1005194 62 ==== osd_op(client.4124.0:1076 plana885194-10 [copy-from ver 340] 3.2cbefa38 snapc 7b=[7b,7a] e159) v4 ==== 173+0+43 (1889961703 0 367423660) 0x2671480 con 0x2414000
remote/ubuntu@plana88.front.sepia.ceph.com/log/ceph-osd.0.log.gz:2013-10-17 14:27:15.393250 7fab6ac69700  1 -- 10.214.133.28:6800/7288 <== client.4124 10.214.133.28:0/1005194 63 ==== osd_op(client.4124.0:1077 plana885194-10 [stat] 3.2cbefa38 rwordered e159) v4 ==== 157+0+0 (3052344142 0 0) 0x2ffc900 con 0x2414000
remote/ubuntu@plana88.front.sepia.ceph.com/log/ceph-osd.0.log.gz:2013-10-17 14:27:43.902934 7fab6445c700  1 -- 10.214.133.28:6800/7288 --> 10.214.133.28:0/1005194 -- osd_op_reply(1077 plana885194-10 [stat] v0'0 uv419 ondisk = 0) v6 -- ?+0 0x2271a00 con 0x2414000
remote/ubuntu@plana64.front.sepia.ceph.com/log/ceph-osd.5.log.gz:2013-10-17 14:28:23.405323 7fd564bfd700  1 -- 10.214.132.14:6804/30090 <== client.4124 10.214.133.28:0/1005194 2 ==== osd_op(client.4124.0:1076 plana885194-10 [copy-from ver 340] 3.2cbefa38 RETRY=1 snapc 7b=[7b,7a] e209) v4 ==== 173+0+43 (1499181635 0 367423660) 0x248d6c0 con 0x255b840
remote/ubuntu@plana64.front.sepia.ceph.com/log/ceph-osd.5.log.gz:2013-10-17 14:28:25.635864 7fd55e3f0700  1 -- 10.214.132.14:6804/30090 --> 10.214.133.28:0/1005194 -- osd_op_reply(1076 plana885194-10 [copy-from ver 340] v211'499 uv499 ack = 0) v6 -- ?+0 0x242dc80 con 0x255b840
remote/ubuntu@plana64.front.sepia.ceph.com/log/ceph-osd.5.log.gz:2013-10-17 14:28:25.708504 7fd569326700  1 -- 10.214.132.14:6804/30090 --> 10.214.133.28:0/1005194 -- osd_op_reply(1076 plana885194-10 [copy-from ver 340] v211'499 uv499 ondisk = 0) v6 -- ?+0 0x32ec780 con 0x255b840

ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-10-17_10:45:09-rados-master-testing-basic-plana/57378

Actions #1

Updated by Sage Weil over 10 years ago

  • Assignee set to Sage Weil
Actions #2

Updated by Greg Farnum over 10 years ago

  • Assignee changed from Sage Weil to Greg Farnum

Let's clear up the timeline a bit here. Before any of this, object plana885194-10 actually already exists. Then, on osd.0:

2013-10-17 14:26:40.763567 7fab6ac69700  1 -- 10.214.133.28:6800/7288 <== client.4124 10.214.133.28:0/1005194 45 ==== osd_op(client.4124.0:925 plana885194-10 [copy-from ver 0] 3.2cbefa38 snapc 5e=[59,54,50,4e,3e,2d] e131) v4 ==== 205+0+44 (3365177847 0 697261250) 0x2671b40 con 0x2414000
2013-10-17 14:26:40.763694 7fab6ac69700  1 -- 10.214.133.28:6800/7288 <== client.4124 10.214.133.28:0/1005194 46 ==== osd_op(client.4124.0:926 plana885194-10 [stat] 3.2cbefa38 rwordered e131) v4 ==== 157+0+0 (2952573455 0 0) 0x2671d80 con 0x2414000
....
2013-10-17 14:26:41.856830 7fab63c5b700  1 -- 10.214.133.28:6804/7288 --> 10.214.132.14:6805/26021 -- osd_op(osd.0.0:3 plana885194-35 [copy-get max 8388608 ver 0] 3.5d869047 e132) v4 -- ?+0 0x23176c0 con 0x2787160
...
2013-10-17 14:26:41.861056 7fab66460700  1 -- 10.214.133.28:6804/7288 <== osd.4 10.214.132.14:6805/26021 1 ==== osd_op_reply(3 plana885194-35 [copy-get max 8388608 ver 0] v0'0 uv0 ack = -2 (No such file or directory)) v6 ==== 173+0+0 (3011776748 0 0) 0x2623780 con 0x2787160
2013-10-17 14:26:41.861176 7fab60c55700  1 -- 10.214.133.28:6800/7288 --> 10.214.133.28:0/1005194 -- osd_op_reply(925 plana885194-10 [copy-from ver 0] v0'0 uv0 ondisk = -2 (No such file or directory)) v6 -- ?+0 0x22c7500 con 0x2414000
2013-10-17 14:26:41.861487 7fab63c5b700  1 -- 10.214.133.28:6800/7288 --> 10.214.133.28:0/1005194 -- osd_op_reply(926 plana885194-10 [stat] v0'0 uv419 ondisk = 0) v6 -- ?+0 0x2686c80 con 0x2414000

So we got a copy-from pointing at an object that doesn't exist, say so, and then return a stat with details on the object we do have.

Then, about a minute later we get a similar request in again, only this time we don't see any reply on the copy-from, and the stat does provide a reply.

2013-10-17 14:27:15.393126 7fab6ac69700  1 -- 10.214.133.28:6800/7288 <== client.4124 10.214.133.28:0/1005194 62 ==== osd_op(client.4124.0:1076 plana885194-10 [copy-from ver 340] 3.2cbefa38 snapc 7b=[7b,7a] e159) v4 ==== 173+0+43 (1889961703 0 367423660) 0x2671480 con 0x2414000
2013-10-17 14:27:15.393250 7fab6ac69700  1 -- 10.214.133.28:6800/7288 <== client.4124 10.214.133.28:0/1005194 63 ==== osd_op(client.4124.0:1077 plana885194-10 [stat] 3.2cbefa38 rwordered e159) v4 ==== 157+0+0 (3052344142 0 0) 0x2ffc900 con 0x2414000
2013-10-17 14:27:43.902934 7fab6445c700  1 -- 10.214.133.28:6800/7288 --> 10.214.133.28:0/1005194 -- osd_op_reply(1077 plana885194-10 [stat] v0'0 uv419 ondisk = 0) v6 -- ?+0 0x2271a00 con 0x2414000

Notice that we still have uv419 here; this is the actual problem and is where the rwordered flag is apparently failing. Note the really long delay between receipt and handling; going through I can see that the OSD has sent off a request on plana885194-10, and retried it, but hasn't gotten any response. The stat response happens as the OSD is going through a bunch of peering — I have to assume that a bunch of mappings just changed and the OSD cancelled the copy-from because the acting set changed.

Yep! on_change() we simply cancel copy ops, but if the primary has not changed we will let other stuff through.

Actions #3

Updated by Greg Farnum over 10 years ago

And that's all on me; I ripped out Sage's mechanism for it when removing some other extraneous code. :(

Actions #4

Updated by Greg Farnum over 10 years ago

  • Status changed from New to Fix Under Review

Pushed wip-6582 which should correct this. It passes my basic local copy-from test but I'm waiting for it to build so I can do some teuthology tests on it.

Actions #5

Updated by Greg Farnum over 10 years ago

wip-6582 seems to be looping successfully without hitting this issue any more.

Actions #6

Updated by Sage Weil over 10 years ago

  • Status changed from Fix Under Review to Resolved
Actions

Also available in: Atom PDF