Bug #10118

messenger drops messages between osds

Added by Guang Yang over 9 years ago. Updated over 8 years ago.

Status:
Can't reproduce
Priority:
Urgent
Assignee:
Category:
OSD
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Log snippets before the daemon crash:

2014-10-27 20:27:32.320777 7f2d43661700  0 osd.165 pg_epoch: 1166 pg[3.1e8ds0(
v 1165'13077 lc 1018'6596 (1004'3595,1165'13077] local-les=1150 n=13058 ec=255
les/c 1150/1020 1148/1148/1038) [165,383,266,362,367,46,12,449,187,153,338] r=0
lpr=1148 pi=985-1147/4 crt=1030'11674 mlcod 1018'6596 active+recovery_wait
m=5070]  removing repgather(0x27a84e40 1165'13077 rep_tid=53667 committed?=1
applied?=1 op=osd_op(client.12473.0:48029435
default.12181.368_12227485465_50c2526085_o.jpg [create 0~0,setxattr
user.rgw.idtag (23),writefull 0~1900011,setxattr user.rgw.manifest
(472),setxattr user.rgw.acl (133),setxattr user.rgw.content_type (25),setxattr
user.rgw.etag (33),setxattr user.rgw.x-amz-meta-origin (57)] 3.21a69e8d
ondisk+write e1165) v4)

2014-10-27 20:27:32.352325 7f2d43661700  0 osd.165 pg_epoch: 1166 pg[3.1e8ds0(
v 1165'13077 lc 1018'6596 (1004'3595,1165'13077] local-les=1150 n=13058 ec=255
les/c 1150/1020 1148/1148/1038) [165,383,266,362,367,46,12,449,187,153,338] r=0
lpr=1148 pi=985-1147/4 crt=1030'11674 mlcod 1018'6596 active+recovery_wait
m=5070]    q front is repgather(0x99eaaf00 1165'13076 rep_tid=53646
committed?=0 applied?=1 lock=0 op=osd_op(client.12485.0:46420099
default.12402.391_14252946456_ff2cc332ba_o.jpg [create 0~0,setxattr
user.rgw.idtag (23),writefull 0~5898240,setxattr user.rgw.manifest
(472),setxattr user.rgw.acl (133),setxattr user.rgw.content_type (25),setxattr
user.rgw.etag (33),setxattr user.rgw.x-amz-meta-origin (57)] 3.33e9be8d
ondisk+write e1165) v4)

2014-10-27 20:27:32.431388 7f2d43661700 -1 osd/ReplicatedPG.cc: In function
'void ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)' thread 7f2d43661700
time 2014-10-27 20:27:32.378207
osd/ReplicatedPG.cc: 6670: FAILED assert(repop_queue.front() == repop)

Root cause is under investigation...

Ceph version: v0.80.4
Platform: RHEL6.5
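
For reference on what the assert checks: in the snippets above, the repop being removed (rep_tid=53667, committed?=1 applied?=1) is newer than the repop at the front of the queue (rep_tid=53646, committed?=0), i.e. a later repop finished while an earlier one was still waiting for its commit ack, even though repops are expected to complete strictly in submission order. A minimal hypothetical C++ sketch of that invariant (not the actual ReplicatedPG code; names and structure are illustrative only):

#include <cassert>
#include <cstdint>
#include <deque>

struct RepGather {
  uint64_t rep_tid = 0;     // assigned in submission order
  bool committed = false;   // commit acked by all replicas/shards
  bool applied = false;     // apply acked by all replicas/shards
};

struct PG {
  std::deque<RepGather*> repop_queue;   // repops in submission order

  RepGather* submit(uint64_t tid) {
    RepGather* r = new RepGather;
    r->rep_tid = tid;
    repop_queue.push_back(r);
    return r;
  }

  // Called as acks come back. Finished repops may only be retired from the
  // front of the queue, so acks are expected to complete in submission order.
  void eval_repop(RepGather* repop) {
    if (!(repop->committed && repop->applied))
      return;
    // If a later repop finishes while an earlier one is still pending
    // (e.g. because an ack message was lost), this fires -- the same
    // "FAILED assert(repop_queue.front() == repop)" as in the log above.
    assert(repop_queue.front() == repop);
    repop_queue.pop_front();
    delete repop;
  }
};

int main() {
  PG pg;
  RepGather* a = pg.submit(53646);
  RepGather* b = pg.submit(53667);
  a->committed = a->applied = true;
  pg.eval_repop(a);            // in-order completion: fine
  b->committed = b->applied = true;
  pg.eval_repop(b);            // would assert if b had completed before a
  return 0;
}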


Related issues

Related to Ceph - Bug #10057: msgr: skipped message on peer reconnect (Can't reproduce, 11/10/2014)

History

#1 Updated by Samuel Just over 9 years ago

If you can reproduce with logs, that would help. The repops are supposed to complete in strict order; this could perhaps be caused by a message ordering problem. Is this an EC pool?

#2 Updated by Guang Yang over 9 years ago

Samuel Just wrote:

If you can reproduce with logs, that would help. The repops are supposed to complete in strict order; this could perhaps be caused by a message ordering problem. Is this an EC pool?

Thanks Sam. So far we only saw this crash once in our cluster and it is not reproducible, sorry. It is an EC pool.

#3 Updated by Samuel Just over 9 years ago

  • Status changed from New to Need More Info

#4 Updated by Samuel Just over 9 years ago

  • Status changed from Need More Info to Can't reproduce

#5 Updated by David Zafman about 9 years ago

  • Status changed from Can't reproduce to 12

#6 Updated by David Zafman about 9 years ago

  • Assignee set to Samuel Just
  • Priority changed from Normal to Urgent

#7 Updated by Samuel Just about 9 years ago

From the logs, something happened in the messenger to change the message ordering:

osd 0:
2015-01-13 23:56:27.853186 7fbfd774d700 1 -- 10.214.131.35:6805/8353 --> 10.214.135.28:6805/29600 -- osd_repop(osd.0.0:9053 2.2 717ec72e/plana059826-5257/head//2 v 65'6939) v1 -- ?+46 0x697c680 con 0x4b3e680
2015-01-13 23:56:27.863348 7fbfddf5a700 1 -- 10.214.131.35:6805/8353 --> 10.214.135.28:6805/29600 -- osd_repop(client.4124.0:26286 2.2 717ec72e/plana059826-5257/head//2 v 65'6940) v1 -- ?+46 0x4afc580 con 0x4b3e680
2015-01-13 23:56:27.871850 7fbfddf5a700 1 -- 10.214.131.35:6805/8353 --> 10.214.135.28:6805/29600 -- osd_repop(client.4124.0:26287 2.2 717ec72e/plana059826-5257/head//2 v 65'6941) v1 -- ?+46 0x52d4680 con 0x4b3e680
2015-01-13 23:56:27.880163 7fbfddf5a700 1 -- 10.214.131.35:6805/8353 --> 10.214.135.28:6805/29600 -- osd_repop(client.4124.0:26288 2.2 717ec72e/plana059826-5257/head//2 v 65'6942) v1 -- ?+46 0x52d5700 con 0x4b3e680
2015-01-13 23:56:27.888183 7fbfdb755700 1 -- 10.214.131.35:6805/8353 --> 10.214.135.28:6805/29600 -- osd_repop(client.4124.0:26289 2.2 717ec72e/plana059826-5257/head//2 v 65'6943) v1 -- ?+46 0x561cc00 con 0x4b3e680
2015-01-13 23:56:27.896629 7fbfdb755700 1 -- 10.214.131.35:6805/8353 --> 10.214.135.28:6805/29600 -- osd_repop(client.4124.0:26290 2.2 717ec72e/plana059826-5257/head//2 v 65'6944) v1 -- ?+46 0x561b600 con 0x4b3e680
2015-01-13 23:56:28.044619 7fbfd2b24700 1 -- 10.214.131.35:6805/8353 <== osd.4 10.214.135.28:6805/29600 7614 ==== osd_repop_reply(osd.0.0:9053 2.2 ondisk, result = 0) v1 ==== 83+0+0 (3467515433 0 0) 0x7c113c0 con 0x4b3e680
2015-01-13 23:56:28.045701 7fbfd1009700 1 -- 10.214.131.35:6805/8353 <== osd.4 10.214.135.28:6805/29600 7618 ==== osd_repop_reply(client.4124.0:26288 2.2 ondisk, result = 0) v1 ==== 83+0+0 (4157578419 0 0) 0xa6ddd40 con 0x4b3e680
2015-01-13 23:56:28.045812 7fbfd1009700 1 -- 10.214.131.35:6805/8353 <== osd.4 10.214.135.28:6805/29600 7619 ==== osd_repop_reply(client.4124.0:26289 2.2 ondisk, result = 0) v1 ==== 83+0+0 (498426873 0 0) 0x56ad200 con 0x4b3e680
2015-01-13 23:56:28.045919 7fbfd1009700 1 -- 10.214.131.35:6805/8353 <== osd.4 10.214.135.28:6805/29600 7620 ==== osd_repop_reply(client.4124.0:26290 2.2 ondisk, result = 0) v1 ==== 83+0+0 (4101350087 0 0) 0x801efc0 con 0x4b3e680

osd 4:
2015-01-13 23:56:28.010639 7fa476490700 1 -- 10.214.135.28:6805/29600 <== osd.0 10.214.131.35:6805/8353 6900 ==== osd_repop(osd.0.0:9053 2.2 717ec72e/plana059826-5257/head//2 v 65'6939) v1 ==== 853+0+46 (167818985 0 3828118353) 0x5caa580 con 0x5072100
2015-01-13 23:56:28.010740 7fa476490700 1 -- 10.214.135.28:6805/29600 <== osd.0 10.214.131.35:6805/8353 6901 ==== osd_repop(client.4124.0:26286 2.2 717ec72e/plana059826-5257/head//2 v 65'6940) v1 ==== 853+0+46 (2273517897 0 3828118353) 0x5cac680 con 0x5072100
2015-01-13 23:56:28.010829 7fa476490700 1 -- 10.214.135.28:6805/29600 <== osd.0 10.214.131.35:6805/8353 6902 ==== osd_repop(client.4124.0:26287 2.2 717ec72e/plana059826-5257/head//2 v 65'6941) v1 ==== 853+0+46 (2179684516 0 3828118353) 0x5cab600 con 0x5072100
2015-01-13 23:56:28.010899 7fa476490700 1 -- 10.214.135.28:6805/29600 <== osd.0 10.214.131.35:6805/8353 6903 ==== osd_repop(client.4124.0:26288 2.2 717ec72e/plana059826-5257/head//2 v 65'6942) v1 ==== 853+0+46 (676712721 0 3828118353) 0x5cad700 con 0x5072100
2015-01-13 23:56:28.010963 7fa476490700 1 -- 10.214.135.28:6805/29600 <== osd.0 10.214.131.35:6805/8353 6904 ==== osd_repop(client.4124.0:26289 2.2 717ec72e/plana059826-5257/head//2 v 65'6943) v1 ==== 853+0+46 (1709767999 0 3828118353) 0x60be000 con 0x5072100
2015-01-13 23:56:28.011032 7fa476490700 1 -- 10.214.135.28:6805/29600 <== osd.0 10.214.131.35:6805/8353 6905 ==== osd_repop(client.4124.0:26290 2.2 717ec72e/plana059826-5257/head//2 v 65'6944) v1 ==== 853+0+46 (210697077 0 3828118353) 0x645e100 con 0x5072100
2015-01-13 23:56:28.044169 7fa490e3e700 1 -- 10.214.135.28:6805/29600 --> 10.214.131.35:6805/8353 -- osd_repop_reply(osd.0.0:9053 2.2 ondisk, result = 0) v1 -- ?+0 0x9ac8000 con 0x5072100
2015-01-13 23:56:28.044262 7fa490e3e700 1 -- 10.214.135.28:6805/29600 --> 10.214.131.35:6805/8353 -- osd_repop_reply(client.4124.0:26286 2.2 ondisk, result = 0) v1 -- ?+0 0x9acad00 con 0x5072100
2015-01-13 23:56:28.044300 7fa490e3e700 1 -- 10.214.135.28:6805/29600 --> 10.214.131.35:6805/8353 -- osd_repop_reply(client.4124.0:26287 2.2 ondisk, result = 0) v1 -- ?+0 0x9ac98c0 con 0x5072100
2015-01-13 23:56:28.044596 7fa490e3e700 1 -- 10.214.135.28:6805/29600 --> 10.214.131.35:6805/8353 -- osd_repop_reply(client.4124.0:26288 2.2 ondisk, result = 0) v1 -- ?+0 0x9ac9d40 con 0x5072100
2015-01-13 23:56:28.044798 7fa490e3e700 1 -- 10.214.135.28:6805/29600 --> 10.214.131.35:6805/8353 -- osd_repop_reply(client.4124.0:26289 2.2 ondisk, result = 0) v1 -- ?+0 0x9ac86c0 con 0x5072100
2015-01-13 23:56:28.044820 7fa490e3e700 1 -- 10.214.135.28:6805/29600 --> 10.214.131.35:6805/8353 -- osd_repop_reply(client.4124.0:26290 2.2 ondisk, result = 0) v1 -- ?+0 0x9acb840 con 0x5072100
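
In the osd 0 receive log above, the messenger sequence numbers jump from 7614 straight to 7618: the replies for client.4124.0:26286 and 26287 never show up on the primary, even though osd 4 logged sending all six replies in order on the same connection.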

#8 Updated by Sage Weil about 9 years ago

  • Subject changed from OSD crashes when evaluating the repop (ReplicatedPG::eval_repop) to messenger drops messages between osds
  • Source changed from other to Q/A

#9 Updated by Samuel Just about 9 years ago

/a/samuelj-2015-01-30_10:00:54-rados-wip-sam-working-wip-testing-vanilla-fixes-basic-multi/732275/remote/ceph-osd.msgs

The replica appears to send in sequence:
2015-01-30 11:13:51.032352 7fab381f7700 1 -- 10.214.132.20:6805/1905 --> 10.214.131.9:6809/23364 -- osd_repop_reply(osd.5.0:7012 2.2 ondisk, result = 0) v1 -- ?+0 0x6152d80 con 0x354d6e0
2015-01-30 11:13:51.032401 7fab381f7700 1 -- 10.214.132.20:6805/1905 --> 10.214.131.9:6809/23364 -- osd_repop_reply(osd.5.0:7013 2.2 ondisk, result = 0) v1 -- ?+0 0x6152b40 con 0x354d6e0
2015-01-30 11:13:51.032444 7fab381f7700 1 -- 10.214.132.20:6805/1905 --> 10.214.131.9:6809/23364 -- osd_repop_reply(osd.5.0:7014 2.2 ondisk, result = 0) v1 -- ?+0 0x6152900 con 0x354d6e0
2015-01-30 11:13:51.032490 7fab381f7700 1 -- 10.214.132.20:6805/1905 --> 10.214.131.9:6809/23364 -- osd_repop_reply(osd.5.0:7015 2.2 ondisk, result = 0) v1 -- ?+0 0x38b66c0 con 0x354d6e0
... (same for 7016-7031)
2015-01-30 11:13:51.033284 7fab381f7700 1 -- 10.214.132.20:6805/1905 --> 10.214.131.9:6809/23364 -- osd_repop_reply(osd.5.0:7032 2.2 ondisk, result = 0) v1 -- ?+0 0x34c06c0 con 0x354d6e0

The primary seems to receive:

2015-01-30 11:13:51.087705 7f8d9022f700 1 -- 10.214.131.9:6809/23364 <== osd.1 10.214.132.20:6805/1905 21886 ==== osd_repop_reply(osd.5.0:7012 2.2 ondisk, result = 0) v1 ==== 83+0+0 (1108995446 0 0) 0xb806ac0 con 0x508c940
2015-01-30 11:13:51.087748 7f8d9022f700 1 -- 10.214.131.9:6809/23364 <== osd.1 10.214.132.20:6805/1905 21887 ==== osd_repop_reply(osd.5.0:7013 2.2 ondisk, result = 0) v1 ==== 83+0+0 (1136980556 0 0) 0xb806880 con 0x508c940
2015-01-30 11:13:51.087790 7f8d9022f700 1 -- 10.214.131.9:6809/23364 <== osd.1 10.214.132.20:6805/1905 21906 ==== osd_repop_reply(osd.5.0:7032 2.2 ondisk, result = 0) v1 ==== 83+0+0 (2139817875 0 0) 0xb806640 con 0x508c940

The replies for the tids between 7013 and 7032 are missing. Also possibly interesting: when the replica sends 7032, the primary is still getting acks at tid 6898 and has not yet gotten the first ondisk back at tid 6657.
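
A quick way to spot these gaps mechanically is to scan the debug ms = 1 receive lines for non-consecutive per-peer sequence numbers. A minimal sketch, assuming the "<== <peer> <addr> <seq> ====" layout shown above (this is not a Ceph tool, and a legitimate seq reset on reconnect would also be reported as a gap):

#include <fstream>
#include <iostream>
#include <map>
#include <regex>
#include <string>

int main(int argc, char** argv) {
  if (argc != 2) {
    std::cerr << "usage: " << argv[0] << " <osd-log-with-debug-ms-1>\n";
    return 1;
  }
  std::ifstream in(argv[1]);
  std::map<std::string, unsigned long> last_seq;  // "<peer> <addr>" -> last seq seen
  const std::regex rx(R"(<== (\S+ \S+) (\d+) ====)");
  std::string line;
  while (std::getline(in, line)) {
    std::smatch m;
    if (!std::regex_search(line, m, rx))
      continue;
    std::string peer = m[1].str();
    unsigned long seq = std::stoul(m[2].str());
    auto it = last_seq.find(peer);
    // Report any jump in the receive sequence from this peer.
    if (it != last_seq.end() && seq != it->second + 1)
      std::cout << "seq gap from " << peer << ": " << it->second
                << " -> " << seq << "\n  " << line << "\n";
    last_seq[peer] = seq;
  }
  return 0;
}

Run against the primary's log, it should flag the 21887 -> 21906 jump above, and the 7614 -> 7618 jump in the earlier log.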

#10 Updated by Greg Farnum about 9 years ago

That's odd; this run does have all of the messenger delay options enabled, which might be relevant to this occurrence (although obviously not to the original report).

#11 Updated by Samuel Just over 8 years ago

  • Status changed from 12 to Can't reproduce
  • Regression set to No
