Project

General

Profile

Actions

Bug #19133

closed

osd ops (sent and?) arrive at osd out of order

Added by Sage Weil about 7 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Immediate
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
jewel, kraken
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2017-03-03 00:45:10.853845 7f6155b81700 10 osd.5 95  new session 0x7f618d7bd520 con=0x7f618d59c780 addr=172.21.15.51:0/3868508237
2017-03-03 00:45:10.853874 7f6155b81700 10 osd.5 95  session 0x7f618d7bd520 client.0 has caps osdcap[grant(rwx)] 'allow rwx'
2017-03-03 00:45:10.860242 7f6155b81700  1 -- 172.21.15.29:6800/25665 <== client.24416 172.21.15.51:0/3868508237 1 ==== osd_op(client.24416.0:8 4.ba2d0104 (undecoded) ack+read+rwordered+known_if_redirected e95) v7 ==== 202+0+0 (1750122905 0 0) 0x7f618cc38940 con 0x
7f618d59c780
2017-03-03 00:45:12.607800 7f6155b81700  1 -- 172.21.15.29:6800/25665 <== client.24416 172.21.15.51:0/3868508237 2 ==== osd_op(client.24416.0:5 4.ba2d0104 (undecoded) ack+ondisk+write+known_if_redirected e96) v7 ==== 240+0+475249 (1354842370 0 3460332877) 0x7f618d3ea680 con 0x7f618d59c780

this is jewel-x upgrade singleton, /a/sage-2017-03-03_00:23:40-rados-wip-osd-full---basic-smithi/875693

the client is ceph_test_rados from jewel, 10.2.5-6122-g420a9a0

2017-03-03T00:45:12.612 INFO:tasks.rados.rados.0.smithi051.stdout:2:  finishing write tid 1 to smithi05121611-2
2017-03-03T00:45:12.613 INFO:tasks.rados.rados.0.smithi051.stderr:Error: finished tid 1 when last_acked_tid was 5
2017-03-03T00:45:12.613 INFO:tasks.rados.rados.0.smithi051.stderr:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7f47f2ffd700 time 2017-03-03 00:45:12.613911
2017-03-03T00:45:12.613 INFO:tasks.rados.rados.0.smithi051.stderr:./test/osd/RadosModel.h: 865: FAILED assert(0)
2017-03-03T00:45:12.626 INFO:tasks.rados.rados.0.smithi051.stderr: ceph version 10.2.5-6122-g420a9a0 (420a9a0796e327676cb704bc18b2020f9941d7b6)
2017-03-03T00:45:12.626 INFO:tasks.rados.rados.0.smithi051.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f4809aeae2b]
2017-03-03T00:45:12.626 INFO:tasks.rados.rados.0.smithi051.stderr: 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x4b8) [0x7f4809ad51b8]
2017-03-03T00:45:12.626 INFO:tasks.rados.rados.0.smithi051.stderr: 3: (write_callback(void*, void*)+0x19) [0x7f4809ae20e9]
2017-03-03T00:45:12.626 INFO:tasks.rados.rados.0.smithi051.stderr: 4: (()+0x9abcd) [0x7f48000f1bcd]
2017-03-03T00:45:12.626 INFO:tasks.rados.rados.0.smithi051.stderr: 5: (()+0x84399) [0x7f48000db399]
2017-03-03T00:45:12.626 INFO:tasks.rados.rados.0.smithi051.stderr: 6: (()+0x1693e6) [0x7f48001c03e6]
2017-03-03T00:45:12.626 INFO:tasks.rados.rados.0.smithi051.stderr: 7: (()+0x8184) [0x7f47ffa27184]
2017-03-03T00:45:12.627 INFO:tasks.rados.rados.0.smithi051.stderr: 8: (clone()+0x6d) [0x7f47fe39537d]

oh, there was a full transition. client log is

2017-03-03 00:45:10.842521 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813eca160 tid 0
2017-03-03 00:45:10.849290 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813ecd890 tid 0
2017-03-03 00:45:10.849313 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813ed0380 tid 0
2017-03-03 00:45:10.854969 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813ed0b70 tid 0
2017-03-03 00:45:10.861725 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813ed4da0 tid 0
2017-03-03 00:45:10.861740 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813ed7730 tid 0
2017-03-03 00:45:10.867943 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813ed9070 tid 0
2017-03-03 00:45:10.874244 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813edc6e0 tid 0
2017-03-03 00:45:10.874259 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813edf1d0 tid 0
2017-03-03 00:45:10.880997 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813edf9c0 tid 0
2017-03-03 00:45:10.886832 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813ee3c20 tid 0
2017-03-03 00:45:10.886846 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813ee65b0 tid 0
2017-03-03 00:45:10.894152 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813ee8000 tid 0
2017-03-03 00:45:10.899516 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813eeb580 tid 0
2017-03-03 00:45:10.899531 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813eee070 tid 0
2017-03-03 00:45:10.907506 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813eee860 tid 0
2017-03-03 00:45:10.912282 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813ef5ab0 tid 0
2017-03-03 00:45:10.912296 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813ef8440 tid 0
2017-03-03 00:45:10.918235 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813ef9d80 tid 0
2017-03-03 00:45:10.925065 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813efa560 tid 0
2017-03-03 00:45:10.925081 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813efd050 tid 0
2017-03-03 00:45:10.931547 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813efd840 tid 0
2017-03-03 00:45:10.938048 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813efec10 tid 0
2017-03-03 00:45:10.938059 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f015a0 tid 0
2017-03-03 00:45:10.945840 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f03000 tid 0
2017-03-03 00:45:10.950982 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f036f0 tid 0
2017-03-03 00:45:10.950994 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f061e0 tid 0
2017-03-03 00:45:10.959293 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f069d0 tid 0
2017-03-03 00:45:10.964026 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f07d30 tid 0
2017-03-03 00:45:10.964037 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f0a6c0 tid 0
2017-03-03 00:45:10.970449 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f0c110 tid 0
2017-03-03 00:45:10.977393 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f0c800 tid 0
2017-03-03 00:45:10.977409 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f0f2c0 tid 0
2017-03-03 00:45:10.984339 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f0fad0 tid 0
2017-03-03 00:45:10.990837 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f10ea0 tid 0
2017-03-03 00:45:10.990848 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f13830 tid 0
2017-03-03 00:45:10.998294 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f15170 tid 0
2017-03-03 00:45:11.004325 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f15950 tid 0
2017-03-03 00:45:11.004337 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f18440 tid 0
2017-03-03 00:45:11.012497 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f18c30 tid 0
2017-03-03 00:45:11.017860 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f19f90 tid 0
2017-03-03 00:45:11.017870 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f1c920 tid 0
2017-03-03 00:45:11.026548 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f1e260 tid 0
2017-03-03 00:45:11.031356 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f1ea40 tid 0
2017-03-03 00:45:11.031372 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f21530 tid 0
2017-03-03 00:45:11.038202 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f21d20 tid 0
2017-03-03 00:45:11.044735 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f230f0 tid 0
2017-03-03 00:45:11.044747 7f4809e00740  0 client.24416.objecter  FULL, paused modify 0x7f4813f25a80 tid 0

Related issues 4 (0 open4 closed)

Related to Ceph - Bug #19139: osdc/Objecter: If osd full, it should pause read op which w/ rwordered flagResolved03/03/2017

Actions
Related to Ceph - Bug #19430: objecter: full_try behavior not consistent with osdResolved03/30/2017

Actions
Copied to Ceph - Backport #19224: jewel: osd ops (sent and?) arrive at osd out of orderResolvedDavid ZafmanActions
Copied to Ceph - Backport #19225: kraken: osd ops (sent and?) arrive at osd out of orderRejectedActions
Actions

Also available in: Atom PDF