Bug #9064
RadosModel assertion failure
Description
2014-08-10T13:16:56.604 INFO:tasks.rados.rados.0.plana54.stdout:3005: finishing write tid 1 to plana5416145-3005
2014-08-10T13:16:56.604 INFO:tasks.rados.rados.0.plana54.stderr:Error: finished tid 1 when last_acked_tid was 5
2014-08-10T13:16:56.604 INFO:tasks.rados.rados.0.plana54.stderr:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7fde267fc700 time 2014-08-10 13:16:56.599123
2014-08-10T13:16:56.605 INFO:tasks.rados.rados.0.plana54.stderr:./test/osd/RadosModel.h: 828: FAILED assert(0)
2014-08-10T13:16:56.605 INFO:tasks.rados.rados.0.plana54.stderr: ceph version 0.83-697-g7be5977 (7be59771bff09e2b46b5467627cbd7bb8b60727a)
2014-08-10T13:16:56.605 INFO:tasks.rados.rados.0.plana54.stderr: 1: (WriteOp::_finish(TestOp::CallbackInfo*)+0x318) [0x419ff8]
2014-08-10T13:16:56.605 INFO:tasks.rados.rados.0.plana54.stderr: 2: (write_callback(void*, void*)+0x21) [0x427991]
2014-08-10T13:16:56.605 INFO:tasks.rados.rados.0.plana54.stderr: 3: (librados::C_AioSafe::finish(int)+0x1d) [0x7fde32afed7d]
2014-08-10T13:16:56.605 INFO:tasks.rados.rados.0.plana54.stderr: 4: (Context::complete(int)+0x9) [0x7fde32adc229]
2014-08-10T13:16:56.605 INFO:tasks.rados.rados.0.plana54.stderr: 5: (Finisher::finisher_thread_entry()+0x1c0) [0x7fde32b99120]
2014-08-10T13:16:56.606 INFO:tasks.rados.rados.0.plana54.stderr: 6: (()+0x7e9a) [0x7fde3273fe9a]
2014-08-10T13:16:56.606 INFO:tasks.rados.rados.0.plana54.stderr: 7: (clone()+0x6d) [0x7fde31f563fd]
Log messages for the object in question:
2014-08-10T13:16:52.589 INFO:tasks.rados.rados.0.plana54.stdout:3005: write initial oid 3005
2014-08-10T13:16:52.589 INFO:tasks.rados.rados.0.plana54.stdout:3005: seq_num 3004 ranges {729974=423607,1867088=462886}
2014-08-10T13:16:52.621 INFO:tasks.rados.rados.0.plana54.stdout:3005: writing plana5416145-3005 from 729974 to 1153581 tid 1
2014-08-10T13:16:52.650 INFO:tasks.rados.rados.0.plana54.stdout:3005: writing plana5416145-3005 from 1867088 to 2329974 tid 2
2014-08-10T13:16:54.924 INFO:tasks.rados.rados.0.plana54.stdout:3005: finishing write tid 5 to plana5416145-3005
2014-08-10T13:16:56.604 INFO:tasks.rados.rados.0.plana54.stdout:3005: finishing write tid 1 to plana5416145-3005
So that tid 5 is coming from somewhere else. This seems less like an ordering issue and more like something is simply getting corrupted: we're seeing a tid complete that we never saw start for that object.
Related issues
Associated revisions
ReplicatedPG::maybe_handle_cache: do not skip promote for write_ordered
We cannot redirect a RW ordered read.
Fixes: #9064
Introduced: 0ed3adc1e0a74bf9548d1d956aece11f019afee0
Signed-off-by: Samuel Just <sam.just@inktank.com>
ReplicatedPG::maybe_handle_cache: do not skip promote for write_ordered
We cannot redirect a RW ordered read.
Fixes: #9064
Introduced: 0ed3adc1e0a74bf9548d1d956aece11f019afee0
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 24aeca9c439474b079d23600d8d42c0af5d9ce5c)
History
#1 Updated by John Spray over 9 years ago
http://qa-proxy.ceph.com/teuthology/john-2014-08-10_02:14:59-rados-wip-mds-contexts-testing-basic-plana/411119/teuthology.log
http://qa-proxy.ceph.com/teuthology/john-2014-08-10_02:14:59-rados-wip-mds-contexts-testing-basic-plana/411120/teuthology.log
http://qa-proxy.ceph.com/teuthology/john-2014-08-10_02:14:59-rados-wip-mds-contexts-testing-basic-plana/411159/teuthology.log
#2 Updated by John Spray over 9 years ago
- Description updated (diff)
#3 Updated by John Spray over 9 years ago
I understand this a little better now: the operations in this WriteOp are 1,2 (writes), 4 (setxattr), 5 (read). So the read is apparently completing before any of the writes.
#4 Updated by John Spray over 9 years ago
Trying to reproduce locally with objecter logging turned up and ``ms inject socket failures`` enabled, as it is in the failing cases.
#5 Updated by John Spray over 9 years ago
Got an even more specific backtrace:
2014-08-11 19:35:36,654.654 INFO:teuthology.task.rados.rados.0.mira114.stdout:update_object_version oid 2506 v 1285 (ObjNum 2505 snap 0 seq_num 2505) dirty exists
2014-08-11 19:35:36,654.654 INFO:teuthology.task.rados.rados.0.mira114.stderr:Error: racing read on 2506 returned version 0 rather than version 1285
2014-08-11 19:35:36,655.655 INFO:teuthology.task.rados.rados.0.mira114.stderr:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7fec1b7fe700 time 2014-08-11 11:34:44.425107
2014-08-11 19:35:36,655.655 INFO:teuthology.task.rados.rados.0.mira114.stderr:./test/osd/RadosModel.h: 847: FAILED assert(0 == "racing read got wrong version")
2014-08-11 19:35:36,655.655 INFO:teuthology.task.rados.rados.0.mira114.stderr: ceph version 0.83-704-g1f44a2e (1f44a2ed1780ff8c04d5d040ccc694f31ab8f509)
2014-08-11 19:35:36,655.655 INFO:teuthology.task.rados.rados.0.mira114.stderr: 1: ceph_test_rados() [0x41a18b]
2014-08-11 19:35:36,655.655 INFO:teuthology.task.rados.rados.0.mira114.stderr: 2: (write_callback(void*, void*)+0x21) [0x4279b1]
2014-08-11 19:35:36,655.655 INFO:teuthology.task.rados.rados.0.mira114.stderr: 3: (librados::C_AioSafe::finish(int)+0x1d) [0x7fec2f994d7d]
2014-08-11 19:35:36,655.655 INFO:teuthology.task.rados.rados.0.mira114.stderr: 4: (Context::complete(int)+0x9) [0x7fec2f972229]
2014-08-11 19:35:36,655.655 INFO:teuthology.task.rados.rados.0.mira114.stderr: 5: (Finisher::finisher_thread_entry()+0x1c0) [0x7fec2fa2f120]
2014-08-11 19:35:36,655.655 INFO:teuthology.task.rados.rados.0.mira114.stderr: 6: (()+0x7e9a) [0x7fec2f5d5e9a]
2014-08-11 19:35:36,656.656 INFO:teuthology.task.rados.rados.0.mira114.stderr: 7: (clone()+0x6d) [0x7fec2edec3fd]
2014-08-11 19:35:36,656.656 INFO:teuthology.task.rados.rados.0.mira114.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
#6 Updated by John Spray over 9 years ago
The bug is happening when a new OSD map is received in the middle of the series of transactions. The read transaction is getting redirected to another OSD, while the write transactions are responded to by the original OSD. This results in the writes and reads flowing via different OSD sessions (and threads) in the objecter.
Given that this was triggered by changes to the client-side code, I presume it's meant to be the client's responsibility to sort out this ordering, even though it is in fact receiving the responses out of order (each individual OSD orders its replies correctly, but because replies from two OSDs are interleaved, they arrive at the client in undefined order).
Here's the exception:
2014-08-11T19:51:42.986 INFO:teuthology.task.rados.rados.0.mira114.stdout:2468: finishing write tid 4 to mira11419081-2468
2014-08-11T19:51:42.986 INFO:teuthology.task.rados.rados.0.mira114.stdout:update_object_version oid 2468 v 1973 (ObjNum 2467 snap 0 seq_num 2467) dirty exists
2014-08-11T19:51:42.986 INFO:teuthology.task.rados.rados.0.mira114.stderr:Error: racing read on 2468 returned version 0 rather than version 1973
2014-08-11T19:51:42.986 INFO:teuthology.task.rados.rados.0.mira114.stderr:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7f0779ffb700 time 2014-08-11 11:50:50.737857
2014-08-11T19:51:42.986 INFO:teuthology.task.rados.rados.0.mira114.stderr:./test/osd/RadosModel.h: 847: FAILED assert(0 == "racing read got wrong version")
2014-08-11T19:51:42.986 INFO:teuthology.task.rados.rados.0.mira114.stderr: ceph version 0.83-704-g1f44a2e (1f44a2ed1780ff8c04d5d040ccc694f31ab8f509)
2014-08-11T19:51:42.987 INFO:teuthology.task.rados.rados.0.mira114.stderr: 1: ceph_test_rados() [0x41a18b]
2014-08-11T19:51:42.987 INFO:teuthology.task.rados.rados.0.mira114.stderr: 2: (write_callback(void*, void*)+0x21) [0x4279b1]
2014-08-11T19:51:42.987 INFO:teuthology.task.rados.rados.0.mira114.stderr: 3: (librados::C_AioSafe::finish(int)+0x1d) [0x7f0786470d7d]
2014-08-11T19:51:42.987 INFO:teuthology.task.rados.rados.0.mira114.stderr: 4: (Context::complete(int)+0x9) [0x7f078644e229]
2014-08-11T19:51:42.987 INFO:teuthology.task.rados.rados.0.mira114.stderr: 5: (Finisher::finisher_thread_entry()+0x1c0) [0x7f078650b120]
2014-08-11T19:51:42.987 INFO:teuthology.task.rados.rados.0.mira114.stderr: 6: (()+0x7e9a) [0x7f07860b1e9a]
2014-08-11T19:51:42.987 INFO:teuthology.task.rados.rados.0.mira114.stderr: 7: (clone()+0x6d) [0x7f07858c83fd]
This is the log from the ceph_test_rados process for object mira11419081-2468:
9064_teuthology.log:2014-08-11T19:51:34.695 INFO:teuthology.task.rados.rados.0.mira114.stdout:2468: writing mira11419081-2468 from 729525 to 1364279 tid 1
9064_teuthology.log:2014-08-11T19:51:34.695 INFO:teuthology.task.rados.rados.0.mira114.stdout:2468: writing mira11419081-2468 from 1861328 to 2329525 tid 2
9064_teuthology.log:2014-08-11T19:51:37.708 INFO:teuthology.task.rados.rados.0.mira114.stdout:2468: finishing write tid 5 to mira11419081-2468
9064_teuthology.log:2014-08-11T19:51:42.982 INFO:teuthology.task.rados.rados.0.mira114.stdout:2468: finishing write tid 1 to mira11419081-2468
9064_teuthology.log:2014-08-11T19:51:42.985 INFO:teuthology.task.rados.rados.0.mira114.stdout:2468: finishing write tid 2 to mira11419081-2468
9064_teuthology.log:2014-08-11T19:51:42.986 INFO:teuthology.task.rados.rados.0.mira114.stdout:2468: finishing write tid 4 to mira11419081-2468
9064_ceph-client.0.19081.log:2014-08-11 11:50:42.427728 7f0787408780 10 client.4119.objecter _op_submit oid mira11419081-2468 @1 @2 [write 729525~634754] tid 0 osd.2
9064_ceph-client.0.19081.log:2014-08-11 11:50:42.444512 7f0787408780 10 client.4119.objecter _op_submit oid mira11419081-2468 @1 @2 [write 1861328~468197] tid 0 osd.2
9064_ceph-client.0.19081.log:2014-08-11 11:50:42.444572 7f0787408780 10 client.4119.objecter _op_submit oid mira11419081-2468 @1 @2 [setxattr _header (56),truncate 2329525] tid 0 osd.2
9064_ceph-client.0.19081.log:2014-08-11 11:50:42.444598 7f0787408780 10 client.4119.objecter _op_submit oid mira11419081-2468 @1 @2 [read 0~1] tid 0 osd.2
This is the log from the client log with debug_objecter=20 for object mira11419081-2468:
9064_ceph-client.0.19081.log:2014-08-11 11:50:45.451847 7f07817bc700 20 client.4119.objecter 14622 2.beafe6c5 osd.2 mira11419081-2468 [write 729525~634754]
9064_ceph-client.0.19081.log:2014-08-11 11:50:45.451855 7f07817bc700 20 client.4119.objecter 14623 2.beafe6c5 osd.2 mira11419081-2468 [write 1861328~468197]
9064_ceph-client.0.19081.log:2014-08-11 11:50:45.451863 7f07817bc700 20 client.4119.objecter 14624 2.beafe6c5 osd.2 mira11419081-2468 [setxattr _header (56),truncate 2329525]
9064_ceph-client.0.19081.log:2014-08-11 11:50:45.451872 7f07817bc700 20 client.4119.objecter 14625 2.beafe6c5 osd.2 mira11419081-2468 [read 0~1]
9064_ceph-client.0.19081.log:2014-08-11 11:50:45.460331 7f07801b4700 10 client.4119.objecter ms_dispatch 0x1d40440 osd_op_reply(14625 mira11419081-2468 [read 0~1] v0'0 uv0 ack = -2 ((2) No such file or directory) redirect: { object , locator{@1} }) v6
9064_ceph-client.0.19081.log:2014-08-11 11:50:45.460405 7f07801b4700 10 client.4119.objecter _op_submit oid mira11419081-2468 @1 @1 [read 0~1] tid 0 osd.5
9064_ceph-client.0.19081.log:2014-08-11 11:50:45.463611 7f07803b6700 10 client.4119.objecter ms_dispatch 0x1d40440 osd_op_reply(14793 mira11419081-2468 [read 0~1] v0'0 uv0 ack = -2 ((2) No such file or directory)) v6
9064_ceph-client.0.19081.log:2014-08-11 11:50:50.733817 7f07801b4700 10 client.4119.objecter ms_dispatch 0x1d40440 osd_op_reply(14622 mira11419081-2468 [write 729525~634754] v21'1179 uv1971 ondisk = 0) v6
9064_ceph-client.0.19081.log:2014-08-11 11:50:50.737458 7f07801b4700 10 client.4119.objecter ms_dispatch 0x1d40440 osd_op_reply(14623 mira11419081-2468 [write 1861328~468197] v21'1180 uv1972 ondisk = 0) v6
9064_ceph-client.0.19081.log:2014-08-11 11:50:50.737724 7f07801b4700 10 client.4119.objecter ms_dispatch 0x1d40440 osd_op_reply(14624 mira11419081-2468 [setxattr (56),truncate 2329525] v21'1181 uv1973 ondisk = 0) v6
The ops initially issued are:
2014-08-11 11:50:42.427747 7f0787408780 15 client.4119.objecter _send_op 14622 to osd.2
2014-08-11 11:50:42.444529 7f0787408780 15 client.4119.objecter _send_op 14623 to osd.2
2014-08-11 11:50:42.444576 7f0787408780 15 client.4119.objecter _send_op 14624 to osd.2
2014-08-11 11:50:42.444601 7f0787408780 15 client.4119.objecter _send_op 14625 to osd.2
Then we get a new OSD map:
2014-08-11 11:50:45.450774 7f07817bc700 3 client.4119.objecter handle_osd_map decoding incremental epoch 21
The read transaction (14625) is redirected via osd.5:
2014-08-11 11:50:45.460331 7f07801b4700 10 client.4119.objecter ms_dispatch 0x1d40440 osd_op_reply(14625 mira11419081-2468 [read 0~1] v0'0 uv0 ack = -2 ((2) No such file or directory) redirect: { object , locator{@1} }) v6
2014-08-11 11:50:45.460361 7f07801b4700 7 client.4119.objecter handle_osd_op_reply 14625 ack v 0'0 uv 0 in 2.beafe6c5 attempt 0
2014-08-11 11:50:45.460369 7f07801b4700 5 client.4119.objecter got redirect reply; redirecting
2014-08-11 11:50:45.460375 7f07801b4700 10 client.4119.objecter _op_submit op 0x21c66a0
2014-08-11 11:50:45.460392 7f07801b4700 10 client.4119.objecter _calc_target pgid 1.beafe6c5 acting [5,0]
2014-08-11 11:50:45.460401 7f07801b4700 20 client.4119.objecter note: not requesting commit
2014-08-11 11:50:45.460405 7f07801b4700 10 client.4119.objecter _op_submit oid mira11419081-2468 @1 @1 [read 0~1] tid 0 osd.5
2014-08-11 11:50:45.460419 7f07801b4700 15 client.4119.objecter _send_op 14793 to osd.5
2014-08-11 11:50:45.463611 7f07803b6700 10 client.4119.objecter ms_dispatch 0x1d40440 osd_op_reply(14793 mira11419081-2468 [read 0~1] v0'0 uv0 ack = -2 ((2) No such file or directory)) v6
2014-08-11 11:50:45.463634 7f07803b6700 7 client.4119.objecter handle_osd_op_reply 14793 ack v 0'0 uv 0 in 1.beafe6c5 attempt 1
2014-08-11 11:50:45.463651 7f07803b6700 15 client.4119.objecter handle_osd_op_reply completed tid 14793
2014-08-11 11:50:45.463656 7f07803b6700 15 client.4119.objecter finish_op 14793
Whereas a write transaction (e.g. 14622) is not:
2014-08-11 11:50:50.733817 7f07801b4700 10 client.4119.objecter ms_dispatch 0x1d40440 osd_op_reply(14622 mira11419081-2468 [write 729525~634754] v21'1179 uv1971 ondisk = 0) v6
2014-08-11 11:50:50.733825 7f07801b4700 7 client.4119.objecter handle_osd_op_reply 14622 ondisk v 21'1179 uv 1971 in 2.beafe6c5 attempt 0
2014-08-11 11:50:50.733834 7f07801b4700 15 client.4119.objecter handle_osd_op_reply completed tid 14622
2014-08-11 11:50:50.733835 7f07801b4700 15 client.4119.objecter finish_op 14622
#7 Updated by John Spray over 9 years ago
This just reproduced on master 78dc4df, so it looks like it's not wip-objecter specific.
#8 Updated by John Spray over 9 years ago
- Subject changed from (wip-objecter) RadosModel assertion failure to RadosModel assertion failure
#9 Updated by Samuel Just over 9 years ago
- Assignee set to Samuel Just
- Priority changed from Normal to Urgent
#10 Updated by John Spray over 9 years ago
- File config.yaml View added
- Assignee deleted (Samuel Just)
- Priority changed from Urgent to Normal
I've now seen this in a case where the client wasn't in the process of handling a new OSD map (but the server was), so I think this is probably an issue in the OSD sending redirects for old-epoch requests that it shouldn't be responding to.
There are logs from a failure on master in ~/jcsp/9064 on the teuthology server (this was run by hand hence the unconventional location).
#11 Updated by John Spray over 9 years ago
- Assignee set to Samuel Just
- Priority changed from Normal to Urgent
#12 Updated by Samuel Just over 9 years ago
Got it: 0ed3adc1e0a74bf9548d1d956aece11f019afee0
We're redirecting RW-ordered reads because of the promote-on-second-read logic.
#13 Updated by Samuel Just over 9 years ago
wip-9064
#14 Updated by Samuel Just over 9 years ago
- Status changed from New to 7
#15 Updated by Sage Weil over 9 years ago
- Status changed from 7 to Pending Backport
#16 Updated by Samuel Just over 9 years ago
- Status changed from Pending Backport to Resolved
#17 Updated by Loïc Dachary almost 9 years ago
- Status changed from Resolved to Pending Backport
- Backport set to firefly
- Regression set to No
2015-06-24 10:13:34,513.513 INFO:tasks.rados.rados.0.entercloudsuite012.stdout:695: finishing write tid 1 to entercloudsuite01212574-27
2015-06-24 10:13:34,513.513 INFO:tasks.rados.rados.0.entercloudsuite012.stderr:Error: finished tid 1 when last_acked_tid was 5
2015-06-24 10:13:34,514.514 INFO:tasks.rados.rados.0.entercloudsuite012.stderr:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7fc4e67fc700 time 2015-06-24 20:00:44.189448
2015-06-24 10:13:34,514.514 INFO:tasks.rados.rados.0.entercloudsuite012.stderr:./test/osd/RadosModel.h: 828: FAILED assert(0)
2015-06-24 10:13:34,515.515 INFO:tasks.rados.rados.0.entercloudsuite012.stderr: ceph version 0.80.10-2-g1623e6c (1623e6c6517ff6958b343493fce6a8e50a52a042)
2015-06-24 10:13:34,519.519 INFO:tasks.rados.rados.0.entercloudsuite012.stderr: 1: (WriteOp::_finish(TestOp::CallbackInfo*)+0x2e0) [0x417220]
2015-06-24 10:13:34,519.519 INFO:tasks.rados.rados.0.entercloudsuite012.stderr: 2: (write_callback(void*, void*)+0x19) [0x4274b9]
2015-06-24 10:13:34,519.519 INFO:tasks.rados.rados.0.entercloudsuite012.stderr: 3: (librados::C_AioSafe::finish(int)+0x1d) [0x7fc4f1a502bd]
2015-06-24 10:13:34,519.519 INFO:tasks.rados.rados.0.entercloudsuite012.stderr: 4: (Context::complete(int)+0x9) [0x7fc4f1a2d089]
2015-06-24 10:13:34,519.519 INFO:tasks.rados.rados.0.entercloudsuite012.stderr: 5: (Finisher::finisher_thread_entry()+0x1b8) [0x7fc4f1adeb68]
2015-06-24 10:13:34,520.520 INFO:tasks.rados.rados.0.entercloudsuite012.stderr: 6: (()+0x8182) [0x7fc4f1675182]
2015-06-24 10:13:34,520.520 INFO:tasks.rados.rados.0.entercloudsuite012.stderr: 7: (clone()+0x6d) [0x7fc4f0e8847d]
2015-06-24 10:13:34,520.520 INFO:tasks.rados.rados.0.entercloudsuite012.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2015-06-24 10:13:34,520.520 INFO:tasks.rados.rados.0.entercloudsuite012.stderr:terminate called after throwing an instance of 'ceph::FailedAssertion'
While running upgrade:firefly-x/parallel/{0-cluster/start.yaml 1-firefly-install/firefly.yaml 2-workload/{ec-rados-parallel.yaml rados_api.yaml rados_loadgenbig.yaml test_rbd_api.yaml test_rbd_python.yaml} 3-upgrade-sequence/upgrade-all.yaml 4-final-workload/{rados-snaps-few-objects.yaml rados_loadgenmix.yaml rados_mon_thrash.yaml rbd_cls.yaml rbd_import_export.yaml rgw_swift.yaml} distros/ubuntu_14.04.yaml}
#18 Updated by Loïc Dachary over 8 years ago
The commit fixing this issue http://tracker.ceph.com/projects/ceph/repository/revisions/24aeca9c439474b079d23600d8d42c0af5d9ce5c refers to 0ed3adc which was introduced by http://tracker.ceph.com/issues/8674#note-12
#19 Updated by Nathan Cutler over 8 years ago
- Status changed from Pending Backport to Resolved