Bug #8931
failed write reply order from ceph_test_rados
Description
2014-07-24T15:14:02.281 INFO:teuthology.task.rados.rados.0.plana04.stdout:8519: finishing write tid 1 to plana0413881-8519
2014-07-24T15:14:02.281 INFO:teuthology.task.rados.rados.0.plana04.stderr:Error: finished tid 1 when last_acked_tid was 5
2014-07-24T15:14:02.282 INFO:teuthology.task.rados.rados.0.plana04.stderr:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7fe9167fc700 time 2014-07-24 15:14:02.280135
2014-07-24T15:14:02.282 INFO:teuthology.task.rados.rados.0.plana04.stderr:./test/osd/RadosModel.h: 828: FAILED assert(0)
2014-07-24T15:14:02.283 INFO:teuthology.task.rados.rados.0.plana04.stderr: ceph version 0.82-711-g7d13743 (7d137430aa5fea3311374769cff293f0c6d0d002)
2014-07-24T15:14:02.283 INFO:teuthology.task.rados.rados.0.plana04.stderr: 1: (WriteOp::_finish(TestOp::CallbackInfo*)+0x2e0) [0x417220]
2014-07-24T15:14:02.283 INFO:teuthology.task.rados.rados.0.plana04.stderr: 2: (write_callback(void*, void*)+0x19) [0x4272f9]
2014-07-24T15:14:02.283 INFO:teuthology.task.rados.rados.0.plana04.stderr: 3: (librados::C_AioSafe::finish(int)+0x1d) [0x7fe92272306d]
2014-07-24T15:14:02.284 INFO:teuthology.task.rados.rados.0.plana04.stderr: 4: (Context::complete(int)+0x9) [0x7fe9226ffdf9]
2014-07-24T15:14:02.284 INFO:teuthology.task.rados.rados.0.plana04.stderr: 5: (Finisher::finisher_thread_entry()+0x1b8) [0x7fe9227b1c48]
2014-07-24T15:14:02.284 INFO:teuthology.task.rados.rados.0.plana04.stderr: 6: (()+0x8182) [0x7fe922347182]
2014-07-24T15:14:02.284 INFO:teuthology.task.rados.rados.0.plana04.stderr: 7: (clone()+0x6d) [0x7fe921b5a30d]
http://pulpito.ceph.com/sage-2014-07-24_11:53:12-rados-master-testing-basic-plana/376363/
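For readers unfamiliar with the failure message, here is a minimal sketch of the kind of per-object ordering check that would produce "finished tid 1 when last_acked_tid was 5". It is not the code from RadosModel.h; the class WriteOrderChecker and the helper first_violation are hypothetical names used only for illustration, and the comparison rule is an assumption inferred from the error text.

```python
class WriteOrderChecker:
    """Hypothetical model: tracks the highest acknowledged write tid for one object."""

    def __init__(self):
        self.last_acked_tid = 0

    def finish(self, tid):
        # Assumption: a completion for a tid lower than the last acked tid
        # means the replies came back out of order.
        if tid < self.last_acked_tid:
            raise AssertionError(
                f"finished tid {tid} when last_acked_tid was {self.last_acked_tid}"
            )
        self.last_acked_tid = tid


def first_violation(tids):
    """Feed completions in arrival order; return the first error message, or None."""
    checker = WriteOrderChecker()
    for tid in tids:
        try:
            checker.finish(tid)
        except AssertionError as err:
            return str(err)
    return None
```

With completions arriving as 2, 3, 4, 5, 1, the sketch reports the same message seen in the log; with 1 through 5 in order it reports nothing.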
Associated revisions
osd/ReplicatedPG: requeue cache full waiters if no longer writeback
If the cache is full, we block some requests, and then we change the
cache_mode to something else (say, forward), the full waiters don't get
requeued until the cache becomes un-full. In the meantime, however, later
requests will get processed and redirected, breaking the op ordering.
Fix this by requeueing any full waiters if we see that the cache_mode is
not writeback.
Fixes: #8931
Signed-off-by: Sage Weil <sage@redhat.com>
osd/ReplicatedPG: requeue cache full waiters if no longer writeback
If the cache is full, we block some requests, and then we change the
cache_mode to something else (say, forward), the full waiters don't get
requeued until the cache becomes un-full. In the meantime, however, later
requests will get processed and redirected, breaking the op ordering.
Fix this by requeueing any full waiters if we see that the cache_mode is
not writeback.
Fixes: #8931
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 8fb761b660c268e2264d375a4db2f659a5c3a107)
History
#1 Updated by Sage Weil over 9 years ago
- Status changed from New to In Progress
- Backport set to firefly
- writeback mode
- write 1 received
- put on full list
- mode changes to forward
- write 2 received
- forwarded
- pool becomes unfull
- write 1 requeued, forwarded.
we need to requeue full waiters when the cache_mode changes.
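The sequence above can be replayed with a small toy model. This is only a sketch under stated assumptions, not the ReplicatedPG code: CachePoolModel, its methods, and the requeue_on_mode_change flag are hypothetical names, and "processed" simply records the order in which ops are handled. The flag models the fix described in the commit message (requeue any full waiters when the cache_mode is seen to be something other than writeback).

```python
from collections import deque


class CachePoolModel:
    """Hypothetical toy model of a cache pool with a full-waiters list."""

    def __init__(self, requeue_on_mode_change):
        self.mode = "writeback"
        self.full = False
        self.full_waiters = deque()  # ops blocked because the cache is full
        self.processed = []          # order in which ops are actually handled
        self.requeue_on_mode_change = requeue_on_mode_change

    def _requeue_full_waiters(self):
        # Handle blocked ops in their original arrival order.
        while self.full_waiters:
            self.processed.append(self.full_waiters.popleft())

    def submit(self, op):
        if self.mode != "writeback":
            # Modeled fix: flush older blocked ops before handling a newer one.
            if self.requeue_on_mode_change:
                self._requeue_full_waiters()
            self.processed.append(op)  # forwarded
        elif self.full:
            self.full_waiters.append(op)  # put on full list
        else:
            self.processed.append(op)

    def set_mode(self, mode):
        self.mode = mode

    def set_full(self, full):
        self.full = full
        if not full:
            # Without the fix, this is the only point where waiters are requeued.
            self._requeue_full_waiters()


def replay(requeue_on_mode_change):
    """Replay the scenario from this comment and return the processing order."""
    pool = CachePoolModel(requeue_on_mode_change)
    pool.set_full(True)
    pool.submit("write 1")    # writeback mode, cache full: blocked
    pool.set_mode("forward")
    pool.submit("write 2")    # forwarded
    pool.set_full(False)      # pool becomes unfull
    return pool.processed
```

In this model, replay(False) handles write 2 before write 1, which is the reordering the test caught, while replay(True) preserves the submission order.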
#2 Updated by Sage Weil over 9 years ago
- Status changed from In Progress to 7
#3 Updated by Sage Weil over 9 years ago
- Status changed from 7 to Fix Under Review
#4 Updated by Greg Farnum over 9 years ago
- Status changed from Fix Under Review to Pending Backport
Merged to master in 050ac87530c2637f097e07b5373115721303f07c
#5 Updated by Sage Weil over 9 years ago
- Status changed from Pending Backport to Resolved