Bug #46822

closed

ObjectCacher with read-ahead and overwrites might result in missed wake-up

Added by Jason Dillaman almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee: Jason Dillaman
Target version: -
% Done: 0%
Source:
Tags:
Backport: nautilus,octopus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

With the legacy ObjectCacher cache enabled, a read-ahead can queue an RX buffer head (BH) with an associated "BufferHead::waitfor_read" entry. If an overwrite of that BH then starts before the read-ahead offset, the waiters are never finalized, because "bh_read_finalize" performs a lower-bound lookup that misses the expanded BH created by the overwrite. A later discard of that range then fails the "waitfor_read.empty()" assertion:

2020-07-31T16:38:52.161 INFO:tasks.workunit.client.0.smithi060.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-3970-g96b86ff196a/rpm/el8/BUILD/ceph-16.0.0-3970-g96b86ff196a/src/osdc/ObjectCacher.cc: In function 'void ObjectCacher::Object::discard(loff_t, loff_t, C_GatherBuilder*)' thread 7f436bfff700 time 2020-07-31T16:38:52.156442+0000
2020-07-31T16:38:52.162 INFO:tasks.workunit.client.0.smithi060.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-3970-g96b86ff196a/rpm/el8/BUILD/ceph-16.0.0-3970-g96b86ff196a/src/osdc/ObjectCacher.cc: 643: FAILED ceph_assert(bh->waitfor_read.empty())
2020-07-31T16:38:52.162 INFO:tasks.workunit.client.0.smithi060.stderr: ceph version 16.0.0-3970-g96b86ff196a (96b86ff196a7e74005b92d792552513edb5c3523) pacific (dev)
2020-07-31T16:38:52.163 INFO:tasks.workunit.client.0.smithi060.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f43c3f78eb6]
2020-07-31T16:38:52.163 INFO:tasks.workunit.client.0.smithi060.stderr: 2: (()+0x2760d0) [0x7f43c3f790d0]
2020-07-31T16:38:52.163 INFO:tasks.workunit.client.0.smithi060.stderr: 3: (ObjectCacher::Object::discard(long, long, C_GatherBuilderBase<Context, C_GatherBase<Context, Context> >*)+0x71e) [0x55ce44f0c23e]
2020-07-31T16:38:52.163 INFO:tasks.workunit.client.0.smithi060.stderr: 4: (ObjectCacher::_discard(ObjectCacher::ObjectSet*, std::vector<ObjectExtent, std::allocator<ObjectExtent> > const&, C_GatherBuilderBase<Context, C_GatherBase<Context, Context> >*)+0x120) [0x55ce44f0c410]
2020-07-31T16:38:52.163 INFO:tasks.workunit.client.0.smithi060.stderr: 5: (ObjectCacher::discard_writeback(ObjectCacher::ObjectSet*, std::vector<ObjectExtent, std::allocator<ObjectExtent> > const&, Context*)+0x5b) [0x55ce44f1349b]
2020-07-31T16:38:52.164 INFO:tasks.workunit.client.0.smithi060.stderr: 6: (librbd::cache::ObjectCacherObjectDispatch<librbd::ImageCtx>::discard(unsigned long, unsigned long, unsigned long, SnapContext const&, int, ZTracer::Trace const&, int*, unsigned long*, librbd::io::DispatchResult*, Context**, Context*)+0x707) [0x55ce44da2237]
2020-07-31T16:38:52.164 INFO:tasks.workunit.client.0.smithi060.stderr: 7: (librbd::io::ObjectDispatcher<librbd::ImageCtx>::send_dispatch(librbd::io::ObjectDispatchInterface*, librbd::io::ObjectDispatchSpec*)+0x20f) [0x55ce44c85ddf]
2020-07-31T16:38:52.164 INFO:tasks.workunit.client.0.smithi060.stderr: 8: (librbd::io::Dispatcher<librbd::ImageCtx, librbd::io::ObjectDispatcherInterface>::send(librbd::io::ObjectDispatchSpec*)+0xee) [0x55ce44c8803e]
2020-07-31T16:38:52.164 INFO:tasks.workunit.client.0.smithi060.stderr: 9: (librbd::io::AbstractImageWriteRequest<librbd::ImageCtx>::send_object_requests(boost::container::small_vector<striper::LightweightObjectExtent, 4ul, void, void> const&, SnapContext const&, unsigned long)+0x413) [0x55ce44c83bd3]
2020-07-31T16:38:52.164 INFO:tasks.workunit.client.0.smithi060.stderr: 10: (librbd::io::AbstractImageWriteRequest<librbd::ImageCtx>::send_request()+0x1fd) [0x55ce44c83f1d]
2020-07-31T16:38:52.165 INFO:tasks.workunit.client.0.smithi060.stderr: 11: (librbd::io::ImageRequest<librbd::ImageCtx>::send()+0x134) [0x55ce44c82584]
2020-07-31T16:38:52.165 INFO:tasks.workunit.client.0.smithi060.stderr: 12: (librbd::io::ImageRequest<librbd::ImageCtx>::aio_discard(librbd::ImageCtx*, librbd::io::AioCompletion*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > >&&, unsigned int, ZTracer::Trace const&)+0x76) [0x55ce44c82b36]
2020-07-31T16:38:52.165 INFO:tasks.workunit.client.0.smithi060.stderr: 13: (librbd::io::ImageDispatch<librbd::ImageCtx>::discard(librbd::io::AioCompletion*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > >&&, unsigned int, ZTracer::Trace const&, unsigned long, std::atomic<unsigned int>*, librbd::io::DispatchResult*, Context*)+0x88) [0x55ce44de3488]
2020-07-31T16:38:52.165 INFO:tasks.workunit.client.0.smithi060.stderr: 14: (librbd::io::ImageDispatcher<librbd::ImageCtx>::send_dispatch(librbd::io::ImageDispatchInterface*, librbd::io::ImageDispatchSpec<librbd::ImageCtx>*)+0x16e) [0x55ce44c7976e]
2020-07-31T16:38:52.165 INFO:tasks.workunit.client.0.smithi060.stderr: 15: (librbd::io::Dispatcher<librbd::ImageCtx, librbd::io::ImageDispatcherInterface>::send(librbd::io::ImageDispatchSpec<librbd::ImageCtx>*)+0xee) [0x55ce44c7b5ee]
2020-07-31T16:38:52.166 INFO:tasks.workunit.client.0.smithi060.stderr: 16: (librbd::io::ImageDispatchSpec<librbd::ImageCtx>::C_Dispatcher::complete(int)+0x7d) [0x55ce44c7921d]
2020-07-31T16:38:52.166 INFO:tasks.workunit.client.0.smithi060.stderr: 17: (()+0x8100a2) [0x55ce44d8b0a2]
2020-07-31T16:38:52.166 INFO:tasks.workunit.client.0.smithi060.stderr: 18: (()+0xbeb72) [0x7f43cd4f8b72]
2020-07-31T16:38:52.166 INFO:tasks.workunit.client.0.smithi060.stderr: 19: (()+0xc34ea) [0x7f43cd4fd4ea]
2020-07-31T16:38:52.166 INFO:tasks.workunit.client.0.smithi060.stderr: 20: (()+0xc2b23) [0x7f43c2422b23]
2020-07-31T16:38:52.167 INFO:tasks.workunit.client.0.smithi060.stderr: 21: (()+0x82de) [0x7f43cd2222de]
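
The missed wake-up described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is not the ObjectCacher code and not the change made in the pull request. All names (ToyBufferHead, ToyCache, finish_read_naive, finish_read_overlap_aware, make_overwritten_cache) and the offsets are hypothetical, and the model assumes that the overwrite leaves a single extent keyed at its own, lower start offset that still carries the read waiter.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <iterator>
#include <map>
#include <utility>
#include <vector>

// Toy stand-in for a cached extent. This is NOT ObjectCacher::BufferHead;
// every name in this sketch is hypothetical.
struct ToyBufferHead {
  std::int64_t start = 0;
  std::int64_t length = 0;
  std::vector<std::function<void()>> waiters;  // callbacks parked on a read

  std::int64_t end() const { return start + length; }
};

// Extents keyed by their start offset.
using ToyCache = std::map<std::int64_t, ToyBufferHead>;

// Wake and clear the waiters of every extent from `it` up to `stop_off`.
static int wake_from(ToyCache& cache, ToyCache::iterator it,
                     std::int64_t stop_off) {
  int woken = 0;
  for (; it != cache.end() && it->first < stop_off; ++it) {
    for (auto& waiter : it->second.waiters) {
      waiter();
      ++woken;
    }
    it->second.waiters.clear();
  }
  return woken;
}

// Naive completion: begin at the first extent keyed at or after read_off.
// An extent that starts earlier but still covers read_off is skipped.
int finish_read_naive(ToyCache& cache, std::int64_t read_off,
                      std::int64_t read_len) {
  assert(read_len > 0);
  return wake_from(cache, cache.lower_bound(read_off), read_off + read_len);
}

// Overlap-aware completion: step back one extent if it covers read_off.
int finish_read_overlap_aware(ToyCache& cache, std::int64_t read_off,
                              std::int64_t read_len) {
  assert(read_len > 0);
  auto it = cache.lower_bound(read_off);
  if (it != cache.begin()) {
    auto prev = std::prev(it);
    if (prev->second.end() > read_off) {
      it = prev;
    }
  }
  return wake_from(cache, it, read_off + read_len);
}

// Assumed state after the overwrite: the read-ahead covered [100, 150) and
// had one waiter; the overwrite of [80, 150) left one extent keyed at 80
// that still holds that waiter.
ToyCache make_overwritten_cache(int* wake_count) {
  ToyBufferHead bh;
  bh.start = 80;
  bh.length = 70;
  bh.waiters.push_back([wake_count] { ++*wake_count; });
  const std::int64_t key = bh.start;
  ToyCache cache;
  cache.emplace(key, std::move(bh));
  return cache;
}
```

In this model, completing the read with finish_read_naive(cache, 100, 50) starts the scan past the extent keyed at 80, so its waiter stays parked; finish_read_overlap_aware(cache, 100, 50) steps back to that extent and wakes it. A parked waiter left behind in this way is the kind of state that the "waitfor_read.empty()" assertion in the log rejects.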

Related issues 2 (0 open, 2 closed)

Copied to rbd - Backport #47705: nautilus: ObjectCacher with read-ahead and overwrites might result in missed wake-up (Resolved, Nathan Cutler)
Copied to rbd - Backport #47706: octopus: ObjectCacher with read-ahead and overwrites might result in missed wake-up (Resolved, Wei-Chung Cheng)
Actions #1

Updated by Jason Dillaman over 3 years ago

  • Status changed from New to In Progress
  • Assignee set to Jason Dillaman
Actions #2

Updated by Jason Dillaman over 3 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 37286
Actions #3

Updated by Mykola Golub over 3 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #4

Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #47705: nautilus: ObjectCacher with read-ahead and overwrites might result in missed wake-up added
Actions #5

Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #47706: octopus: ObjectCacher with read-ahead and overwrites might result in missed wake-up added
Actions #6

Updated by Nathan Cutler over 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
