Bug #7122
push 0/hit_set_... v 0'0 failed because local copy is 818'7131
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
-2> 2014-01-08 23:06:44.830533 7f70979af700 7 osd.0 pg_epoch: 818 pg[4.0( v 818'7131 (390'4060,818'7131] local-les=814 n=155 ec=7 les/c 814/759 809/813/813) [4,0]/[0,3,4] r=0 lpr=813 pi=731-812/2 rops=1 bft=4 lcod 808'7130 mlcod 0'0 active+remapped+backfilling] send_push_op 0/hit_set_4.0_archive_2014-01-08 23:06:37.006160_2014-01-08 23:06:44.156127/head/.ceph-internal/4 v 0'0 size 0 recovery_info: ObjectRecoveryInfo(0/hit_set_4.0_archive_2014-01-08 23:06:37.006160_2014-01-08 23:06:44.156127/head/.ceph-internal/4@0'0, copy_subset: [], clone_subset: {})
-1> 2014-01-08 23:06:44.830654 7f70979af700 0 log [ERR] : 4.0 push 0/hit_set_4.0_archive_2014-01-08 23:06:37.006160_2014-01-08 23:06:44.156127/head/.ceph-internal/4 v 0'0 failed because local copy is 818'7131
0> 2014-01-08 23:06:44.843891 7f70979af700 -1 osd/ReplicatedPG.cc: In function 'void ReplicatedBackend::prep_push(ObjectContextRef, const hobject_t&, int, eversion_t, interval_set<long unsigned int>&, std::map<hobject_t, interval_set<long unsigned int> >&, PushOp*)' thread 7f70979af700 time 2014-01-08 23:06:44.830661
osd/ReplicatedPG.cc: 7172: FAILED assert(r == 0)
ceph version 0.73-762-g1f6ddcf (1f6ddcf8cd8cc0087773bc9d197ddfbf12e2adec)
1: (ReplicatedBackend::prep_push(std::tr1::shared_ptr<ObjectContext>, hobject_t const&, int, eversion_t, interval_set<unsigned long>&, std::map<hobject_t, interval_set<unsigned long>, std::less<hobject_t>, std::allocator<std::pair<hobject_t const, interval_set<unsigned long> > > >&, PushOp*)+0x2e2) [0x7c7ae2]
2: (ReplicatedBackend::prep_push_to_replica(std::tr1::shared_ptr<ObjectContext>, hobject_t const&, int, PushOp*)+0x378) [0x7c8038]
3: (ReplicatedBackend::start_pushes(hobject_t const&, std::tr1::shared_ptr<ObjectContext>, ReplicatedBackend::RPGHandle*)+0x588) [0x7c8cf8]
4: (ReplicatedBackend::recover_object(hobject_t const&, std::tr1::shared_ptr<ObjectContext>, std::tr1::shared_ptr<ObjectContext>, PGBackend::RecoveryHandle*)+0xeb) [0x80fb3b]
5: (ReplicatedPG::prep_backfill_object_push(hobject_t, eversion_t, eversion_t, std::tr1::shared_ptr<ObjectContext>, int, PGBackend::RecoveryHandle*)+0x586) [0x795936]
6: (ReplicatedPG::recover_backfill(int, ThreadPool::TPHandle&, bool*)+0x2171) [0x7b4c71]
7: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*, ThreadPool::TPHandle&, int*)+0x6cd) [0x7ccf2d]
8: (OSD::do_recovery(PG*, ThreadPool::TPHandle&)+0x1c0) [0x627370]
9: (OSD::RecoveryWQ::_process(PG*, ThreadPool::TPHandle&)+0x11) [0x671e61]
10: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x98c376]
11: (ThreadPool::WorkThread::entry()+0x10) [0x98e180]
12: (()+0x7e9a) [0x7f70ad04be9a]
13: (clone()+0x6d) [0x7f70ab60c3fd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-01-08_22:39:23-rados:thrash-wip-cache-snap-testing-basic-plana/30476
Associated revisions
osd/ReplicatedPG: update ObjectContext's object_info_t for new hit_set objects
We were fabricating an object_info_t correctly and writing it to disk, but
it was not reflected by the in-memory ObjectContext. If something came
along quickly (like backfill) and tried to use it, the info would be
invalid.
Fix this by fabricating it in the obc and copying it to the new_obs for
the update.
Fixes: #7122
Signed-off-by: Sage Weil <sage@inktank.com>
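The failure mode described in the commit message can be illustrated with a minimal sketch. This is not Ceph code: Version, ObjectInfo, ObjectContext, Store and all of their members are hypothetical, heavily simplified stand-ins for eversion_t, object_info_t, ObjectContext and the OSD's object store and context cache. The sketch only shows why a push prepared from a stale in-memory context (0'0) disagrees with the on-disk copy (818'7131), and why fabricating the info in the context first avoids that.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for eversion_t (epoch'version).
struct Version {
  uint64_t epoch;
  uint64_t version;
  bool operator==(const Version& o) const {
    return epoch == o.epoch && version == o.version;
  }
};

// Hypothetical stand-in for object_info_t; starts at 0'0.
struct ObjectInfo {
  Version version{0, 0};
};

// Hypothetical stand-in for the in-memory ObjectContext.
struct ObjectContext {
  ObjectInfo info;
};

struct Store {
  std::map<std::string, ObjectInfo> disk;                       // persisted metadata
  std::map<std::string, std::shared_ptr<ObjectContext>> cache;  // in-memory contexts

  std::shared_ptr<ObjectContext> get_context(const std::string& oid) {
    auto& obc = cache[oid];
    if (!obc) obc = std::make_shared<ObjectContext>();
    return obc;
  }

  // Buggy pattern: fabricate the info and persist it, but leave the
  // cached context untouched, so it still reports 0'0.
  void write_new_object_stale(const std::string& oid, Version v) {
    assert(v.epoch != 0);
    get_context(oid);
    ObjectInfo oi;
    oi.version = v;
    disk[oid] = oi;
  }

  // Fixed pattern: fabricate the info in the context, then persist a copy.
  void write_new_object_synced(const std::string& oid, Version v) {
    assert(v.epoch != 0);
    auto obc = get_context(oid);
    obc->info.version = v;
    disk[oid] = obc->info;
  }

  // Simplified version check: the version taken from the context must match
  // the local on-disk copy. Returns 0 on success and -1 on mismatch; the
  // code in the backtrace asserted r == 0 at this point instead.
  int prep_push(const std::string& oid) {
    Version cached = get_context(oid)->info.version;
    return cached == disk[oid].version ? 0 : -1;
  }
};
```

With the stale pattern, a backfill-style caller that runs right after the write sees version 0'0 in the context and the check fails; with the synced pattern the context and the on-disk copy agree, so the check passes.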
History
#1 Updated by Sage Weil over 9 years ago
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-01-08_22:39:23-rados:thrash-wip-cache-snap-testing-basic-plana/30490
#2 Updated by Sage Weil over 9 years ago
- Status changed from New to 7
#3 Updated by Sage Weil over 9 years ago
- Status changed from 7 to Resolved