Bug #5631
osd/ReplicatedPG.cc: 3036: FAILED assert(iter)
Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
OSD
Target version:
-
% Done:
0%
Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
0> 2013-07-15 02:19:36.389077 7f20138b8700 -1 osd/ReplicatedPG.cc: In function 'int ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp>&)' thread 7f20138b8700 time 2013-07-15 02:19:36.387113
osd/ReplicatedPG.cc: 3036: FAILED assert(iter)

 ceph version 0.66-588-g9baa668 (9baa66801ab02854c344eb2fd1a8da8c5806125b)
 1: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0x92ab) [0x61550b]
 2: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x6f) [0x61781f]
 3: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x3590) [0x61fb40]
 4: (PG::do_request(std::tr1::shared_ptr<OpRequest>)+0x619) [0x70b189]
 5: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>)+0x323) [0x662373]
 6: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>)+0x49b) [0x67855b]
 7: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x31) [0x6b2ec1]
 8: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x6b322c]
 9: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x8b8576]
 10: (ThreadPool::WorkThread::entry()+0x10) [0x8ba3a0]
 11: (()+0x7e9a) [0x7f2026253e9a]
 12: (clone()+0x6d) [0x7f20243e6ccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
job was
ubuntu@teuthology:/a/teuthology-2013-07-15_01:00:16-rados-next-testing-basic/67493$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: 365b57b1317524bb0cdd15859a224ba1ab58d1d7
machine_type: plana
nuke-on-error: true
overrides:
  admin_socket:
    branch: next
  ceph:
    conf:
      global:
        ms inject delay max: 1
        ms inject delay probability: 0.005
        ms inject delay type: osd
        ms inject socket failures: 2500
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
    fs: xfs
    log-whitelist:
    - slow request
    sha1: 9baa66801ab02854c344eb2fd1a8da8c5806125b
  install:
    ceph:
      sha1: 9baa66801ab02854c344eb2fd1a8da8c5806125b
  s3tests:
    branch: next
  workunit:
    sha1: 9baa66801ab02854c344eb2fd1a8da8c5806125b
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.0
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- rados:
    clients:
    - client.0
    objects: 50
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
Updated by Samuel Just almost 11 years ago
- Status changed from New to In Progress
- Assignee set to Samuel Just
Updated by Samuel Just almost 11 years ago
- Status changed from In Progress to 7
get_omap_iterator relies on lfn_find, while getattr relies on lfn_open. If bug #5269 occurred, the latter might return attrs from an hobject_t residing in a parent collection, while get_omap_iterator fails to find the object at the correct path, so the returned iterator is null and assert(iter) fires. This one is probably also due to bug #5269.
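To illustrate the mismatch described in the comment above, here is a toy model of two lookup paths that can disagree about whether an object exists. It is only a sketch: ToyStore, find_exact and find_with_parent_fallback are hypothetical names and do not correspond to Ceph's actual lfn_find / lfn_open code. It also assumes that the lenient path can resolve an object left behind in a parent collection, as the comment suggests.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model only: none of these types or functions are Ceph's real API.
struct ToyStore {
    // collection name -> names of the objects stored in it
    std::map<std::string, std::set<std::string>> collections;
    // child collection -> parent collection
    std::map<std::string, std::string> parent;

    // Strict lookup (hypothetical stand-in for the path used by
    // get_omap_iterator): only the requested collection is checked.
    bool find_exact(const std::string& coll, const std::string& obj) const {
        auto it = collections.find(coll);
        return it != collections.end() && it->second.count(obj) > 0;
    }

    // Lenient lookup (hypothetical stand-in for the path used by getattr):
    // ASSUMPTION: it also accepts an object found in the parent collection.
    bool find_with_parent_fallback(const std::string& coll,
                                   const std::string& obj) const {
        if (find_exact(coll, obj)) {
            return true;
        }
        auto p = parent.find(coll);
        return p != parent.end() && find_exact(p->second, obj);
    }
};

// Builds a store in which an object was left behind in the parent
// collection, the kind of state the comment attributes to the earlier bug.
inline ToyStore make_stranded_store() {
    ToyStore s;
    s.collections["parent_pg"].insert("obj");
    s.collections["child_pg"];  // child collection exists but is empty
    s.parent["child_pg"] = "parent_pg";
    return s;
}
```

With a store built by make_stranded_store(), the lenient lookup on "child_pg" succeeds while the strict one fails. That is the shape of the failure above: the attr check passes, then the omap lookup comes back empty and the assert trips.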