Bug #7892

closed

osd/ReplicatedPG.cc: 7881: FAILED assert((data_included.empty() && data.length() == 0) || (!data_included.empty() && data.length() > 0))

Added by Sage Weil about 10 years ago. Updated about 10 years ago.

Status:
Duplicate
Priority:
Urgent
Assignee:
David Zafman
Category:
OSD
Target version:
-
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

     0> 2014-03-27 19:25:51.602985 7f2576f3d700 -1 osd/ReplicatedPG.cc: In function 'bool ReplicatedBackend::handle_pull_response(pg_shard_t, PushOp&, PullOp*, std::list<hobject_t>*, ObjectStore::Transaction*)' thread 7f2576f3d700 time 2014-03-27 19:25:51.589689
osd/ReplicatedPG.cc: 7881: FAILED assert((data_included.empty() && data.length() == 0) || (!data_included.empty() && data.length() > 0))

 ceph version 0.78-367-gd9a2dea (d9a2dea755a62e4f9fe0795410f37b68a15ae054)
 1: (ReplicatedBackend::handle_pull_response(pg_shard_t, PushOp&, PullOp*, std::list<hobject_t, std::allocator<hobject_t> >*, ObjectStore::Transaction*)+0x50e) [0x81b0ce]
 2: (ReplicatedBackend::_do_pull_response(std::tr1::shared_ptr<OpRequest>)+0x321) [0x81c9f1]
 3: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x321) [0x907161]
 4: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x1ee) [0x7bb84e]
 5: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x34a) [0x619b7a]
 6: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x628) [0x634bd8]
 7: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x67a20c]
 8: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0xa52376]
 9: (ThreadPool::WorkThread::entry()+0x10) [0xa54180]
 10: (()+0x7e9a) [0x7f258c878e9a]
 11: (clone()+0x6d) [0x7f258ae39ccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-03-27_02:30:16-rados-firefly-distro-basic-plana/149028
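
The failed assertion encodes a simple invariant: the set of extents carried by a push and the data payload must be either both empty or both non-empty. A minimal sketch of that check follows. The types are hypothetical stand-ins and not the Ceph ones: a vector of (offset, length) pairs plays the role of the interval set, and `std::string` plays the role of the buffer. The function names are also invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in (not a Ceph type): a list of (offset, length)
// extents in place of the interval set named data_included in the assert.
using Extents = std::vector<std::pair<std::uint64_t, std::uint64_t>>;

// Same boolean condition as the assertion in the report: extents and payload
// must be both empty or both non-empty. std::string stands in for the buffer.
bool push_payload_consistent(const Extents& data_included,
                             const std::string& data) {
    return (data_included.empty() && data.length() == 0) ||
           (!data_included.empty() && data.length() > 0);
}

// Aborts (when NDEBUG is not defined) if the invariant does not hold,
// which is the behaviour seen in the backtrace above.
void require_consistent(const Extents& data_included, const std::string& data) {
    assert(push_payload_consistent(data_included, data));
}
```

With these stand-ins, an empty extent list with an empty payload passes, as does a non-empty extent list with a non-empty payload. Either mixed combination fails the check.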

Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #8091: osd/SnapMapper.cc: 217: FAILED assert(r == -2) (Resolved, Samuel Just, 04/13/2014)

Actions #1

Updated by Sage Weil about 10 years ago

  • Status changed from New to Duplicate

Probably a duplicate of #7916.

Actions #2

Updated by Sage Weil about 10 years ago

  • Status changed from Duplicate to New
  • Assignee set to David Zafman

Actions #3

Updated by Sage Weil about 10 years ago

ubuntu@teuthology:/a/teuthology-2014-04-03_02:30:03-rados-firefly-distro-basic-plana

Actions #4

Updated by Sage Weil about 10 years ago

ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-04-13_09:43:35-rados:thrash-testing-testing-basic-plana/189080

Actions #5

Updated by David Zafman about 10 years ago

  • Status changed from New to Duplicate

There were two identical crashes. This is the trace of one of them:

object: cecc4d22/plana9117053-25/8d//3
pg: 3.2
osdmap: [3,5] active+clean to [5,0] active+recovering to [0,1] active+recovering
osd.0 decided to pull the object from osd.3 because it had not received the object data during the brief time that osd.5 was primary, but the snapset was empty:
2014-04-13 11:45:53.780739 7f8054c63700 10 osd.0 pg_epoch: 171 pg[3.2( v 169'159 lc 151'147 (0'0,169'159] local-les=171 n=7 ec=6 les/c 171/161 170/170/170) [0,1] r=0 lpr=170 pi=135-169/3 rops=4 crt=163'154 mlcod 0'0 active+recovering m=3 snaptrimq=[73~1,7a~1,84~1,8a~1,8c~1,8f~1,92~1,95~1]] snapset 0=[]:[]

In response to the pull request from osd.0, osd.3 did not read the file data at all; it sent a push with an empty copy_subset, data_included: [] and data_size: 0:
2014-04-13 11:45:53.826392 7f8bb8e78700 7 osd.3 pg_epoch: 172 pg[3.2( v 163'154 (0'0,163'154] local-les=161 n=6 ec=6 les/c 161/161 170/170/170) [0,1] r=-1 lpr=170 pi=160-169/2 crt=102'89 lcod 156'151 inactive NOTIFY] send_push_op cecc4d22/plana9117053-25/8d//3 v 163'152 size 0 recovery_info: ObjectRecoveryInfo(cecc4d22/plana9117053-25/8d//3@163'152, copy_subset: [], clone_subset: {})
2014-04-13 11:45:53.826408 7f8bb8e78700 15 filestore(/var/lib/ceph/osd/ceph-3) omap_get_header 3.2_head/cecc4d22/plana9117053-25/8d//3
2014-04-13 11:45:53.826461 7f8bb8e78700 15 filestore(/var/lib/ceph/osd/ceph-3) getattrs 3.2_head/cecc4d22/plana9117053-25/8d//3
2014-04-13 11:45:53.826495 7f8bb8e78700 20 filestore(/var/lib/ceph/osd/ceph-3) fgetattrs 92 getting '_'
2014-04-13 11:45:53.826508 7f8bb8e78700 20 filestore(/var/lib/ceph/osd/ceph-3) fgetattrs 92 getting '__header'
2014-04-13 11:45:53.826544 7f8bb8e78700 10 filestore(/var/lib/ceph/osd/ceph-3) getattrs 3.2_head/cecc4d22/plana9117053-25/8d//3 = 0
2014-04-13 11:45:53.826561 7f8bb8e78700 15 filestore(/var/lib/ceph/osd/ceph-3) get_omap_iterator 3.2_head/cecc4d22/plana9117053-25/8d//3
2014-04-13 11:45:53.826606 7f8bb8e78700 20 osd.3 pg_epoch: 172 pg[3.2( v 163'154 (0'0,163'154] local-les=161 n=6 ec=6 les/c 161/161 170/170/170) [0,1] r=-1 lpr=170 pi=160-169/2 crt=102'89 lcod 156'151 inactive NOTIFY] send_pushes: sending push PushOp(cecc4d22/plana9117053-25/8d//3, version: 163'152, data_included: [], data_size: 0, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(cecc4d22/plana9117053-25/8d//3@163'152, copy_subset: [], clone_subset: {}), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:0, data_complete:true, omap_recovered_to:, omap_complete:true), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false)) to osd.0
