Bug #5677
closed
osd/OSD.cc: 5517: FAILED assert(_get_map_bl(epoch, bl))
% Done:
0%
Source:
Q/A
Backport:
cuttlefish
Severity:
3 - minor
Description
    -2> 2013-07-19 03:36:48.647677 7f9a4e55c700  1 -- 10.214.133.28:6804/3750 <== mon.0 10.214.132.36:6789/0 11 ==== osd_map(2090..2090 src has 2090..2839) v3 ==== 5285+0+0 (1231217973 0 0) 0x230a240 con 0x1c3b2c0
    -1> 2013-07-19 03:36:48.647713 7f9a4e55c700  3 osd.0 1807 handle_osd_map epochs [2090,2090], i have 1807, src has [2090,2839]
     0> 2013-07-19 03:36:48.650527 7f9a4e55c700 -1 osd/OSD.cc: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f9a4e55c700 time 2013-07-19 03:36:48.648065
osd/OSD.cc: 5517: FAILED assert(_get_map_bl(epoch, bl))

 ceph version 0.66-712-gc9ba933 (c9ba933b0b2fdb012ccf8a8535d09381c943144d)
 1: (OSDService::get_map(unsigned int)+0x428) [0x694978]
 2: (OSDService::init_splits_between(pg_t, std::tr1::shared_ptr<OSDMap const>, std::tr1::shared_ptr<OSDMap const>)+0x1c9) [0x69c319]
 3: (OSD::consume_map()+0x5ec) [0x69cc0c]
 4: (OSD::handle_osd_map(MOSDMap*)+0x101f) [0x6a8aaf]
 5: (OSD::_dispatch(Message*)+0x2fb) [0x6ab48b]
 6: (OSD::ms_dispatch(Message*)+0x1d6) [0x6abb96]
 7: (DispatchQueue::entry()+0x549) [0x9898f9]
 8: (DispatchQueue::DispatchThread::entry()+0xd) [0x8adedd]
 9: (()+0x7e9a) [0x7f9a5aeebe9a]
 10: (clone()+0x6d) [0x7f9a5907eccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
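For context on what the assert means: the OSD is asked for the map of an epoch (here the OSD is at 1807 while the mon's oldest retained map is 2090), and `_get_map_bl` fails because that epoch's encoded map is no longer available, which trips the assert. A minimal stand-in sketch of the failing pattern (simplified names, not the actual Ceph code; `MapStore` and its members are hypothetical):

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the OSD's osdmap store: trimmed epochs are
// simply absent, so looking them up fails.
struct MapStore {
    std::map<unsigned, std::string> maps;  // epoch -> encoded OSDMap

    // Analogue of _get_map_bl: returns false if the epoch is gone.
    bool get_map_bl(unsigned epoch, std::string *bl) {
        auto it = maps.find(epoch);
        if (it == maps.end())
            return false;  // epoch already trimmed or never stored
        *bl = it->second;
        return true;
    }

    // Analogue of the failing call site: assumes the map must still exist.
    std::string get_map(unsigned epoch) {
        std::string bl;
        bool ok = get_map_bl(epoch, &bl);
        assert(ok && "FAILED assert(_get_map_bl(epoch, bl))");
        return bl;
    }
};
```

The crash, in this model, is `get_map` being reached for an epoch the caller never actually stored, i.e. the OSD is far enough behind that the range it needs has fallen outside what it has on disk.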
The job config was:
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-07-19_01:00:14-rados-next-testing-basic/72838$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: 77c8bf2f972a9d6ff446c49a41678bf931bbee44
machine_type: plana
nuke-on-error: true
overrides:
  admin_socket:
    branch: next
  ceph:
    conf:
      global:
        ms inject delay max: 1
        ms inject delay probability: 0.005
        ms inject delay type: osd
        ms inject internal delays: 0.002
        ms inject socket failures: 2500
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
    fs: xfs
    log-whitelist:
    - slow request
    sha1: c9ba933b0b2fdb012ccf8a8535d09381c943144d
  ceph-deploy:
    conf:
      client:
        debug monc: 20
        debug ms: 1
        debug objecter: 20
        debug rados: 20
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
  install:
    ceph:
      sha1: c9ba933b0b2fdb012ccf8a8535d09381c943144d
  s3tests:
    branch: next
  workunit:
    sha1: c9ba933b0b2fdb012ccf8a8535d09381c943144d
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.0
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- rados:
    clients:
    - client.0
    objects: 50
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
teuthology_branch: next
Updated by Samuel Just almost 11 years ago
Fix merged, 6951d2345a5d837c3b14103bd4d8f5ee4407c937, still working on getting the test to be reliable so I can add it to the suite.
Updated by Samuel Just almost 11 years ago
- Status changed from In Progress to Fix Under Review
Added wip-5677 to ceph-qa-suite and teuthology gits.
Updated by Sage Weil almost 11 years ago
Samuel Just wrote:
Added wip-5677 to ceph-qa-suite and teuthology gits.
For the teuthology.git change, let's have a non-zero probability of running this, or else we'll forget to run it when it is important to do so. Also, let's set mon min osdmap epochs = 25 (or something similarly small) in ceph.conf.template?
With that, there probably isn't a need to make any qa suite changes?
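Sage's suggestion above would amount to something like the following fragment in teuthology's ceph.conf.template (the value 25 is his example; placing the option under a `[mon]` section is an assumption about where it would go):

```
[mon]
    ; keep only a small window of old osdmaps so that map trimming,
    ; and the resulting map-gap handling on lagging OSDs, is
    ; exercised regularly in QA runs
    mon min osdmap epochs = 25
```

A small retention window makes it much more likely that a test OSD falls behind the mon's oldest retained map, which is the situation that triggered this bug.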
Updated by Samuel Just almost 11 years ago
- Status changed from Fix Under Review to Resolved
Updated by Ian Colle over 10 years ago
- Backport changed from cuttlefish, dumpling to cuttlefish