Bug #7068
closed
os/FileStore.cc: 4035: FAILED assert(omap_attrs.size() == omap_aset.size()) (dumpling)
% Done: 100%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite: upgrade/dumpling
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2013-12-28 09:22:57.789552 7f502d93c700 -1 os/FileStore.cc: In function 'virtual int FileStore::getattrs(coll_t, const hobject_t&, std::map<std::basic_string<char>, ceph::buffer::ptr>&, bool)' thread 7f502d93c700 time 2013-12-28 09:22:57.787645
os/FileStore.cc: 4035: FAILED assert(omap_attrs.size() == omap_aset.size())
 ceph version 0.67.5 (a60ac9194718083a4b6a225fc17cad6096c69bd1)
 1: (FileStore::getattrs(coll_t, hobject_t const&, std::map<std::string, ceph::buffer::ptr, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::ptr> > >&, bool)+0xa3f) [0x7af44f]
 2: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0x18a4) [0x610644]
 3: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x6f) [0x61a36f]
 4: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x35a0) [0x622630]
 5: (PG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x619) [0x7120f9]
 6: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x330) [0x6660a0]
 7: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x4a0) [0x67c7a0]
 8: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x6b7e9c]
 9: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x8c0296]
 10: (ThreadPool::WorkThread::entry()+0x10) [0x8c20a0]
 11: (()+0x7e9a) [0x7f50417f5e9a]
 12: (clone()+0x6d) [0x7f503f9883fd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
This was an upgrade test.
ubuntu@teuthology:/a/sage-2013-12-28_09:01:26-upgrade:upgrade-parallel-next-testing-basic-plana/17328$ cat orig.config.yaml
archive_path: /var/lib/teuthworker/archive/sage-2013-12-28_09:01:26-upgrade:upgrade-parallel-next-testing-basic-plana/17328
description: upgrade/upgrade-parallel/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/snaps-few-objects.yaml 6-next-mon/monb.yaml 7-workload/rados_api_tests.yaml 8-next-mon/monc.yaml 9-workload/rados_api_tests.yaml distro/ubuntu_12.04.yaml}
email: null
job_id: '17328'
kernel:
  kdb: true
  sha1: e2a63181d78fb157eafcdc036c7b9b00e9ac1bd7
last_in_suite: false
machine_type: plana
name: sage-2013-12-28_09:01:26-upgrade:upgrade-parallel-next-testing-basic-plana
nuke-on-error: true
os_type: ubuntu
os_version: '12.04'
overrides:
  admin_socket:
    branch: next
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug ms: 1
        debug osd: 5
    log-whitelist:
    - slow request
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: 4f0784898767f40982a2aa94b35fb429d4f5965d
  ceph-deploy:
    branch:
      dev: next
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
  install:
    ceph:
      sha1: 4f0784898767f40982a2aa94b35fb429d4f5965d
  s3tests:
    branch: next
  workunit:
    sha1: 4f0784898767f40982a2aa94b35fb429d4f5965d
owner: scheduled_sage@vapre
roles:
- - mon.a
  - mon.b
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - osd.3
  - osd.4
  - osd.5
  - client.0
  - mon.c
tasks:
- chef: null
- clock.check: null
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0: null
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- rados:
    clients:
    - client.0
    objects: 50
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    branch: emperor
    clients:
      client.0:
      - rados/test.sh
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- workunit:
    branch: emperor
    clients:
      client.0:
      - rados/test.sh
teuthology_branch: next
verbose: true
Updated by Sage Weil about 10 years ago
- Status changed from New to Need More Info
Updated by Sage Weil about 10 years ago
- Severity changed from 3 - minor to 2 - major
Updated by Samuel Just about 10 years ago
- Status changed from Need More Info to Can't reproduce
Updated by Sage Weil almost 10 years ago
- Subject changed from os/FileStore.cc: 4035: FAILED assert(omap_attrs.size() == omap_aset.size()) to os/FileStore.cc: 4035: FAILED assert(omap_attrs.size() == omap_aset.size()) (dumpling)
- Status changed from Can't reproduce to 12
- Priority changed from High to Urgent
ubuntu@teuthology:/a/teuthology-2014-05-20_19:00:34-rados-dumpling-testing-basic-plana/267269
Updated by Samuel Just almost 10 years ago
- Status changed from 12 to 7
- Assignee set to Samuel Just
The bug is that the snap trimmer in dumpling does not take a lock on the object. The fix is probably to backport b87bc2311aa4da065477f402a869e2edc1558e2f; currently testing (wip-7068-dumpling).
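To illustrate why a missing object lock could trip a size assertion like this one, here is a minimal sketch. It is not Ceph code: `Object`, `list_keys`, `fetch`, `trim_unlocked`, `trim_locked` and the attribute names are all hypothetical. It assumes the reader first lists attribute names and then fetches their values in a second step, which is what the two sizes in the assertion suggest. To keep the interleaving deterministic, the "other thread" is replayed as a callback between the two steps, and the lock is modelled as a simple atomic try-lock rather than a real mutex.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical model of one object; none of these names come from Ceph.
struct Object {
  std::atomic<bool> busy{false};             // minimal per-object try-lock
  std::map<std::string, std::string> attrs;  // attribute name -> value
};

// Step 1 of a read: collect the attribute names.
std::set<std::string> list_keys(const Object& o) {
  std::set<std::string> keys;
  for (const auto& kv : o.attrs) keys.insert(kv.first);
  return keys;
}

// Step 2 of a read: fetch values for the names collected in step 1.
std::map<std::string, std::string> fetch(const Object& o,
                                         const std::set<std::string>& keys) {
  std::map<std::string, std::string> out;
  for (const auto& k : keys) {
    auto it = o.attrs.find(k);
    if (it != o.attrs.end()) out.insert(*it);
  }
  return out;
}

// Two-step read. `gap` runs between the steps and stands in for whatever
// another thread does in that window. Returns whether the sizes still match.
template <class Gap>
bool read_is_consistent(Object& o, Gap gap) {
  std::set<std::string> keys = list_keys(o);
  gap();
  return keys.size() == fetch(o, keys).size();
}

// Same read, but holding the object's lock across both steps.
template <class Gap>
bool locked_read_is_consistent(Object& o, Gap gap) {
  bool was_busy = o.busy.exchange(true);
  assert(!was_busy);  // single-threaded replay: the lock must be free here
  bool ok = read_is_consistent(o, gap);
  o.busy.store(false);
  return ok;
}

// Trimmer that ignores the lock (the suspected bug).
void trim_unlocked(Object& o, const std::string& key) { o.attrs.erase(key); }

// Trimmer that takes the lock first; returns false if it must retry later.
bool trim_locked(Object& o, const std::string& key) {
  if (o.busy.exchange(true)) return false;  // someone else holds the lock
  o.attrs.erase(key);
  o.busy.store(false);
  return true;
}
```

In this model, calling `read_is_consistent(obj, ...)` with a callback that runs `trim_unlocked` reports a mismatch, because a key listed in step 1 has disappeared by step 2. With `locked_read_is_consistent` and `trim_locked`, the trimmer is refused while the read is in progress and succeeds once the read has finished. A real fix would block or requeue rather than return `false`; the try-lock is only a simplification for the sketch.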
Updated by Yuri Weinstein about 9 years ago
- Status changed from Resolved to New
Seen again in run: http://pulpito-rdu.front.sepia.ceph.com/teuthology-2015-04-25_15:00:01-upgrade:dumpling-dumpling-distro-basic-typica/
Job: ['4529']
Logs: http://typica002.front.sepia.ceph.com/teuthology-2015-04-25_15:00:01-upgrade:dumpling-dumpling-distro-basic-typica/4529/teuthology.log
Assertion: os/FileStore.cc: 3936: FAILED assert(omap_attrs.size() == omap_aset.size())
 ceph version 0.67.1 (e23b817ad0cf1ea19c0a7b7c9999b30bed37d533)
 1: (FileStore::getattrs(coll_t, hobject_t const&, std::map<std::string, ceph::buffer::ptr, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::ptr> > >&, bool)+0xbae) [0x795d4e]
 2: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0x28db) [0x5f354b]
 3: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x23f) [0x5ff17f]
 4: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x3cd2) [0x6044e2]
 5: (PG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x334) [0x6fe444]
 6: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x343) [0x64dbf3]
 7: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x19f) [0x66274f]
 8: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x69e20c]
 9: (ThreadPool::worker(ThreadPool::WorkThread*)+0xaf1) [0x8a4c01]
 10: (ThreadPool::WorkThread::entry()+0x10) [0x8a5af0]
 11: (()+0x8182) [0x7f4df5528182]
 12: (clone()+0x6d) [0x7f4df343247d]
Updated by Yuri Weinstein about 9 years ago
- ceph-qa-suite upgrade/dumpling-x added
Updated by Yuri Weinstein about 9 years ago
- ceph-qa-suite upgrade/dumpling added
- ceph-qa-suite deleted (upgrade/dumpling-x)
Updated by Samuel Just almost 9 years ago
- Status changed from New to Can't reproduce
- Regression set to No