Bug #8425
Ceph - Bug #7068: os/FileStore.cc: 4035: FAILED assert(omap_attrs.size() == omap_aset.size()) (dumpling)
osd crashed in rados-dumpling-testing-basic-plana suite
Status: Duplicate
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Coredump info at ubuntu@teuthology:/a/teuthology-2014-05-20_19:00:34-rados-dumpling-testing-basic-plana/267269/remote/ubuntu@plana14.front.sepia.ceph.com/log/ceph-osd.3.log.gz :
ceph-osd.3.log.gz:835407114- -6> 2014-05-21 19:36:27.016800 7fb451ee7700 20 journal write_thread_entry going to sleep
ceph-osd.3.log.gz:835407207- -5> 2014-05-21 19:36:27.016801 7fb44f6e2700 5 filestore(/var/lib/ceph/osd/ceph-3) queue_op 0x2d6aa50 seq 58784 osr(3.14 0x25c2a70) 1196 bytes (queue has 50 ops and 8215300 bytes)
ceph-osd.3.log.gz:835407393- -4> 2014-05-21 19:36:27.016804 7fb44f6e2700 5 filestore(/var/lib/ceph/osd/ceph-3) _journaled_ahead 0x2d6acd0 seq 58785 osr(3.15 0x32154d0) 0x37ba900
ceph-osd.3.log.gz:835407547- -3> 2014-05-21 19:36:27.016807 7fb44f6e2700 5 filestore(/var/lib/ceph/osd/ceph-3) queue_op 0x2d6acd0 seq 58785 osr(3.15 0x32154d0) 1196 bytes (queue has 50 ops and 8215300 bytes)
ceph-osd.3.log.gz:835407733- -2> 2014-05-21 19:36:27.016810 7fb44f6e2700 5 filestore(/var/lib/ceph/osd/ceph-3) _journaled_ahead 0x2d6abe0 seq 58786 osr(3.15 0x32154d0) 0x37ba300
ceph-osd.3.log.gz:835407887- -1> 2014-05-21 19:36:27.016812 7fb44f6e2700 5 filestore(/var/lib/ceph/osd/ceph-3) queue_op 0x2d6abe0 seq 58786 osr(3.15 0x32154d0) 1196 bytes (queue has 50 ops and 8215300 bytes)
ceph-osd.3.log.gz:835408073: 0> 2014-05-21 19:36:27.051066 7fb4459b4700 -1 *** Caught signal (Aborted) **
ceph-osd.3.log.gz:835408155- in thread 7fb4459b4700
ceph-osd.3.log.gz:835408179-
ceph-osd.3.log.gz:835408180- ceph version 0.67.8-15-gb638d19 (b638d19d126646d2a8f6da11067c5f392a62525e)
ceph-osd.3.log.gz:835408256- 1: ceph-osd() [0x7fe46a]
ceph-osd.3.log.gz:835408282- 2: (()+0xfcb0) [0x7fb458ac0cb0]
ceph-osd.3.log.gz:835408315- 3: (gsignal()+0x35) [0x7fb456b46425]
ceph-osd.3.log.gz:835408353- 4: (abort()+0x17b) [0x7fb456b49b8b]
ceph-osd.3.log.gz:835408390- 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fb45749969d]
ceph-osd.3.log.gz:835408460- 6: (()+0xb5846) [0x7fb457497846]
ceph-osd.3.log.gz:835408494- 7: (()+0xb5873) [0x7fb457497873]
ceph-osd.3.log.gz:835408528- 8: (()+0xb596e) [0x7fb45749796e]
ceph-osd.3.log.gz:835408562- 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1df) [0x8c841f]
ceph-osd.3.log.gz:835408654- 10: (FileStore::getattrs(coll_t, hobject_t const&, std::map<std::string, ceph::buffer::ptr, std::less<std::string>, std::allocator<std::pair<std::string const, ceph::buffer::ptr> > >&, bool)+0xbde) [0x7a735e]
ceph-osd.3.log.gz:835408864- 11: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0x18a4) [0x606ac4]
ceph-osd.3.log.gz:835408985- 12: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x6f) [0x6107ff]
ceph-osd.3.log.gz:835409068- 13: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x35a0) [0x618ac0]
ceph-osd.3.log.gz:835409146- 14: (PG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x619) [0x7098a9]
ceph-osd.3.log.gz:835409241- 15: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x330) [0x65cd90]
ceph-osd.3.log.gz:835409363- 16: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x478) [0x673828]
ceph-osd.3.log.gz:835409456- 17: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x6aee1c]
ceph-osd.3.log.gz:835409647- 18: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x8b8c56]
ceph-osd.3.log.gz:835409715- 19: (ThreadPool::WorkThread::entry()+0x10) [0x8baa60]
ceph-osd.3.log.gz:835409770- 20: (()+0x7e9a) [0x7fb458ab8e9a]
ceph-osd.3.log.gz:835409804- 21: (clone()+0x6d) [0x7fb456c043fd]
ceph-osd.3.log.gz:835409841- NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
ceph-osd.3.log.gz:835409934-
ceph-osd.3.log.gz:835409935---- logging levels ---
ceph-osd.3.log.gz:835409958- 0/ 5 none
ceph-osd.3.log.gz:835409971- 0/ 1 lockdep
ceph-osd.3.log.gz:835409987- 0/ 1 context
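Frame 10 is the assert quoted in the subject (os/FileStore.cc: 4035: FAILED assert(omap_attrs.size() == omap_aset.size())). For anyone triaging without the source handy, below is a minimal, self-contained C++ model of that invariant, assuming the dumpling-era getattrs flow: the store first lists the xattr names that were spilled into the omap, then fetches a value for each listed name, and asserts nothing went missing in between. This is not the actual Ceph code; the type and helper names here (AttrStore, get_all_xattrs, get_xattrs) are simplified stand-ins for the ObjectMap calls.

// Hypothetical model of the invariant that fires in the backtrace above.
#include <cassert>
#include <map>
#include <set>
#include <string>

// Stand-in for the omap-backed attribute store (name is hypothetical).
using AttrStore = std::map<std::string, std::string>;

// Models ObjectMap::get_all_xattrs(): list the stored attribute names.
std::set<std::string> get_all_xattrs(const AttrStore& store) {
  std::set<std::string> names;
  for (const auto& kv : store)
    names.insert(kv.first);
  return names;
}

// Models ObjectMap::get_xattrs(): fetch values for the requested names;
// names without a value are silently skipped by the lookup.
AttrStore get_xattrs(const AttrStore& store,
                     const std::set<std::string>& names) {
  AttrStore out;
  for (const auto& name : names) {
    auto it = store.find(name);
    if (it != store.end())
      out.insert(*it);
  }
  return out;
}

int main() {
  AttrStore omap = {{"user.a", "1"}, {"user.b", "2"}};

  // Pass 1: enumerate the xattr names held in the omap.
  std::set<std::string> omap_attrs = get_all_xattrs(omap);

  // Pass 2: fetch a value for each enumerated name. If a key vanished
  // between the two reads, this returns fewer entries than pass 1...
  AttrStore omap_aset = get_xattrs(omap, omap_attrs);

  // ...and this is the invariant from frame 10: every listed name must
  // have produced a value.
  assert(omap_attrs.size() == omap_aset.size());
  return 0;
}

In other words, the crash means the two omap reads disagreed about which xattr keys exist for the object, which matches the symptom tracked in parent bug #7068.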
2014-05-21T19:37:51.050 ERROR:teuthology.run_tasks:Manager failed: <contextlib.GeneratorContextManager object at 0x2b67f90>
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-dumpling/teuthology/run_tasks.py", line 45, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/teuthology-dumpling/teuthology/task/thrashosds.py", line 167, in task
    thrash_proc.do_join()
  File "/home/teuthworker/teuthology-dumpling/teuthology/task/ceph_manager.py", line 106, in do_join
    self.thread.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 331, in get
    raise self._exception
Exception: timed out waiting for admin_socket to appear after osd.3 restart
2014-05-21T19:37:51.186 DEBUG:teuthology.run_tasks:Unwinding manager <contextlib.GeneratorContextManager object at 0x2aaa610>
2014-05-21T19:37:51.186 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-dumpling/teuthology/contextutil.py", line 27, in nested
    yield vars
  File "/home/teuthworker/teuthology-dumpling/teuthology/task/ceph.py", line 1168, in task
    yield
  File "/home/teuthworker/teuthology-dumpling/teuthology/run_tasks.py", line 45, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/teuthology-dumpling/teuthology/task/thrashosds.py", line 167, in task
    thrash_proc.do_join()
  File "/home/teuthworker/teuthology-dumpling/teuthology/task/ceph_manager.py", line 106, in do_join
    self.thread.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 331, in get
    raise self._exception
Exception: timed out waiting for admin_socket to appear after osd.3 restart
2014-05-21T19:37:51.280 INFO:teuthology.misc:Shutting down mds daemons...
2014-05-21T19:37:51.281 DEBUG:teuthology.task.ceph.mds.a:waiting for process to exit
2014-05-21T19:37:51.288 INFO:teuthology.task.ceph.mds.a:Stopped
2014-05-21T19:37:51.288 INFO:teuthology.misc:Shutting down osd daemons...
2014-05-21T19:37:51.288 DEBUG:teuthology.task.ceph.osd.1:waiting for process to exit
2014-05-21T19:37:51.307 INFO:teuthology.task.ceph.osd.1:Stopped
2014-05-21T19:37:51.307 DEBUG:teuthology.task.ceph.osd.0:waiting for process to exit
2014-05-21T19:37:51.359 INFO:teuthology.task.ceph.osd.0:Stopped
2014-05-21T19:37:51.359 DEBUG:teuthology.task.ceph.osd.3:waiting for process to exit
2014-05-21T19:37:51.359 ERROR:teuthology.misc:Saw exception from osd.3
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-dumpling/teuthology/misc.py", line 831, in stop_daemons_of_type
    daemon.stop()
  File "/home/teuthworker/teuthology-dumpling/teuthology/task/ceph.py", line 36, in stop
    run.wait([self.proc])
  File "/home/teuthworker/teuthology-dumpling/teuthology/orchestra/run.py", line 282, in wait
    proc.exitstatus.get()
  File "/usr/lib/python2.7/dist-packages/gevent/event.py", line 207, in get
    raise self._exception
CommandFailedError: Command failed on 10.214.131.26 with status 1: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage sudo /home/ubuntu/cephtest/daemon-helper kill ceph-osd -f -i 3'
2014-05-21T19:37:51.385 DEBUG:teuthology.task.ceph.osd.2:waiting for process to exit
2014-05-21T19:37:51.412 INFO:teuthology.task.ceph.osd.2:Stopped
2014-05-21T19:37:51.412 ERROR:teuthology.task.ceph.osd.5:tried to stop a non-running daemon
2014-05-21T19:37:51.412 ERROR:teuthology.task.ceph.osd.4:tried to stop a non-running daemon
archive_path: /var/lib/teuthworker/archive/teuthology-2014-05-20_19:00:34-rados-dumpling-testing-basic-plana/267269
branch: dumpling
description: rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/few.yaml
  thrashers/default.yaml workloads/snaps-few-objects.yaml}
email: null
job_id: '267269'
kernel: &id001
  kdb: true
  sha1: 335cb91ce950ce0e12294af671c64a468d89194c
last_in_suite: false
machine_type: plana
name: teuthology-2014-05-20_19:00:34-rados-dumpling-testing-basic-plana
nuke-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: dumpling
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    fs: xfs
    log-whitelist:
    - slow request
    sha1: b638d19d126646d2a8f6da11067c5f392a62525e
  ceph-deploy:
    branch:
      dev: dumpling
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: b638d19d126646d2a8f6da11067c5f392a62525e
  s3tests:
    branch: dumpling
  workunit:
    sha1: b638d19d126646d2a8f6da11067c5f392a62525e
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.0
suite: rados
targets:
  ubuntu@plana14.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCkQBfSVIjF3fRKmdHm4NeHMWyt0vTcQFKvQISTEmBNe/DVwHRjLnsV0Cx5LsWjUNlVCXkHEYidQbhvQQ6KuWRs1EbRQWYjYOdv/EZh+uxqik8zBPV3L7hC3O34GiUng+EEaX22ta804jrsIZyoeBFa6r9jnKcc4Uk5eCafw3RVUEKILnYrAFOGWrTpXzoZGljuG6GujV7kKrnOvCyAfY4PdSRdQY1j4QOyvkr1zix9iBctAVaJG+xa/Ff48Oa7mqxioQPtUcn03ZOxqkQ70U24URjqPMZl3JD7aHG/Riz/DSvbeESQftHWVmqp03WIm9qTIgzClEcPbhKbLUNVewh7
  ubuntu@plana35.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDN2DvVrroCqjz4NeYHoqEhj35Ea5bEEMc2YHJY0rfXO0J9cRSnQBfzOQtSC6LrMYXJrQ4Wr/UQmmovs6a42F/fVmYygUeCdAjy6RT5bQ7eHdd+bJB3zKSZ+oY72g8UHlHRekSFBDI8ivFLC7PWscUx4v1o5vgQFgUDXkBjZpje7VpB3Sp8P/Dpqf5zcEo/9DxYxntaxn3dg2LrGZ4jKG5A49ivCo6NOxFRyth3dlcTMlhPjVs6dKWzohppx9h6bW5k0ijF3FLfPfc72Nkdjiw8W3njDbXl8JjKTvMrfKc1wxDEeq7602oIuT7CFC4hrBZtxLPinrtQhguXB0j35git
tasks:
- internal.lock_machines:
  - 2
  - plana
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- rados:
    clients:
    - client.0
    objects: 50
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
teuthology_branch: dumpling
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.plana.17189
description: rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/few.yaml
  thrashers/default.yaml workloads/snaps-few-objects.yaml}
duration: 1520.7286908626556
failure_reason: timed out waiting for admin_socket to appear after osd.3 restart
flavor: basic
mon.a-kernel-sha1: 335cb91ce950ce0e12294af671c64a468d89194c
mon.b-kernel-sha1: 335cb91ce950ce0e12294af671c64a468d89194c
owner: scheduled_teuthology@teuthology
success: false
History
#1 Updated by Tamilarasi muthamizhan almost 10 years ago
- Status changed from New to Duplicate
- Parent task set to #7068
#2 Updated by Yuri Weinstein almost 10 years ago
Same problem seen in http://qa-proxy.ceph.com/teuthology/teuthology-2014-05-21_19:55:29-upgrade:dumpling-x:stress-split-firefly---basic-plana/269966/
ceph version 0.67.9 (ba340a97c3dafc9155023da8d515eecc675c619a)