Bug #8367
osd crashed in upgrade:dumpling-x:stress-split-firefly---basic-plana (closed)
Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags: -
Backport: -
Regression: -
Severity: 3 - minor
Reviewed: -
Affected Versions: -
ceph-qa-suite: -
Pull request ID: -
Crash signature (v1): -
Crash signature (v2): -
Description
Exception excerpts from ubuntu@teuthology:/a/teuthology-2014-05-14_19:55:03-upgrade:dumpling-x:stress-split-firefly---basic-plana/254850/remote/ubuntu@plana44.front.sepia.ceph.com/log/ceph-osd.2.log.gz:
     0> 2014-05-14 22:54:53.372658 7fde03d1a700 -1 *** Caught signal (Aborted) **
 in thread 7fde03d1a700

 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: ceph-osd() [0x98baba]
 2: (()+0xfcb0) [0x7fde18dc9cb0]
 3: (gsignal()+0x35) [0x7fde172c4425]
 4: (abort()+0x17b) [0x7fde172c7b8b]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fde17c1769d]
 6: (()+0xb5846) [0x7fde17c15846]
 7: (()+0xb5873) [0x7fde17c15873]
 8: (()+0xb596e) [0x7fde17c1596e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1df) [0xa6d2ef]
 10: (PG::update_snap_map(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&, ObjectStore::Transaction&)+0x495) [0x75da15]
 11: (PG::append_log(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&, eversion_t, ObjectStore::Transaction&, bool)+0x4ac) [0x75dfdc]
 12: (ReplicatedBackend::sub_op_modify(std::tr1::shared_ptr<OpRequest>)+0xaa0) [0x7dd470]
 13: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x55c) [0x911d9c]
 14: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x1ee) [0x7beb6e]
 15: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x34a) [0x61ac4a]
 16: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x628) [0x6366a8]
 17: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x67cfac]
 18: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0xa5db06]
 19: (ThreadPool::WorkThread::entry()+0x10) [0xa5f910]
 20: (()+0x7e9a) [0x7fde18dc1e9a]
 21: (clone()+0x6d) [0x7fde173823fd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
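Each frame in the backtrace above follows the pattern `N: (symbol+offset) [address]` (frames 1 and 20 have no demangled symbol and appear as `binary() [address]`). A minimal sketch, not part of the original report, for parsing frames of this shape so the addresses can be cross-referenced against `objdump -rdS` output:

```python
import re

# One frame per line, e.g.:
#   " 10: (PG::update_snap_map(...)+0x495) [0x75da15]"
#   " 1: ceph-osd() [0x98baba]"          (no parenthesized symbol)
FRAME_RE = re.compile(
    r'^\s*(\d+):\s+'                              # frame number
    r'(?:\((.*?)(?:\+(0x[0-9a-f]+))?\)|\S+\(\))'  # symbol+offset, or bare binary()
    r'\s+\[(0x[0-9a-f]+)\]$'                      # return address
)

def parse_frame(line):
    """Return (frame_no, symbol, offset, address), or None if not a frame line."""
    m = FRAME_RE.match(line)
    if not m:
        return None
    return int(m.group(1)), m.group(2), m.group(3), m.group(4)
```

For example, frame 10 parses to the symbol `PG::update_snap_map(...)`, offset `0x495`, and address `0x75da15`, which is the assertion site implicated in this crash.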
2014-05-14T23:12:41.108 INFO:teuthology.task.thrashosds:joining thrashosds
2014-05-14T23:12:41.108 ERROR:teuthology.run_tasks:Manager failed: thrashosds
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-firefly/teuthology/run_tasks.py", line 92, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/teuthology-firefly/teuthology/task/thrashosds.py", line 178, in task
    thrash_proc.do_join()
  File "/home/teuthworker/teuthology-firefly/teuthology/task/ceph_manager.py", line 165, in do_join
    self.thread.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
Exception: timed out waiting for admin_socket to appear after osd.2 restart
2014-05-14T23:12:41.158 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2014-05-14T23:12:41.159 DEBUG:teuthology.run_tasks:Unwinding manager install.upgrade
2014-05-14T23:12:41.159 DEBUG:teuthology.run_tasks:Unwinding manager ceph
2014-05-14T23:12:41.159 DEBUG:teuthology.orchestra.run:Running [10.214.132.34]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph pg dump --format json'
2014-05-14T23:12:41.593 INFO:teuthology.orchestra.run.err:[10.214.132.34]: dumped all in format json
2014-05-14T23:12:42.638 INFO:teuthology.task.ceph:Scrubbing osd osd.0
2014-05-14T23:12:42.638 DEBUG:teuthology.orchestra.run:Running [10.214.132.34]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd scrub osd.0'
2014-05-14T23:12:43.038 INFO:teuthology.orchestra.run.err:[10.214.132.34]: osd.0 instructed to scrub
2014-05-14T23:12:43.050 INFO:teuthology.task.ceph:Scrubbing osd osd.1
2014-05-14T23:12:43.050 DEBUG:teuthology.orchestra.run:Running [10.214.132.34]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd scrub osd.1'
2014-05-14T23:12:43.281 INFO:teuthology.orchestra.run.err:[10.214.132.34]: osd.1 instructed to scrub
2014-05-14T23:12:43.292 INFO:teuthology.task.ceph:Scrubbing osd osd.2
2014-05-14T23:12:43.292 DEBUG:teuthology.orchestra.run:Running [10.214.132.34]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd scrub osd.2'
2014-05-14T23:12:43.523 INFO:teuthology.orchestra.run.err:[10.214.132.34]: Error EAGAIN: osd.2 is not up
2014-05-14T23:12:43.535 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-firefly/teuthology/contextutil.py", line 29, in nested
    yield vars
  File "/home/teuthworker/teuthology-firefly/teuthology/task/ceph.py", line 1458, in task
    osd_scrub_pgs(ctx, config)
  File "/home/teuthworker/teuthology-firefly/teuthology/task/ceph.py", line 1090, in osd_scrub_pgs
    'ceph', 'osd', 'scrub', role])
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/remote.py", line 106, in run
    r = self._runner(client=self.ssh, **kwargs)
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/run.py", line 330, in run
    r.exitstatus = _check_status(r.exitstatus)
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/run.py", line 326, in _check_status
    raise CommandFailedError(command=r.command, exitstatus=status, node=host)
CommandFailedError: Command failed on 10.214.132.34 with status 11: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd scrub osd.2'
2014-05-14T23:12:43.562 INFO:teuthology.misc:Shutting down mds daemons...
2014-05-14T23:12:43.563 DEBUG:teuthology.task.ceph.mds.a:waiting for process to exit
2014-05-14T23:12:44.959 INFO:teuthology.task.ceph.mds.a:Stopped
2014-05-14T23:12:44.959 INFO:teuthology.misc:Shutting down osd daemons...
2014-05-14T23:12:44.960 DEBUG:teuthology.task.ceph.osd.1:waiting for process to exit
2014-05-14T23:12:44.979 INFO:teuthology.task.ceph.osd.1:Stopped
2014-05-14T23:12:44.980 DEBUG:teuthology.task.ceph.osd.0:waiting for process to exit
2014-05-14T23:12:45.039 INFO:teuthology.task.ceph.osd.0:Stopped
2014-05-14T23:12:45.039 DEBUG:teuthology.task.ceph.osd.3:waiting for process to exit
2014-05-14T23:12:45.059 INFO:teuthology.task.ceph.osd.3:Stopped
2014-05-14T23:12:45.059 DEBUG:teuthology.task.ceph.osd.2:waiting for process to exit
2014-05-14T23:12:45.059 ERROR:teuthology.misc:Saw exception from osd.2
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-firefly/teuthology/misc.py", line 1128, in stop_daemons_of_type
    daemon.stop()
  File "/home/teuthworker/teuthology-firefly/teuthology/task/ceph.py", line 57, in stop
    run.wait([self.proc])
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/run.py", line 356, in wait
    proc.exitstatus.get()
  File "/usr/lib/python2.7/dist-packages/gevent/event.py", line 207, in get
    raise self._exception
CommandFailedError: Command failed on 10.214.132.34 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f -i 2'
2014-05-14T23:12:45.071 DEBUG:teuthology.task.ceph.osd.5:waiting for process to exit
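The fatal teuthology error is `timed out waiting for admin_socket to appear after osd.2 restart`: after restarting a daemon, the harness polls for the daemon's admin socket file and gives up when the timeout expires. A minimal sketch of that kind of wait loop (a hypothetical helper, not teuthology's actual code; the `.asok` path in the comment follows Ceph's default naming):

```python
import os
import time

def wait_for_admin_socket(path, timeout=60.0, interval=0.5):
    """Poll until a file appears at `path`; return True, or False on timeout.

    Mirrors the harness behaviour that produced the error above: if the
    restarted daemon never comes back up, the wait exhausts the timeout.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if os.path.exists(path):  # e.g. /var/run/ceph/ceph-osd.2.asok
            return True
        time.sleep(interval)
    return False
```

In this run the restarted `ceph-osd` aborted (the signal-handler backtrace above), so its admin socket never reappeared, the poll timed out, and thrashosds raised the exception seen in the log.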
archive_path: /var/lib/teuthworker/archive/teuthology-2014-05-14_19:55:03-upgrade:dumpling-x:stress-split-firefly---basic-plana/254850
branch: firefly
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/readwrite.yaml 6-next-mon/monb.yaml 7-workload/rbd_api.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/ubuntu_14.04.yaml}
email: null
job_id: '254850'
last_in_suite: false
machine_type: plana
name: teuthology-2014-05-14_19:55:03-upgrade:dumpling-x:stress-split-firefly---basic-plana
nuke-on-error: true
os_type: ubuntu
os_version: '14.04'
overrides:
  admin_socket:
    branch: firefly
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: a38fe1169b6d2ac98b427334c12d7cf81f809b74
  ceph-deploy:
    branch:
      dev: firefly
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: a38fe1169b6d2ac98b427334c12d7cf81f809b74
  s3tests:
    branch: master
  workunit:
    sha1: a38fe1169b6d2ac98b427334c12d7cf81f809b74
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mon.b
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - osd.3
  - osd.4
  - osd.5
  - mon.c
- - client.0
suite: upgrade:dumpling-x:stress-split
targets:
  ubuntu@plana44.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCuAuZz+2oyq/3xquKfwZzdK3TBJGelKO4bQ9KIbqRmy2GCqT80FVXC59ynpd7buoY8sqDdEvjF6+E/OowXVg1kIN3uNGntXVMZQc1b89O7i4LkaUVwS4QBT/m5h49nAjxem4Jyq11iNOM06G4NFHZeRHuHZupfz0sj3W0qIB/fBOT7MX48Iwpak7gbvWn1gTzAP42vweSp/9cAamb6IWPiKUqTMoDmFiYCQlKkfhovIjDBgeKsh/9umSi1qYLGrCOpq9ZSgV7OVga/H27odrFIGc4IAXY7t4kLobixODboLSbhQIVkLxW6FQxyWN4MBnJHSeVz+UO8RzigqLUgVFEJ
  ubuntu@plana90.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQD4N7cFcxX6jr5S1YNRUVfhy/Zm8RJMn7qF6T1hvbrHokB9LFktHd1AGxBLVVJeedpcWcmx0tiudcoILCW6LCTOv1r6Ne5bOoCeLdJI/DBvezHfzj7e740zBir5IxGuMwV5c9vLn2JPTGQtlLLDjbaqT/7ghFmEuPQeqRDu2BIB2K/46+XFXmiryVss3+cHWW5ApS2p4za7MpKKVPsq5HPvFnNJcAh50AotOJHaTx7BNt7RSMaB4tPt6nOr0mddjG/pA0bDk+r6Xtrrq/zHx4ArcvsGu2wzzHMLmmNVhq6vG88iUPItqdiE588O1CjietucjF6cGbm2QNW/J3tcUB3R
  ubuntu@plana92.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9RMgEoaLLIr68XvX+KvTsR8kcn+XvJEt8tjNTMK8T6TjE3LzajFS8jQTne260jPurEjRRnsICio9Sb6odFuZiiErQB7/p3fa/Rgs1PpJpmQ6HluQrfmiq4OY9t8u/OcvkJiVW9CXLfzKQhZDfV1ifrMz73FI1wD11oC77qMOVFF1jCH5XZ9vHoWd+xOwBRRglAKWeJQJNJcSk+bJFs/i65BuLEDMis8b7FeLPrM+qXCasCh+Aimb0Ny6/+izsrxjsIJ/VGFT24USakTJZY/MAYPHdHhB6cG2xXAJBp1P3npaZZHje+2rk2Co02lmpZjRX+p3btuYNoffultMtX2Xl
tasks:
- internal.lock_machines:
  - 3
  - plana
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.serialize_remote_roles: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0: null
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    thrash_primary_affinity: false
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 10
      read: 45
      write: 45
    ops: 4000
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rbd/test_librbd.sh
- install.upgrade:
    mon.c: null
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rbd/test_librbd_python.sh
- rgw:
    client.0:
      idle_timeout: 300
- swift:
    client.0:
      rgw_server: client.0
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
teuthology_branch: firefly
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.plana.19245
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/readwrite.yaml 6-next-mon/monb.yaml 7-workload/rbd_api.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/ubuntu_14.04.yaml}
duration: 4766.220067024231
failure_reason: timed out waiting for admin_socket to appear after osd.2 restart
flavor: basic
owner: scheduled_teuthology@teuthology
success: false