Bug #8393
osd crashed in rbd-master-testing-basic-plana suite
Status:
Duplicate
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Q/A
Tags:
Backport:
Regression:
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-05-18_23:00:04-rbd-master-testing-basic-plana/261772/
2014-05-19T07:47:52.190 INFO:teuthology.orchestra.run.plana59:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd scrub osd.4'
2014-05-19T07:47:52.457 INFO:teuthology.orchestra.run.plana59.stderr:Error EAGAIN: osd.4 is not up
2014-05-19T07:47:52.468 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-master/teuthology/contextutil.py", line 29, in nested
    yield vars
  File "/home/teuthworker/teuthology-master/teuthology/task/ceph.py", line 1458, in task
    osd_scrub_pgs(ctx, config)
  File "/home/teuthworker/teuthology-master/teuthology/task/ceph.py", line 1090, in osd_scrub_pgs
    'ceph', 'osd', 'scrub', role])
  File "/home/teuthworker/teuthology-master/teuthology/orchestra/remote.py", line 114, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/teuthology-master/teuthology/orchestra/run.py", line 385, in run
    r.exitstatus = _check_status(r.exitstatus)
  File "/home/teuthworker/teuthology-master/teuthology/orchestra/run.py", line 381, in _check_status
    command=r.command, exitstatus=status, node=name)
CommandFailedError: Command failed on plana59 with status 11: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd scrub osd.4'
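The EAGAIN comes from the monitor refusing to instruct a down OSD to scrub: osd.4 had already crashed by the time the osd_scrub_pgs teardown step ran, so the scrub failure is a symptom rather than the bug itself. A minimal sketch of the kind of pre-scrub guard that would surface the down OSD directly (the wait_for_osd_up helper is hypothetical; real teuthology runs these commands on the remote host through its orchestra layer, not via subprocess):

import json
import subprocess
import time

def wait_for_osd_up(osd_id, timeout=60):
    # Hypothetical guard: poll 'ceph osd dump' until the OSD reports up.
    deadline = time.time() + timeout
    while time.time() < deadline:
        dump = json.loads(subprocess.check_output(
            ['ceph', 'osd', 'dump', '--format=json']))
        if any(o['osd'] == osd_id and o['up'] for o in dump['osds']):
            return True
        time.sleep(2)
    return False

# A check like this before 'ceph osd scrub osd.4' would fail with an
# explicit "osd.4 never came up" instead of the opaque EAGAIN above.
if not wait_for_osd_up(4):
    raise RuntimeError('osd.4 is down; scrub would return EAGAIN')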
archive_path: /var/lib/teuthworker/archive/teuthology-2014-05-18_23:00:04-rbd-master-testing-basic-plana/261772
branch: master
description: rbd/librbd/{cache/none.yaml cachepool/small.yaml clusters/fixed-3.yaml fs/btrfs.yaml msgr-failures/few.yaml workloads/qemu_xfstests.yaml}
email: null
exclude_arch: armv7l
job_id: '261772'
kernel: &id001
  kdb: true
  sha1: 335cb91ce950ce0e12294af671c64a468d89194c
last_in_suite: false
machine_type: plana
name: teuthology-2014-05-18_23:00:04-rbd-master-testing-basic-plana
nuke-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: master
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
        osd op thread timeout: 60
        osd sloppy crc: true
    fs: btrfs
    log-whitelist:
    - slow request
    - wrongly marked me down
    sha1: 991f7f15a6e107b33a24bbef1169f21eb7fcce2c
  ceph-deploy:
    branch:
      dev: master
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 991f7f15a6e107b33a24bbef1169f21eb7fcce2c
  s3tests:
    branch: master
  workunit:
    sha1: 991f7f15a6e107b33a24bbef1169f21eb7fcce2c
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
suite: rbd
targets:
  ubuntu@plana32.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDmERfjlurKX631Ys98uSfL1mMJkRRZRRV5Hhen56sub04bFDz7W9zjh3Zs9pNMfdc1dWLf8IcpbdfcbR7cmkyfxQlLl+KmCwvRED+ZCR8P5HlkMFb+HnTdvyLAbu/4pvQRxrjy1GyQdNRUpxA8WWbfHrlz8leZPz3u3+hsHaCt8W0Y8cBpqmdTUtSgaGa9JTo/GWSkavF81o5xuVD+A4TGwNwTqIbb1f/HXAytffUwKr5fHHs1+hm1aT9GzQSumDHVCf9ykbcvO2uR70JZl3lZW2pVeFwQmq0AwmD5SetofuQK4ykVweONstnPwNGBqZJ/1A8jbxcby94RhDztzTqb
  ubuntu@plana59.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDaHG83zRXo6ydv6IGWDFTf6YNjWG9M5LRbbYIpPXKOqCg9zfI/4ZjymLpznESFIACVrqe06jqD7uvsQPOlbcm3W/H44su70C21KrzMs77IpskMT7tYgCzY75uxbwg949qYIRf1SEY2RW0Bf2zldbOeKAY/TcnGIkLtc4NCIDPfCxMG0rAJJgUAwbvbKVUqLKe/jcyu3RiiAxV3TGjTAzTz+XHwT46gDXB5Fxt49Sfx+AgpILHk7DvN/HILtU3gRT9ac0D2WlQi1sJLDgjeTAZxyfpRR5iZH4tWYBFIS7C4ugHYye95zUYTc/3Jt364Jl/giUherGjE5od7p65VjxRJ
  ubuntu@plana81.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDlJFZzGyQQUsAwKLFwKdofbzaj9hARIwmyH7nJWNlnwI++pPIPaKpFfnoOZnByS9xjb7Mmm3/Kg8YVg0diIAfapqpM/wTNPSyeYsjChLZDEJdPpo/0YlBxiEiy1655kOOdnPz67mr5YnNQJw1un16SRPVaXkMX4USSwrf0wOj685m7vo2dQOUzcfrC8liIrqBd/OU1LnRPtOdieTlJTOHyxNVW8ge8Q+y///lF7+pXkaLK743CZSU4PV/5P6NsahCD584A86Ucf1++F1w2BcNOBmLRplNIvoOycnCHeLfyzYfwmU1bY6vI4Kqjg2UmJ0lcJLKUu2FH1ert8+D+ufUL
tasks:
- internal.lock_machines:
  - 3
  - plana
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.serialize_remote_roles: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install: null
- ceph:
    conf:
      client:
        rbd cache: false
- exec:
    client.0:
    - ceph osd pool create cache 4
    - ceph osd tier add rbd cache
    - ceph osd tier cache-mode cache writeback
    - ceph osd tier set-overlay rbd cache
    - ceph osd pool set cache hit_set_type bloom
    - ceph osd pool set cache hit_set_count 8
    - ceph osd pool set cache hit_set_period 60
    - ceph osd pool set cache target_max_objects 250
- qemu:
    all:
      num_rbd: 2
      test: https://ceph.com/git/?p=ceph.git;a=blob_plain;f=qa/run_xfstests_qemu.sh
      type: block
teuthology_branch: master
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.plana.19208
client.0-kernel-sha1: 335cb91ce950ce0e12294af671c64a468d89194c
description: rbd/librbd/{cache/none.yaml cachepool/small.yaml clusters/fixed-3.yaml fs/btrfs.yaml msgr-failures/few.yaml workloads/qemu_xfstests.yaml}
duration: 25796.29017996788
failure_reason: 'Command failed on plana59 with status 1: ''sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f -i 4'''
flavor: basic
mon.a-kernel-sha1: 335cb91ce950ce0e12294af671c64a468d89194c
mon.b-kernel-sha1: 335cb91ce950ce0e12294af671c64a468d89194c
owner: scheduled_teuthology@teuthology
success: false
Updated by Yuri Weinstein almost 10 years ago
- Project changed from teuthology to Ceph
- Severity changed from 3 - minor to 2 - major
Moved to ceph
Updated by Yuri Weinstein almost 10 years ago
I was able to reproduce this issue on a manual re-run.
Logs can be found on the 'yw' box (ssh from teuthology) in /home/ubuntu/logs/261772
Error from the osd.1 crash:
2014-05-19T15:11:12.965 INFO:teuthology.orchestra.run.plana70.stderr:osd.0 instructed to scrub
2014-05-19T15:11:12.976 INFO:teuthology.task.ceph:Scrubbing osd osd.1
2014-05-19T15:11:12.977 INFO:teuthology.orchestra.run.plana70:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd scrub osd.1'
2014-05-19T15:11:13.236 INFO:teuthology.orchestra.run.plana70.stderr:Error EAGAIN: osd.1 is not up
2014-05-19T15:11:13.248 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/home/ubuntu/bkup/teuthology/teuthology/contextutil.py", line 29, in nested
    yield vars
  File "/home/ubuntu/bkup/teuthology/teuthology/task/ceph.py", line 1458, in task
    osd_scrub_pgs(ctx, config)
  File "/home/ubuntu/bkup/teuthology/teuthology/task/ceph.py", line 1090, in osd_scrub_pgs
    'ceph', 'osd', 'scrub', role])
  File "/home/ubuntu/bkup/teuthology/teuthology/orchestra/remote.py", line 114, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/ubuntu/bkup/teuthology/teuthology/orchestra/run.py", line 385, in run
    r.exitstatus = _check_status(r.exitstatus)
  File "/home/ubuntu/bkup/teuthology/teuthology/orchestra/run.py", line 381, in _check_status
    command=r.command, exitstatus=status, node=name)
CommandFailedError: Command failed on plana70 with status 11: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd scrub osd.1'
2014-05-19T15:11:13.250 INFO:teuthology.misc:Shutting down mds daemons...
Updated by Yuri Weinstein almost 10 years ago
Coredump backtrace from ceph-osd.4.log at ubuntu@teuthology:/a/teuthology-2014-05-18_23:00:04-rbd-master-testing-basic-plana/261772/remote/plana59/log:
0> 2014-05-19 07:31:11.905597 7fe3c0d43700 -1 *** Caught signal (Aborted) ** in thread 7fe3c0d43700
ceph version 0.80-469-g991f7f1 (991f7f15a6e107b33a24bbef1169f21eb7fcce2c)
 1: ceph-osd() [0x99479a]
 2: (()+0xfcb0) [0x7fe3d8df8cb0]
 3: (gsignal()+0x35) [0x7fe3d72f3425]
 4: (abort()+0x17b) [0x7fe3d72f6b8b]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fe3d7c4669d]
 6: (()+0xb5846) [0x7fe3d7c44846]
 7: (()+0xb5873) [0x7fe3d7c44873]
 8: (()+0xb596e) [0x7fe3d7c4496e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1df) [0xa75c3f]
 10: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x1dbf) [0x820fbf]
 11: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x2d15) [0x82ab15]
 12: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x692) [0x7c6142]
 13: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x1ca) [0x651e0a]
 14: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x628) [0x6528b8]
 15: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x6831fc]
 16: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0xa66456]
 17: (ThreadPool::WorkThread::entry()+0x10) [0xa68260]
 18: (()+0x7e9a) [0x7fe3d8df0e9a]
 19: (clone()+0x6d) [0x7fe3d73b13fd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
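As the NOTE says, interpreting the raw frames needs the matching executable. A minimal sketch of resolving the ceph-osd frame addresses with binutils' addr2line (the binary path is an assumption, and it must come from the same 0.80-469-g991f7f1 build; the libc/libstdc++ frames would need to be resolved against those libraries instead):

import subprocess

# In-binary addresses from the ceph-osd frames above (1, 9 and 10);
# the abort/terminate frames live in libc and libstdc++.
addrs = ['0x99479a', '0xa75c3f', '0x820fbf']

# Assumes a ceph-osd binary with debug symbols from the same build
# (path is illustrative, e.g. from the matching ceph-dbg package).
out = subprocess.check_output(
    ['addr2line', '-C', '-f', '-e', '/usr/bin/ceph-osd'] + addrs)
print(out.decode())  # one "function" / "file:line" pair per address

The assert in frame 9 firing from ReplicatedPG::execute_ctx (frame 10) is what the file:line output would pinpoint, which is the actual crash site behind the scrub failures above.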