Bug #8036 (closed)

leveldb: throws std::bad_alloc on 14.04

Added by Yuri Weinstein about 10 years ago. Updated about 10 years ago.

Status:
Can't reproduce
Priority:
High
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-04-07_22:35:16-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps/177687/

The VPS appears to have run out of memory during the thrashosds upgrade test: ceph-mon aborted with std::bad_alloc inside a leveldb memtable compaction, and on the same host the ceph CLI could not dlopen librados.so.2 (ENOMEM), so 'ceph pg dump --format=json' exited with status 1 and the thrashosds task failed.

2014-04-08T02:10:47.043 INFO:teuthology.orchestra.run.err:[10.214.138.56]: marked in osd.0.
2014-04-08T02:10:47.275 INFO:teuthology.task.thrashosds.thrasher:Added osd 0
2014-04-08T02:10:52.276 INFO:teuthology.task.thrashosds.thrasher:in_osds:  [4, 1, 2, 0]  out_osds:  [5, 3] dead_osds:  [] live_osds:  [1, 4, 2, 3, 5, 0]
2014-04-08T02:10:52.276 INFO:teuthology.task.thrashosds.thrasher:choose_action: min_in 3 min_out 0 min_live 2 min_dead 0
2014-04-08T02:10:52.276 INFO:teuthology.task.thrashosds.thrasher:fixing pg num pool unique_pool_0
2014-04-08T02:10:52.277 DEBUG:teuthology.orchestra.run:Running [10.214.138.56]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph pg dump --format=json'
2014-04-08T02:10:58.356 INFO:teuthology.orchestra.run.err:[10.214.138.56]: Traceback (most recent call last):
2014-04-08T02:10:58.356 INFO:teuthology.orchestra.run.err:[10.214.138.56]:   File "/usr/bin/ceph", line 830, in <module>
2014-04-08T02:10:58.361 INFO:teuthology.orchestra.run.err:[10.214.138.56]:     sys.exit(main())
2014-04-08T02:10:58.362 INFO:teuthology.orchestra.run.err:[10.214.138.56]:   File "/usr/bin/ceph", line 590, in main
2014-04-08T02:10:58.362 INFO:teuthology.orchestra.run.err:[10.214.138.56]:     conffile=conffile)
2014-04-08T02:10:58.362 INFO:teuthology.orchestra.run.err:[10.214.138.56]:   File "/usr/lib/python2.7/dist-packages/rados.py", line 208, in __init__
2014-04-08T02:10:58.701 INFO:teuthology.orchestra.run.err:[10.214.138.56]:     self.librados = CDLL(librados_path)
2014-04-08T02:10:58.701 INFO:teuthology.orchestra.run.err:[10.214.138.56]:   File "/usr/lib/python2.7/ctypes/__init__.py", line 365, in __init__
2014-04-08T02:10:59.537 INFO:teuthology.orchestra.run.err:[10.214.138.56]:     self._handle = _dlopen(self._name, mode)
2014-04-08T02:10:59.537 INFO:teuthology.orchestra.run.err:[10.214.138.56]: OSError: librados.so.2: cannot map zero-fill pages: Cannot allocate memory
2014-04-08T02:12:01.166 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]: terminate called after throwing an instance of 'std::bad_alloc'
2014-04-08T02:12:01.166 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:   what():  std::bad_alloc
2014-04-08T02:12:01.166 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]: *** Caught signal (Aborted) **
2014-04-08T02:12:01.166 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  in thread 7febfedec700
2014-04-08T02:12:01.472 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  ceph version 0.79-42-g010dff1 (010dff12c38882238591bb042f8e497a1f7ba020)
2014-04-08T02:12:01.472 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  1: ceph-mon() [0x86967f]
2014-04-08T02:12:01.472 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  2: (()+0x10340) [0x7fec066ae340]
2014-04-08T02:12:01.472 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  3: (gsignal()+0x39) [0x7fec04982f79]
2014-04-08T02:12:01.472 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  4: (abort()+0x148) [0x7fec04986388]
2014-04-08T02:12:01.473 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fec0528e6b5]
2014-04-08T02:12:01.473 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  6: (()+0x5e836) [0x7fec0528c836]
2014-04-08T02:12:01.473 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  7: (()+0x5e863) [0x7fec0528c863]
2014-04-08T02:12:01.473 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  8: (()+0x5eaa2) [0x7fec0528caa2]
2014-04-08T02:12:01.473 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  9: (()+0x12c6e) [0x7fec068cec6e]
2014-04-08T02:12:01.473 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  10: (tc_new()+0x1e0) [0x7fec068eeb60]
2014-04-08T02:12:01.474 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  11: (std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&)+0x59) [0x7fec052e83b9]
2014-04-08T02:12:01.474 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  12: (std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long)+0x1b) [0x7fec052e8f7b]
2014-04-08T02:12:01.474 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  13: (std::string::reserve(unsigned long)+0x34) [0x7fec052e9014]
2014-04-08T02:12:01.474 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  14: (std::string::append(unsigned long, char)+0x46) [0x7fec052e93d6]
2014-04-08T02:12:01.474 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  15: (leveldb::TableBuilder::WriteBlock(leveldb::BlockBuilder*, leveldb::BlockHandle*)+0x75) [0x7fec05567295]
2014-04-08T02:12:01.474 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  16: (leveldb::TableBuilder::Flush()+0x5c) [0x7fec0556740c]
2014-04-08T02:12:01.475 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  17: (leveldb::TableBuilder::Add(leveldb::Slice const&, leveldb::Slice const&)+0xb7) [0x7fec05567597]
2014-04-08T02:12:01.475 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  18: (leveldb::BuildTable(std::string const&, leveldb::Env*, leveldb::Options const&, leveldb::TableCache*, leveldb::Iterator*, leveldb::FileMetaData*)+0x27e) [0x7fec05543bee]
2014-04-08T02:12:01.475 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  19: (leveldb::DBImpl::WriteLevel0Table(leveldb::MemTable*, leveldb::VersionEdit*, leveldb::Version*)+0x104) [0x7fec05549704]
2014-04-08T02:12:01.475 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  20: (leveldb::DBImpl::CompactMemTable()+0xe3) [0x7fec0554aec3]
2014-04-08T02:12:01.476 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  21: (leveldb::DBImpl::BackgroundCompaction()+0x36) [0x7fec0554be16]
2014-04-08T02:12:01.476 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  22: (leveldb::DBImpl::BackgroundCall()+0x62) [0x7fec0554c9b2]
2014-04-08T02:12:01.476 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  23: (()+0x38b3b) [0x7fec0556ab3b]
2014-04-08T02:12:01.476 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  24: (()+0x8182) [0x7fec066a6182]
2014-04-08T02:12:01.477 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  25: (clone()+0x6d) [0x7fec04a4730d]
2014-04-08T02:12:01.477 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]: 2014-04-08 09:12:01.463118 7febfedec700 -1 *** Caught signal (Aborted) **
2014-04-08T02:12:01.477 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  in thread 7febfedec700
2014-04-08T02:12:01.477 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]: 
2014-04-08T02:12:01.477 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  ceph version 0.79-42-g010dff1 (010dff12c38882238591bb042f8e497a1f7ba020)
2014-04-08T02:12:01.478 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  1: ceph-mon() [0x86967f]
2014-04-08T02:12:01.478 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  2: (()+0x10340) [0x7fec066ae340]
2014-04-08T02:12:01.478 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  3: (gsignal()+0x39) [0x7fec04982f79]
2014-04-08T02:12:01.478 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  4: (abort()+0x148) [0x7fec04986388]
2014-04-08T02:12:01.479 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fec0528e6b5]
2014-04-08T02:12:01.479 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  6: (()+0x5e836) [0x7fec0528c836]
2014-04-08T02:12:01.479 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  7: (()+0x5e863) [0x7fec0528c863]
2014-04-08T02:12:01.479 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  8: (()+0x5eaa2) [0x7fec0528caa2]
2014-04-08T02:12:01.479 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  9: (()+0x12c6e) [0x7fec068cec6e]
2014-04-08T02:12:01.480 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  10: (tc_new()+0x1e0) [0x7fec068eeb60]
2014-04-08T02:12:01.480 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  11: (std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&)+0x59) [0x7fec052e83b9]
2014-04-08T02:12:01.480 INFO:teuthology.task.ceph.mon.a.err:[10.214.138.56]:  12: (std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long)+0x1b) [0x7fec052e8f7b]
2014-04-08T02:14:21.939 ERROR:teuthology.run_tasks:Manager failed: thrashosds
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-firefly/teuthology/run_tasks.py", line 92, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/teuthology-firefly/teuthology/task/thrashosds.py", line 172, in task
    thrash_proc.do_join()
  File "/home/teuthworker/teuthology-firefly/teuthology/task/ceph_manager.py", line 153, in do_join
    self.thread.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
CommandFailedError: Command failed on 10.214.138.56 with status 1: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph pg dump --format=json'
archive_path: /var/lib/teuthworker/archive/teuthology-2014-04-07_22:35:16-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps/177687
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml
  2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/rados_api_tests.yaml
  6-next-mon/monb.yaml 7-workload/rados_api_tests.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml
  rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/ubuntu_14.04.yaml}
email: null
job_id: '177687'
kernel: &id001
  kdb: true
  sha1: distro
last_in_suite: false
machine_type: vps
name: teuthology-2014-04-07_22:35:16-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps
nuke-on-error: true
os_type: ubuntu
os_version: '14.04'
overrides:
  admin_socket:
    branch: firefly
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: 010dff12c38882238591bb042f8e497a1f7ba020
  ceph-deploy:
    branch:
      dev: firefly
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 010dff12c38882238591bb042f8e497a1f7ba020
  s3tests:
    branch: master
  workunit:
    sha1: 010dff12c38882238591bb042f8e497a1f7ba020
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mon.b
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - osd.3
  - osd.4
  - osd.5
  - mon.c
- - client.0
targets:
  ubuntu@vpm031.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDq7gmPqczEb6bxQUuUXQFnR2z6vfoN2b7ICm7PljWcJH5vvT3dyid6rrbKq/I8zHWFYa7uBu0VEztFc1VkCwqpQwhrWnDM6xni7mKGLwMHfYX8+6BVCIqjesmQIaISRYFYIAiOeiHJFdmP+5B2hrQPkagvW59pqHESqJACjxHQ6FmOnUxk5oTNQSQJVIbxsYzqodh5jX46ZVrbDHb1v+YjBU2wieyJuA9Pua7g5seOOoeJ2e+ty2nlRjfhpmwZvXh0wMZhBbOaNUVJYouMx3l92a0bGYD/PXdcdC/bBFFHGTKI7BaA4snhR8pkI8hKosbckOFxXcFzFtfHkEYsEssH
  ubuntu@vpm032.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDK/wagN/I7tt/S7YeIvefzygjStwb2VyzJCjuXSpm9gnOWVC7xKMGG4oHM30pV/+C0VWYRePZqbPGO9+Qf5CDffuYVMJCTBOlGtHB7KyDxaoFBY4CKWrg2st/uDxXaoNkE1c8MgVglFOsOtmWS4lAPlbff0OL2a6FcnTRidXDo+5zvqWg1WArPGghNTzwJ73jk9zACFaiisQZx8Hd+ZM6Gz7V8SmcXEkNEHp9fJJsTWy+rh1b0yQTCKWvsJjj1O0ykwPdB/cnHigzuzPPJOxgpNWiRoswo74lC2d5iUd4yB9Vfirpj2/a60/r/CWP2Fy16lG6Xo1C+U3AkEY14cdvB
  ubuntu@vpm033.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCZGaf59QU1K0RVezxHArei9Y+UDyau5D7V8GqBYOMRMJ9E90vYvNw2dJZZI3C1Oj0SNc/BdjAlfpW/aRrYQ2xx8bCyvY3m6u3pocqO2EYfU8/wEaOc5THzsJvz6zxkKdhGl3BSs1w38qIwvxZAxDbAqelexzVdnQ1AAIkOXDU++uueTqPcvNFOzXegfbMoMp7yql2dbYUExkNTWJPhGRCSYa0zGKdiGTPOUqInsWkamaQPZy3SzMgB8Xjxs8E5joxggy+TxDMyP2VYH4gMgJIwfI2sHTv2/H6pJKeDEMyh1vduh5k1S+jtSu+TiTtD9rnlpQvrDf2Tz7Mi+vLe/3mT
tasks:
- internal.lock_machines:
  - 3
  - vps
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0: null
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    thrash_primary_affinity: false
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- install.upgrade:
    mon.c: null
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rbd/test_librbd_python.sh
- rgw:
    client.0:
      idle_timeout: 120
- swift:
    client.0:
      rgw_server: client.0
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
teuthology_branch: firefly
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.vps.17019
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml
  2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/rados_api_tests.yaml
  6-next-mon/monb.yaml 7-workload/rados_api_tests.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml
  rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/ubuntu_14.04.yaml}
duration: 14274.8196849823
failure_reason: 'Command failed on 10.214.138.56 with status 1: ''adjust-ulimits ceph-coverage
  /home/ubuntu/cephtest/archive/coverage ceph pg dump --format=json'''
flavor: basic
owner: scheduled_teuthology@teuthology
success: false
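
For context on the OSError near the top of the log: /usr/bin/ceph goes through the Python bindings, and rados.py (line 208 in the traceback) loads the shared library with ctypes.CDLL, i.e. a dlopen() call; with the VPS out of memory, dlopen() could not map the library's zero-fill pages and failed with ENOMEM. A minimal sketch of that load path, assuming a simplified library lookup and error handling that differ from the real rados.py, is:

# Sketch of the librados load path shown in the traceback above; the path
# lookup and error handling are illustrative assumptions, not the actual
# rados.py code.
from ctypes import CDLL
from ctypes.util import find_library

librados_path = find_library('rados') or 'librados.so.2'
try:
    # CDLL() ends up in _dlopen(), which maps the shared object into memory;
    # with the host out of memory this step raised
    # "cannot map zero-fill pages: Cannot allocate memory" (ENOMEM).
    librados = CDLL(librados_path)
except OSError as exc:
    raise SystemExit('failed to load librados: %s' % exc)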

Related issues (1 total: 0 open, 1 closed)

Has duplicate: Ceph - Bug #8067: mon: enomem on vps, killed at ~800MB (Duplicate, added 04/10/2014)
