Bug #9837

closed

Bug #9288: "Assertion `nlock == 0' failed" in upgrade:firefly-firefly-testing-basic-vps suite

rbd crash when upgrading from v0.80.5 to firefly

Added by Tamilarasi muthamizhan over 9 years ago. Updated over 9 years ago.

Status: Duplicate
Priority: High
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

logs: ubuntu@teuthology:/a/teuthology-2014-10-17_23:30:01-upgrade:firefly:newer-firefly-distro-basic-vps/555359

2014-10-18T21:57:49.898 INFO:tasks.workunit.client.0.vpm080.stderr:+ cmp /tmp/img /tmp/img2
2014-10-18T21:58:12.399 INFO:tasks.workunit.client.0.vpm080.stderr:+ cmp /tmp/img /tmp/img3
2014-10-18T21:58:46.441 INFO:tasks.workunit.client.0.vpm080.stderr:+ rm /tmp/img2 /tmp/img3
2014-10-18T21:58:46.708 INFO:tasks.workunit.client.0.vpm080.stderr:+ rbd import --new-format - testimg
2014-10-18T21:58:50.344 INFO:tasks.workunit.client.0.vpm080.stderr:+ rbd export testimg /tmp/img2
2014-10-18T21:58:51.951 INFO:tasks.workunit.client.0.vpm080.stderr:^MExporting image: 9% complete...^MExporting image: 99% complete...^MExporting image: 100% complete...done.
2014-10-18T21:58:51.953 INFO:tasks.workunit.client.0.vpm080.stderr:+ rbd export testimg -
2014-10-18T21:58:51.975 INFO:tasks.workunit.client.0.vpm080.stderr:*** Caught signal (Segmentation fault) **
2014-10-18T21:58:51.975 INFO:tasks.workunit.client.0.vpm080.stderr: in thread 7f8750d07760
2014-10-18T21:58:52.055 INFO:tasks.workunit.client.0.vpm080.stderr: ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
2014-10-18T21:58:52.055 INFO:tasks.workunit.client.0.vpm080.stderr: 1: rbd() [0x420d22]
2014-10-18T21:58:52.055 INFO:tasks.workunit.client.0.vpm080.stderr: 2: (()+0xf030) [0x7f874f5e3030]
2014-10-18T21:58:52.056 INFO:tasks.workunit.client.0.vpm080.stderr: 3: (std::_Rb_tree<std::string, std::pair<std::string const, ceph_mon_subscribe_item>, std::_Select1st<std::pair<std::string const, ceph_mon_subscribe_item> >, std::less<std::string>, std::allocator<std::pair<std::string const, ceph_mon_subscribe_item> > >::find(std::string const&) const+0x4f) [0x7f87508a73bf]
2014-10-18T21:58:52.056 INFO:tasks.workunit.client.0.vpm080.stderr: 4: (MonClient::_sub_want(std::string, unsigned long, unsigned int)+0x3f) [0x7f87508ab21f]
2014-10-18T21:58:52.056 INFO:tasks.workunit.client.0.vpm080.stderr: 5: (MonClient::authenticate(double)+0x74) [0x7f874fb90244]
2014-10-18T21:58:52.057 INFO:tasks.workunit.client.0.vpm080.stderr: 6: (librados::RadosClient::connect()+0x5f4) [0x7f874fa7f9a4]
2014-10-18T21:58:52.057 INFO:tasks.workunit.client.0.vpm080.stderr: 7: (main()+0x12c2) [0x410912]
2014-10-18T21:58:52.057 INFO:tasks.workunit.client.0.vpm080.stderr: 8: (__libc_start_main()+0xfd) [0x7f874e71bead]
2014-10-18T21:58:52.057 INFO:tasks.workunit.client.0.vpm080.stderr: 9: rbd() [0x415c09]
2014-10-18T21:58:52.058 INFO:tasks.workunit.client.0.vpm080.stderr:2014-10-19 04:58:52.053862 7f8750d07760 -1 *** Caught signal (Segmentation fault) **
2014-10-18T21:58:52.058 INFO:tasks.workunit.client.0.vpm080.stderr: in thread 7f8750d07760
2014-10-18T21:58:52.058 INFO:tasks.workunit.client.0.vpm080.stderr:
2014-10-18T21:58:52.058 INFO:tasks.workunit.client.0.vpm080.stderr: ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6)
2014-10-18T21:58:52.059 INFO:tasks.workunit.client.0.vpm080.stderr: 1: rbd() [0x420d22]
2014-10-18T21:58:52.059 INFO:tasks.workunit.client.0.vpm080.stderr: 2: (()+0xf030) [0x7f874f5e3030]
2014-10-18T21:58:52.059 INFO:tasks.workunit.client.0.vpm080.stderr: 3: (std::_Rb_tree<std::string, std::pair<std::string const, ceph_mon_subscribe_item>, std::_Select1st<std::pair<std::string const, ceph_mon_subscribe_item> >, std::less<std::string>, std::allocator<std::pair<std::string const, ceph_mon_subscribe_item> > >::find(std::string const&) const+0x4f) [0x7f87508a73bf]
2014-10-18T21:58:52.059 INFO:tasks.workunit.client.0.vpm080.stderr: 4: (MonClient::_sub_want(std::string, unsigned long, unsigned int)+0x3f) [0x7f87508ab21f]
2014-10-18T21:58:52.060 INFO:tasks.workunit.client.0.vpm080.stderr: 5: (MonClient::authenticate(double)+0x74) [0x7f874fb90244]
2014-10-18T21:58:52.060 INFO:tasks.workunit.client.0.vpm080.stderr: 6: (librados::RadosClient::connect()+0x5f4) [0x7f874fa7f9a4]
2014-10-18T21:58:52.060 INFO:tasks.workunit.client.0.vpm080.stderr: 7: (main()+0x12c2) [0x410912]
2014-10-18T21:58:52.060 INFO:tasks.workunit.client.0.vpm080.stderr: 8: (__libc_start_main()+0xfd) [0x7f874e71bead]
2014-10-18T21:58:52.061 INFO:tasks.workunit.client.0.vpm080.stderr: 9: rbd() [0x415c09]
2014-10-18T21:58:52.061 INFO:tasks.workunit.client.0.vpm080.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2014-10-18T21:58:52.061 INFO:tasks.workunit.client.0.vpm080.stderr:

Actions #1

Updated by Tamilarasi muthamizhan over 9 years ago

  • Subject changed from "rbd crash in firefly v0.80.5" to "rbd crash in when upgrading from v0.80.5 to firefly"

ubuntu@teuthology:/a/teuthology-2014-10-17_23:30:01-upgrade:firefly:newer-firefly-distro-basic-vps/555359$ cat orig.config.yaml 
archive_path: /var/lib/teuthworker/archive/teuthology-2014-10-17_23:30:01-upgrade:firefly:newer-firefly-distro-basic-vps/555359
branch: firefly
description: upgrade:firefly:newer/{0-cluster/start.yaml 1-install/v0.80.5.yaml 2-workload/{blogbench.yaml
  rbd.yaml s3tests.yaml testrados.yaml} 3-upgrade-sequence/upgrade-osd-mon-mds.yaml
  4-final/{monthrash.yaml osdthrash.yaml testrgw.yaml} distros/debian_7.0.yaml}
email: ceph-qa@ceph.com
job_id: '555359'
kernel:
  kdb: true
  sha1: distro
last_in_suite: false
machine_type: vps
name: teuthology-2014-10-17_23:30:01-upgrade:firefly:newer-firefly-distro-basic-vps
nuke-on-error: true
os_type: debian
os_version: '7.0'
overrides:
  admin_socket:
    branch: firefly
  ceph:
    conf:
      global:
        osd heartbeat grace: 100
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    fs: xfs
    log-whitelist:
    - slow request
    - scrub
    - scrub mismatch
    - ScrubResult
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: 5a10b95f7968ecac1f2af4abf9fb91347a290544
  ceph-deploy:
    branch:
      dev: firefly
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 5a10b95f7968ecac1f2af4abf9fb91347a290544
  rgw:
    default_idle_timeout: 1200
  s3tests:
    branch: firefly
    idle_timeout: 1200
  workunit:
    sha1: 5a10b95f7968ecac1f2af4abf9fb91347a290544
owner: scheduled_teuthology@teuthology
priority: 1000
roles:
- - mon.a
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mon.c
  - osd.3
  - osd.4
  - osd.5
- - client.0
  - client.1
suite: upgrade:firefly:newer
suite_branch: firefly
suite_path: /var/lib/teuthworker/src/ceph-qa-suite_firefly
tasks:
- chef: null
- clock.check: null
- install:
    tag: v0.80.5
- ceph: null
- parallel:
  - workload
  - upgrade-sequence
- sequential:
  - mon_thrash:
      revive_delay: 20
      thrash_delay: 1
  - ceph-fuse: null
  - workunit:
      clients:
        client.0:
        - suites/dbench.sh
- sequential:
  - thrashosds:
      chance_pgnum_grow: 1
      chance_pgpnum_fix: 1
      timeout: 1200
  - ceph-fuse:
    - client.0
  - workunit:
      clients:
        client.0:
        - suites/iogen.sh
- sequential:
  - rgw:
    - client.1
  - s3readwrite:
      client.0:
        readwrite:
          bucket: rwtest
          duration: 300
          files:
            num: 10
            size: 2000
            stddev: 500
          readers: 10
          writers: 3
        rgw_server: client.1
teuthology_branch: master
tube: vps
upgrade-sequence:
  sequential:
  - install.upgrade:
      all:
        branch: firefly
  - ceph.restart:
    - osd.0
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.1
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.2
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.3
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.4
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.5
  - sleep:
      duration: 60
  - ceph.restart:
    - mon.a
  - sleep:
      duration: 60
  - ceph.restart:
    - mon.b
  - sleep:
      duration: 60
  - ceph.restart:
    - mon.c
  - sleep:
      duration: 60
  - ceph.restart:
    - mds.a
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.vps.3038
workload:
  sequential:
  - workunit:
      clients:
        client.0:
        - suites/blogbench.sh
  - workunit:
      clients:
        client.0:
        - rbd/import_export.sh
      env:
        RBD_CREATE_ARGS: --new-format
  - workunit:
      clients:
        client.0:
        - cls/test_cls_rbd.sh
  - rgw:
    - client.0
  - s3tests:
      client.0:
        force-branch: firefly-original
        rgw_server: client.0
  - rados:
      clients:
      - client.0
      objects: 50
      op_weights:
        delete: 50
        read: 100
        rollback: 50
        snap_create: 50
        snap_remove: 50
        write: 100
      ops: 2000

Actions #2

Updated by Tamilarasi muthamizhan over 9 years ago

  • Subject changed from "rbd crash in when upgrading from v0.80.5 to firefly" to "rbd crash when upgrading from v0.80.5 to firefly"

Actions #3

Updated by Tamilarasi muthamizhan over 9 years ago

  • Status changed from New to Duplicate
  • Assignee deleted (Josh Durgin)
  • Priority changed from Urgent to High
  • Parent task set to #9288

This could be the same as bug #9288. Modified the upgrade:firefly suite to NOT upgrade clients while a workload is in progress.
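
For illustration only, a change of this kind might look like the fragment below. It is a sketch, not the actual ceph-qa-suite commit, which this ticket does not show. It assumes that `install.upgrade` accepts role names in place of `all` and upgrades only the hosts carrying those roles. With the role layout from orig.config.yaml above, naming mon.a and mon.b would cover both daemon hosts and leave the client.0/client.1 host on v0.80.5 while rbd/import_export.sh is still running.

```yaml
# Hypothetical sketch, not the actual suite change.
# Assumption: listing roles instead of `all` limits the package upgrade to
# the hosts that carry those roles, so the client host is not upgraded
# underneath the running workload.
upgrade-sequence:
  sequential:
  - install.upgrade:
      mon.a:
        branch: firefly
      mon.b:
        branch: firefly
  - ceph.restart:
    - osd.0
  - sleep:
      duration: 30
  # remaining ceph.restart / sleep steps unchanged from the original sequence
```

The crash in the description is consistent with this reading: the reported version is still 0.80.5 and the backtrace ends in librados/MonClient code, which suggests an old `rbd` binary running against libraries that were replaced mid-workload.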

Actions #4

Updated by Tamilarasi muthamizhan over 9 years ago

  • Assignee set to Yuri Weinstein