Bug #6220


nightlies: upgrade-parallel suite has tests that are hung at ceph quorum status or ceph health commands

Added by Tamilarasi muthamizhan over 10 years ago. Updated about 10 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

logs: ubuntu@teuthology:/a/teuthology-2013-09-02_01:35:04-upgrade-parallel-next-testing-basic-vps


17484       (pid 25220) 2013-09-03T15:26:24.728 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 25 pgs degraded; 23 pgs stale; 23 pgs stuck stale; 25 pgs stuck unclean; 4 requests are blocked > 32 sec; recovery 5233/20522 degraded (25.499%); mds cluster is degraded 
17486       (pid 29309) 2013-09-03T15:26:25.526 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 3 pgs peering; 3 pgs stale; 3 pgs stuck inactive; 3 pgs stuck stale; 3 pgs stuck unclean; 7 requests are blocked > 32 sec 
17489       (pid 6362) 2013-09-03T15:26:25.253 INFO:teuthology.task.workunit.client.0.err:[10.214.138.104]: 2013-09-03 22:26:39.606465 7f8b21fa3700  0 -- 10.214.138.104:0/1004924 >> 10.214.138.110:6789/0 pipe(0x7f8b00008ec0 sd=8 :0 s=1 pgs=0 cs=0 l=1 c=0x7f8b00009f90).fault 
17501       (pid 27105) 2013-09-03T15:26:24.767 DEBUG:teuthology.task.ceph:Quorum: [u'a', u'b', u'c'] 
17506       (pid 5083) 2013-09-03T15:26:24.781 DEBUG:teuthology.task.ceph:Quorum: [u'a', u'b', u'c'] 
17521       (pid 32261) 2013-09-03T15:26:24.806 DEBUG:teuthology.task.ceph:Quorum: [u'a', u'b', u'c'] 
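The hangs happen in teuthology's health-polling loop, which repeatedly runs `ceph health` and waits for the cluster to report healthy. A minimal sketch of that check (hypothetical helper name; the real polling logic lives in `teuthology.misc`), assuming the convention visible in the log lines above that the first token of the output is the overall health state:

```python
def is_healthy(health_output: str) -> bool:
    """Return True only when `ceph health` reports HEALTH_OK.

    Any other state (e.g. HEALTH_WARN with 'pgs stuck unclean' or
    blocked requests, as in the log excerpt above) keeps the poll
    loop waiting -- so a cluster that never converges makes the
    test appear hung at the health check.
    """
    if not health_output:
        return False
    return health_output.split(None, 1)[0] == "HEALTH_OK"
```

A cluster stuck in `HEALTH_WARN` with stale/unclean PGs, as seen in jobs 17484 and 17486, never satisfies this predicate, which matches the observed symptom of jobs blocked at the health command.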

ubuntu@teuthology:/a/teuthology-2013-09-02_01:35:04-upgrade-parallel-next-testing-basic-vps/17484$ cat config.yaml 
kernel: &id001
  kdb: true
  sha1: 263cbbcaf605e359a46e30889595d82629f82080
machine_type: vps
nuke-on-error: true
os_type: ubuntu
os_version: '12.04'
overrides:
  admin_socket:
    branch: next
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
    log-whitelist:
    - slow request
    sha1: e48d6cb4023fb3735e9c4288f5d5c7bac44eadde
  ceph-deploy:
    branch:
      dev: next
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
  install:
    ceph:
      sha1: e48d6cb4023fb3735e9c4288f5d5c7bac44eadde
  s3tests:
    branch: next
  workunit:
    sha1: e48d6cb4023fb3735e9c4288f5d5c7bac44eadde
roles:
- - mon.a
  - mds.a
  - osd.0
  - osd.1
- - mon.b
  - mon.c
  - osd.2
  - osd.3
- - client.0
targets:
  ubuntu@vpm117.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCmsWHlifzoOiRR0foj3bziCIH2g4C39aJZhON217owgXwxNaOn9DzqiXZiUGBu6HuB0Go0Tw+Kd8M402LZpjA8oB8uiRb75IN4yTmCuvL0CJdH32BrAWBMfQFikNWv4eKWMROTD5wofG2CVXw8uuncvRWAsNuoiqJh+DSCjI7yI46SdZxOsEWW97qSBjWp8NvpBzmYdxf7iovFpP3dFziClsvoxz1oSHtiXrJYulvU1TyD+9jSuqjYzZMc4/hXoqYjQg/NReaYaeQfaZcha/9RyK54I0El924QP4sGrvCS2HKzKgG12QZpMhE1tdws51urlEzsAxJsvLNzLM9qJTox
  ubuntu@vpm119.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDcXWXK3Rv7gMBMZ3SAWfGDURTeRhfp6mABlWZXXjzIJAIUFTkQ6Gw9aMQ1oYGSAWWVpJMWsxZYnd7g4KF/+eg3xjB9sTjbgQTs/IvjZLGEtVPyhyy9xEBmwvpBFiRW0B6N7/5bRNYbp55f5q9lAC9PTyh+eFyUwrPaGGvWvb3zaTmxbIz0l6UUBjfFSjjsPb3hL6yd/LGJ0W7JOW+WBx4dpEC+lK+sTFHUK595Im6d9iZ2b89MjokiYuQ/LlzYcCAWD/OFLy1s8J+yzZKOJ5k+2WgUxr/8ruWQZIrso2/2w7Wa5FEXhbfbGABxrSoKdkdqy3VT9mpspou9UtcUV+11
  ubuntu@vpm120.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCacfuCFCKth37etaO/Hdq98hQ5IAu5HPhWSpNSf7YqCCu/6F3MJTS7zZnUN2cpFKMQbgO+SnKZ4pvyZY+QoUuyltrEM3B8iRjiAw2i2ALmdrgPnfkuqyU345I4GgbAdmJqSO/HyVLz3Hy1nMgNR19fsbKfkl3IdtjArc/eNs2ReHz+03u0oaRQ7n+9AqdwSaeebHbdHtQf66+JrLO05ihg1MvmfisP2ZwHQNCQ5i/DOTVDDTDTcWlj9zlOpQnNoRGwvd/IkrX1rknnSbcthN8Z29uUFLslDDmAmRPgnShlud2CrAGcI3ypDWm8cRPzKAdYph4QCUqk2dtszVZYAOzr
tasks:
- internal.lock_machines:
  - 3
  - vps
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    branch: dumpling
- ceph: null
- parallel:
  - workload
  - upgrade-sequence
teuthology_branch: next
upgrade-sequence:
  sequential:
  - install.upgrade:
      all:
        branch: next
  - ceph.restart:
    - mds.a
  - sleep:
      duration: 60
  - ceph.restart:
      daemons:
      - mon.a
      wait-for-healthy: false
      wait-for-osds-up: true
  - sleep:
      duration: 60
  - ceph.restart:
      daemons:
      - mon.b
      wait-for-healthy: false
      wait-for-osds-up: true
  - sleep:
      duration: 60
  - ceph.restart:
    - mon.c
  - sleep:
      duration: 60
  - ceph.restart:
    - osd.0
  - sleep:
      duration: 60
  - ceph.restart:
    - osd.1
  - sleep:
      duration: 60
  - ceph.restart:
    - osd.2
  - sleep:
      duration: 60
  - ceph.restart:
    - osd.3
workload:
  workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/load-gen-mix.sh

Actions #1

Updated by Sage Weil over 10 years ago

  • Priority changed from Normal to Urgent
Actions #2

Updated by Tamilarasi muthamizhan over 10 years ago

  • Priority changed from Urgent to Normal

Most of these errors, where the ceph health commands hang, are due to sync issues on the VMs used for testing [bug #6082].

Actions #3

Updated by Zack Cerza about 10 years ago

  • Status changed from New to Closed

I don't see how this is a teuthology bug. If it truly is, let's open a new one.
