Bug #6931

closed

upgrade from cuttlefish to dumpling to emperor hung on ceph health

Added by Tamilarasi muthamizhan over 10 years ago. Updated over 10 years ago.

Status: Resolved
Priority: Urgent
Category: -
% Done: 100%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

logs: ubuntu@teuthology:/a/teuthology-2013-12-03_19:35:19-upgrade-dumpling-dumpling-testing-basic-plana/129646

2013-12-03T23:00:23.839 INFO:teuthology.task.ceph.mon.c:Started
2013-12-03T23:00:23.839 INFO:teuthology.task.ceph:Waiting until ceph is healthy...
2013-12-03T23:00:23.839 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd dump --format=json'
2013-12-03T23:00:24.053 INFO:teuthology.task.ceph.mon.c.out:[10.214.132.23]: starting mon.c rank 2 at 10.214.132.23:6790/0 mon_data /var/lib/ceph/mon/ceph-c fsid 87510398-5182-466b-b803-700e8cf3be15
2013-12-03T23:00:24.058 DEBUG:teuthology.misc:6 of 6 OSDs are up
2013-12-03T23:00:24.058 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2013-12-03T23:00:24.438 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN too few pgs per osd (12 < min 20)
2013-12-03T23:00:25.438 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2013-12-03T23:00:28.661 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN too few pgs per osd (12 < min 20)
2013-12-03T23:00:29.661 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2013-12-03T23:00:29.879 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN too few pgs per osd (12 < min 20)
2013-12-03T23:00:30.879 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2013-12-03T23:00:31.098 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN too few pgs per osd (12 < min 20)
2013-12-03T23:00:32.099 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2013-12-03T23:00:32.315 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN too few pgs per osd (12 < min 20)
2013-12-03T23:00:33.315 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2013-12-03T23:00:33.535 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN too few pgs per osd (12 < min 20)
2013-12-03T23:00:34.535 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2013-12-03T23:00:34.753 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN too few pgs per osd (12 < min 20)
2013-12-03T23:00:35.753 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2013-12-03T23:00:35.975 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN too few pgs per osd (12 < min 20)
2013-12-03T23:00:36.975 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2013-12-03T23:00:37.194 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN too few pgs per osd (12 < min 20)
2013-12-03T23:00:38.194 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2013-12-03T23:00:38.411 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN too few pgs per osd (12 < min 20)
2013-12-03T23:00:39.412 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2013-12-03T23:00:39.635 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN too few pgs per osd (12 < min 20)
2013-12-03T23:00:40.635 DEBUG:teuthology.orchestra.run:Running [10.214.132.15]: '/home/ubuntu/cephtest/adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2013-12-03T23:00:40.854 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN too few pgs per osd (12 < min 20)

config file:
ubuntu@teuthology:/a/teuthology-2013-12-03_19:35:19-upgrade-dumpling-dumpling-testing-basic-plana/129646$ cat config.yaml 
archive_path: /var/lib/teuthworker/archive/teuthology-2013-12-03_19:35:19-upgrade-dumpling-dumpling-testing-basic-plana/129646
description: upgrade-dumpling/fs/{0-cluster/start.yaml 1-dumpling-install/cuttlefish.v0.67.1.yaml
  2-workload/blogbench.yaml 3-upgrade-sequence/upgrade-mds-mon-osd.yaml 4-final/osdthrash.yaml}
email: null
job_id: '129646'
kernel: &id001
  kdb: true
  sha1: 2bb68a10807d507f4098709f890b36da2565d4d6
last_in_suite: false
machine_type: plana
name: teuthology-2013-12-03_19:35:19-upgrade-dumpling-dumpling-testing-basic-plana
nuke-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: dumpling
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug ms: 1
        debug osd: 5
    fs: xfs
    log-whitelist:
    - slow request
    - scrub
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: 8cd33e3a8ebf7c2aa796ec9f92d6b554c39ff705
  ceph-deploy:
    branch:
      dev: dumpling
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
  install:
    ceph:
      sha1: 8cd33e3a8ebf7c2aa796ec9f92d6b554c39ff705
  s3tests:
    branch: dumpling
  workunit:
    sha1: 8cd33e3a8ebf7c2aa796ec9f92d6b554c39ff705
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mon.c
  - osd.3
  - osd.4
  - osd.5
  - client.0
targets:
  ubuntu@plana55.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDm6DEN/zsqfanzvAKl0lINJxesTzb7ejxRCDbrOH7tj47FUayjsmBssi16Rxd8bBlleGeZ4DsNcgIdO1euNIT6PyPMowH4blJHdAPoQVH3q1fuvKbLX9TxQk4BeqhL2HaVVtP1/gMT9KJicpGTbLzkDV+OnR3mGhwbM1hQGHR5+F9+gwcZqosJDFtHVmf5027Q027d0FEZaIra3TepjHgqDKXtFOZvfFCVX9QKj9kKT6EhuGnYmsAQNJsEGf5JOKzucmdyto520YTr9y7sIoJMqCaBjhWjjRmJNZROVebvXkckXNdAw6Qzi1m16UHo01SECovD3vaLWC5Lv+T29rOL
  ubuntu@plana63.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDXFu3H/FZPa959qQb84rw3UusSzxbN1gK+JkuL7U6tL4ebmbFpEGOigEQM+qUIPcjiQw8/AJDZ17WbnQfZ90Ffv0MCoSsYRq2AefBOPaUcEuCj/QhoI+3E7a6XU/HSrA9V02EIur2Oe3cZAX6tv1PDylNLC5cTpWHr5LGO/DsNEKKhMPjCyfzBzGDTuo4sxtcK8d370Y+QHiJPtjJAbYkb/DeqT2py8ST6N8qWZgDJK2zDr4OxWz90GkBgCxWdCE0/D/hs4k82PDfB2Zd2CIpkl4CB96wzavBzrgr9wiVotrAEJFtxNDYHMLtbRg9d8GzxbcjxH8OkxOsgsgRB1SKj
tasks:
- internal.lock_machines:
  - 2
  - plana
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    branch: cuttlefish
- ceph: null
- install.upgrade:
    all:
      tag: v0.67.1
- ceph.restart: null
- parallel:
  - workload
  - upgrade-sequence
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- ceph-fuse: null
- workunit:
    clients:
      all:
      - suites/iogen.sh
teuthology_branch: dumpling
upgrade-sequence:
  sequential:
  - install.upgrade:
      all:
        branch: emperor
  - ceph.restart:
    - mds.a
  - sleep:
      duration: 60
  - ceph.restart:
    - mon.a
  - sleep:
      duration: 60
  - ceph.restart:
    - mon.b
  - sleep:
      duration: 60
  - ceph.restart:
    - mon.c
  - sleep:
      duration: 60
  - ceph.restart:
    - osd.0
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.1
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.2
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.3
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.4
  - sleep:
      duration: 30
  - ceph.restart:
    - osd.5
verbose: true
workload:
  workunit:
    clients:
      all:
      - suites/blogbench.sh

Actions #1

Updated by Greg Farnum over 10 years ago

  • Project changed from Ceph to teuthology
  • Status changed from New to Need More Info
  • Assignee set to Tamilarasi muthamizhan

This is Ceph behaving as designed and warning about a probable data imbalance. The test needs to be fixed, either by whitelisting this warning or by creating more PGs in the older versions. Unless I'm missing something?

Actions #2

Updated by Greg Farnum over 10 years ago

Or, as Sage suggested on IRC, set mon_pg_warn_min_per_osd to a lower number in the monitor configs.
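
For illustration, here is a minimal sketch of how such an override might look in a job config like the one in the description. It assumes the option can be set under the mon section of overrides the same way as the existing "debug mon" entries. The value 10 is an assumption, picked only because it is below the 12 PGs per OSD reported in the log; it is not necessarily the value used in the eventual fix.

```yaml
overrides:
  ceph:
    conf:
      mon:
        # Assumed value for illustration: the log reports 12 PGs per OSD
        # against a minimum of 20, so a minimum at or below 12 should keep
        # the "too few pgs per osd" warning from being raised in this run.
        mon pg warn min per osd: 10
```

With the threshold lowered, the "Waiting until ceph is healthy" loop would no longer be blocked by this particular warning, though any other health warning would still hold it up.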

Actions #3

Updated by Sandon Van Ness over 10 years ago

  • Status changed from Need More Info to Resolved
  • % Done changed from 0 to 100

Added the mon_pg_warn_min_per_osd setting to the ceph.conf template in teuthology for cuttlefish/dumpling. I will reopen this if the issue happens again.
