Bug #5764

closed

mon: problem with pgmap upgrade_format, then sync

Added by Sage Weil almost 11 years ago. Updated almost 11 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: Monitor
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2 of 3 mons upgrade, form quorum, and are happy
the last one starts up and loads the latest (old-format) pgmap
it does a sync
it then does a refresh, tries to load the old inc maps as new format, and crashes

i think the simple fix is to detect when the in-core pgmap is format 0 but the on-disk one is format 1, then zero out the in-core pgmap so that we do a full load and ignore the bogus incrementals.
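
A minimal sketch of that check, written as a toy Python model rather than the monitor's real C++ code. The `PGMap` class, its `fmt` and `version` fields, and `reconcile_format` are all hypothetical names invented for illustration; the only thing taken from the report is the rule itself (in-core format 0 plus on-disk format 1 means reset and do a full load).

```python
from dataclasses import dataclass, field


@dataclass
class PGMap:
    """Toy in-core pgmap (hypothetical; not Ceph's actual structure)."""
    version: int = 0   # last version applied in core
    fmt: int = 0       # format the in-core state was loaded from
    pgs: dict = field(default_factory=dict)


def reconcile_format(in_core: PGMap, on_disk_format: int) -> bool:
    """Zero out the in-core map if it predates the on-disk format upgrade.

    Returns True when a reset happened, which tells the caller to do a
    full load instead of replaying the (old-format) incrementals.
    """
    if in_core.fmt == 0 and on_disk_format == 1:
        in_core.version = 0      # forget how far we had replayed
        in_core.pgs.clear()      # drop state decoded from the old format
        in_core.fmt = on_disk_format
        return True
    return False
```

In this model the reset is idempotent: once the in-core map has been zeroed and marked as format 1, a second call does nothing, so a later refresh would not trigger another full load.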

we could also clean out the old inc maps when we do the upgrade, just to be tidy.
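
The tidy-up could look roughly like the following, again as a toy Python model. The store layout here is an assumption made up for the example (integer keys for incrementals, plus `"full"` and `"format"` keys), and `upgrade_store` is a hypothetical helper, not a function from the Ceph tree.

```python
def upgrade_store(store: dict, last_committed: int, new_full: bytes) -> list:
    """Write a format-1 full map and drop the old-format incrementals.

    `store` is a hypothetical key-value store: integer keys hold
    incrementals, "full" holds the latest full map, "format" the
    on-disk format. Returns the list of incremental versions removed.
    """
    # Collect every old incremental up to and including last_committed.
    stale = sorted(
        k for k in store if isinstance(k, int) and k <= last_committed
    )
    for k in stale:
        del store[k]
    store["full"] = new_full   # full map re-encoded in the new format
    store["format"] = 1
    return stale
```

With the old incrementals gone, a monitor that syncs after the upgrade has nothing in the old format left to misread, whether or not it also performs the in-core reset.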

job was:

ubuntu@teuthology:/a/teuthology-2013-07-26_07:45:55-upgrade-parallel-next-testing-basic-plana/84390$ cat orig.config.yaml 
kernel:
  kdb: true
  sha1: 88b7f22bc0e44db48a24af23e4de3653bc44b2d2
machine_type: plana
nuke-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: next
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
    log-whitelist:
    - slow request
    sha1: 4aeb73a5e6d46f970dbd684a58f795d379a04bd9
  ceph-deploy:
    branch:
      dev: next
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
  install:
    ceph:
      sha1: 4aeb73a5e6d46f970dbd684a58f795d379a04bd9
  s3tests:
    branch: next
  workunit:
    sha1: 4aeb73a5e6d46f970dbd684a58f795d379a04bd9
roles:
- - mon.a
  - mds.a
  - osd.0
  - osd.1
- - mon.b
  - mon.c
  - osd.2
  - osd.3
- - client.0
tasks:
- chef: null
- clock.check: null
- install:
    branch: cuttlefish
- ceph: null
- parallel:
  - workload
  - upgrade-sequence
teuthology_branch: next
upgrade-sequence:
  sequential:
  - install.upgrade:
      all:
        branch: next
  - ceph.restart:
    - osd.0
  - sleep:
      duration: 60
  - ceph.restart:
    - osd.1
  - sleep:
      duration: 60
  - ceph.restart:
    - osd.2
  - sleep:
      duration: 60
  - ceph.restart:
    - osd.3
  - sleep:
      duration: 60
  - ceph.restart:
      daemons:
      - mon.a
      wait-for-healthy: false
      wait-for-osds-up: true
  - sleep:
      duration: 60
  - ceph.restart:
      daemons:
      - mon.b
      wait-for-healthy: false
      wait-for-osds-up: true
  - sleep:
      duration: 60
  - ceph.restart:
    - mon.c
  - sleep:
      duration: 60
  - ceph.restart:
    - mds.a
  - sleep:
      duration: 60
workload:
  workunit:
    branch: cuttlefish
    clients:
      client.0:
      - rados/load-gen-mix.sh

Actions #1

Updated by Sage Weil almost 11 years ago

  • Status changed from 12 to 7
Actions #2

Updated by Sage Weil almost 11 years ago

  • Status changed from 7 to Resolved