Bug #6057

closed

osd: log bound mismatch after bobtail -> dumpling -> next upgrade

Added by Sage Weil over 10 years ago. Updated over 10 years ago.

Status: Resolved
Priority: High
Assignee:
Category: OSD
Target version: -
% Done: 100%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

"2013-08-19 06:24:27.814763 osd.2 0.0.0.0:6808/24619 1 : [ERR] 3.4 log bound mismatch, info (0'0,30'164] actual [21'1,23'162]" in cluster log
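
For triaging the repeated failures, the bounds in this message can be pulled out and compared mechanically. The following is a minimal Python sketch, not Ceph code: the names `Eversion`, `parse_mismatch`, and `bounds_agree` are hypothetical, and the consistency rule (the last actual entry should equal the upper bound in `info`, and the first actual entry should lie above the lower bound) is an assumed reading of the `(tail,head]` versus `[first,last]` notation. The pattern accepts one or more apostrophes between epoch and version.

```python
import re
from typing import NamedTuple, Optional


class Eversion(NamedTuple):
    """An epoch'version pair as printed in the log, e.g. 30'164."""
    epoch: int
    version: int


def parse_eversion(text: str) -> Eversion:
    # Split on one or more apostrophes so both 30'164 and 30''164 parse.
    epoch, version = re.split(r"'+", text)
    return Eversion(int(epoch), int(version))


EV = r"\d+'+\d+"
PATTERN = re.compile(
    rf"log bound mismatch, info \((?P<tail>{EV}),(?P<head>{EV})\] "
    rf"actual \[(?P<first>{EV}),(?P<last>{EV})\]"
)


def parse_mismatch(line: str) -> Optional[dict]:
    """Return the four bounds from a cluster log line, or None if absent."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return {name: parse_eversion(value) for name, value in m.groupdict().items()}


def bounds_agree(b: dict) -> bool:
    # Assumed rule: (tail, head] excludes tail and includes head,
    # so the actual entries [first, last] should satisfy both checks.
    return b["tail"] < b["first"] and b["last"] == b["head"]


line = ("2013-08-19 06:24:27.814763 osd.2 0.0.0.0:6808/24619 1 : [ERR] 3.4 "
        "log bound mismatch, info (0'0,30'164] actual [21'1,23'162]")
bounds = parse_mismatch(line)
print(bounds["head"], bounds["last"], bounds_agree(bounds))
```

Here the recorded upper bound is 30'164 while the last entry actually present is 23'162, which is why the check fails under the assumed rule.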

ubuntu@teuthology:/a/teuthology-2013-08-19_01:31:11-upgrade-next-testing-basic-plana/1453$ cat orig.config.yaml 
kernel:
  kdb: true
  sha1: 546140dd51e9ec7e34fe0b0a5814240828f68f7d
machine_type: plana
nuke-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: next
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
    log-whitelist:
    - slow request
    sha1: b007b3304c2020aa9f122ec6fef83a909053db3a
  ceph-deploy:
    branch:
      dev: next
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
  install:
    ceph:
      sha1: b007b3304c2020aa9f122ec6fef83a909053db3a
  s3tests:
    branch: next
  workunit:
    sha1: b007b3304c2020aa9f122ec6fef83a909053db3a
roles:
- - mon.a
  - mds.a
  - osd.0
  - osd.1
- - mon.b
  - mon.c
  - osd.2
  - osd.3
- - client.0
tasks:
- chef: null
- clock.check: null
- install:
    branch: bobtail
- ceph: null
- rgw: null
- s3tests:
    client.0:
      force-branch: bobtail
      rgw_server: client.0
- install.upgrade:
    all:
      branch: dumpling
- ceph.restart:
  - mon.a
  - mon.b
  - mon.c
  - mds.a
  - osd.0
  - osd.1
  - osd.2
  - osd.3
  - rgw.client.0
- s3readwrite:
    client.0:
      readwrite:
        bucket: rwtest
        duration: 300
        files:
          num: 10
          size: 2000
          stddev: 500
        readers: 10
        writers: 3
      rgw_server: client.0
- install.upgrade:
    all:
      branch: next
- ceph.restart:
  - mon.a
  - mon.b
  - mon.c
  - mds.a
  - osd.0
  - osd.1
  - osd.2
  - osd.3
  - rgw.client.0
- s3readwrite:
    client.0:
      readwrite:
        bucket: rwtest
        duration: 300
        files:
          num: 10
          size: 2000
          stddev: 500
        readers: 10
        writers: 3
      rgw_server: client.0
teuthology_branch: next

There are about 30 identical failures.


Subtasks 1 (0 open, 1 closed)

Bug #6058: upgrading from bobtail to dumpling to next: log bound mismatch and wrong node message in the logs (Duplicate, Yehuda Sadeh, 08/19/2013)

Actions #1

Updated by Tamilarasi muthamizhan over 10 years ago

  • Subject changed from "osd: log bound mismatch after bobtail -> dumpling upgrade" to "osd: log bound mismatch after bobtail -> dumpling -> next upgrade"
Actions #2

Updated by Sage Weil over 10 years ago

<initial start> as bobtail
...
2013-08-19 16:12:06.255051 7f835a7f8700 10 osd.0 pg_epoch: 27 pg[4.0( v 27'7 (0'0,27'7] local-les=12 n=3 ec=10 les/c 12/12 10/10/10) [1,0] r=1 lpr=12 luod=0'0 lcod 15'6 active] sub_op_modify_applied on 0x327ea80 op osd_sub_op(client.4110.0:5294 4.0 a45046b8/gc.12/head//4 [] v 27'7 snapset=0=[]:[] snapc=0=[$
...
<restart> at 0.67.1-7-g96d719e
...
2013-08-19 16:18:59.394638 7fcce4066780 10 osd.0 27 load_pgs loaded pg[4.0( v 27'7 (0'0,27'7] local-les=12 n=3 ec=10 les/c 12/12 10/10/10) [1,0] r=1 lpr=27 lcod 0'0 inactive NOTIFY] log(0'0,27'7]
....
2013-08-19 16:25:16.516233 7fccd90b6700 10 osd.0 pg_epoch: 34 pg[4.0( v 34'538 (0'0,34'538] local-les=32 n=3 ec=10 les/c 32/32 31/31/31) [1,0] r=1 lpr=31 pi=10-30/4 luod=0'0 lcod 34'537 active] sub_op_modify_applied on 0x2629200 op osd_sub_op(client.4716.0:38049 4.0 990e66d8/gc.0/head//4 [] v 34'538 snapse$
...
<restart> as 0.67-238-g68c1c70
...
2013-08-19 16:38:42.403570 7f0b8b909780 10 osd.0 34 load_pgs loaded pg[4.0( v 34'538 (0'0,34'538] local-les=32 n=3 ec=10 les/c 32/32 31/31/31) [1,0] r=1 lpr=34 pi=10-30/4 (log bound mismatch, actual=[13'1,27'7]) lcod 0'0 inactive NOTIFY] log(0'0,34'538]
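
The two `load_pgs` lines above can be compared directly. The sketch below is a hypothetical Python helper (`summarize` and both regular expressions are illustrative and assume the line layout seen in these excerpts). It only restates what the log shows and is not a diagnosis: after the second restart, the last entry actually found in the log is the same version the PG reported right after the first restart.

```python
import re

# Assumed layout: "load_pgs loaded pg[<pgid>( v <epoch>'<version> ..."
HEAD_RE = re.compile(r"load_pgs loaded pg\[(\S+?)\( v (\d+)'(\d+) ")
# Present only when the OSD flags a mismatch on load.
ACTUAL_RE = re.compile(r"log bound mismatch, actual=\[(\d+)'(\d+),(\d+)'(\d+)\]")


def summarize(line):
    """Return pgid, reported head, and actual last entry (or None)."""
    head = HEAD_RE.search(line)
    if head is None:
        return None
    actual = ACTUAL_RE.search(line)
    return {
        "pgid": head.group(1),
        "head": (int(head.group(2)), int(head.group(3))),
        "actual_last": (int(actual.group(3)), int(actual.group(4))) if actual else None,
    }


# The two load_pgs lines quoted in this comment.
after_first_restart = (
    "2013-08-19 16:18:59.394638 7fcce4066780 10 osd.0 27 load_pgs loaded "
    "pg[4.0( v 27'7 (0'0,27'7] local-les=12 n=3 ec=10 les/c 12/12 10/10/10) "
    "[1,0] r=1 lpr=27 lcod 0'0 inactive NOTIFY] log(0'0,27'7]"
)
after_second_restart = (
    "2013-08-19 16:38:42.403570 7f0b8b909780 10 osd.0 34 load_pgs loaded "
    "pg[4.0( v 34'538 (0'0,34'538] local-les=32 n=3 ec=10 les/c 32/32 31/31/31) "
    "[1,0] r=1 lpr=34 pi=10-30/4 (log bound mismatch, actual=[13'1,27'7]) "
    "lcod 0'0 inactive NOTIFY] log(0'0,34'538]"
)

first = summarize(after_first_restart)
second = summarize(after_second_restart)
print(first)
print(second)
print(second["actual_last"] == first["head"])
```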
Actions #3

Updated by Sage Weil over 10 years ago

See fatty:/home/sage/tmp/6057/ceph-osd.0.log for the full log.

Actions #4

Updated by Samuel Just over 10 years ago

wip-6057

Actions #5

Updated by Samuel Just over 10 years ago

Ran the above yaml on wip-6057, seems to work.

Actions #6

Updated by Sage Weil over 10 years ago

  • Status changed from New to Resolved

yay, tested ok for me too. merged and backported
