Bug #2061

osd: scrub mismatch

Added by Sage Weil about 8 years ago. Updated about 8 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: OSD
% Done: 0%
Regression: No
Severity: 3 - minor

Description

New one, "[ERR] 0.c scrub stat mismatch, got 6/6 objects, 2/5 clones, 13511948/13511948 bytes."

Workload was

  kernel:
    sha1: eda84b58922928516e6e62af85430b7c9705b6cf
  nuke-on-error: true
  overrides:
    ceph:
      coverage: true
      fs: xfs
      log-whitelist:
      - clocks not synchronized
      - old request
      sha1: b54bac3061666b1c781351154b1f3d78242709ec
  roles:
  - - mon.a
    - osd.0
    - osd.1
    - osd.2
  - - mds.a
    - client.0
    - osd.3
    - osd.4
    - osd.5
  tasks:
  - chef: null
  - ceph:
      log-whitelist:
      - wrongly marked me down or wrong addr
  - thrashosds: null
  - rados:
      clients:
      - client.0
      objects: 50
      op_weights:
        delete: 50
        read: 100
        snap_create: 50
        snap_remove: 50
        snap_rollback: 50
        write: 100
      ops: 4000

and "[ERR] 0.a backfill osd.1 stat mismatch on finish: num_bytes 512000 != expected 516096"

  kernel:
    sha1: eda84b58922928516e6e62af85430b7c9705b6cf
  nuke-on-error: true
  overrides:
    ceph:
      coverage: true
      fs: btrfs
      log-whitelist:
      - clocks not synchronized
      - old request
      sha1: b54bac3061666b1c781351154b1f3d78242709ec
  roles:
  - - mon.a
    - mds.a
    - osd.0
    - osd.1
  - - mon.b
    - mon.c
    - osd.2
  tasks:
  - chef: null
  - ceph:
      conf:
        osd:
          osd min pg log entries: 5
      log-whitelist:
      - wrongly marked me down or wrong addr
  - backfill: null
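The errors above report stats in a got/expected form: "2/5 clones" means the scrub counted 2 clones among the objects actually present, while the PG's recorded stats claimed 5. A minimal sketch of that comparison, with illustrative field names loosely modeled on Ceph's per-PG stat sums (not its real API), might look like:

```python
# Hypothetical sketch of the stat check a scrub performs: counts aggregated
# from the objects found on disk are compared against the stats recorded in
# the PG info, and each disagreeing field yields one "got X/Y" error.
# Field and function names here are illustrative, not Ceph's actual code.

def scrub_stat_mismatches(scrubbed: dict, recorded: dict) -> list[str]:
    """Return one error string per stat field where scrubbed != recorded."""
    errors = []
    for field, label in [("num_objects", "objects"),
                         ("num_object_clones", "clones"),
                         ("num_bytes", "bytes")]:
        got, expected = scrubbed[field], recorded[field]
        if got != expected:
            errors.append(f"stat mismatch, got {got}/{expected} {label}")
    return errors

# The 0.c error above: object and byte counts agree, the clone count does not.
scrubbed = {"num_objects": 6, "num_object_clones": 2, "num_bytes": 13511948}
recorded = {"num_objects": 6, "num_object_clones": 5, "num_bytes": 13511948}
print(scrub_stat_mismatches(scrubbed, recorded))
```

A transient disagreement like this can arise when the recorded stats are updated (e.g. on clone removal during snap trimming or backfill) out of step with the on-disk objects the scrub walks.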

History

#1 Updated by Sage Weil about 8 years ago

Oooooh, these went away and I was confused. But then I just ran the regression suite against next and hit them again:


12424: collection:thrash clusters:6-osd-2-machine.yaml fs:btrfs.yaml thrashers:default.yaml workloads:snaps-few-objects.yaml
    "2012-02-17 13:53:34.665946 osd.3 10.3.14.174:6800/9147 140 : [ERR] 0.12 scrub stat mismatch, got 14/14 objects, 11/14 clones, 25710376/25710376 bytes." in cluster log
12429: collection:thrash clusters:6-osd-2-machine.yaml fs:xfs.yaml thrashers:default.yaml workloads:snaps-few-objects.yaml
    "2012-02-17 14:17:32.195973 osd.0 10.3.14.175:6806/17153 53 : [ERR] 0.e scrub stat mismatch, got 42/42 objects, 35/37 clones, 71067132/71067132 bytes." in cluster log

I bet these were fixed by the recovery refactor.

#2 Updated by Sage Weil about 8 years ago

  • Status changed from New to Resolved

Pretty sure this was fixed by the recovery refactor... haven't hit it since then.
