Bug #2954

osd: scrub stat mismatch, got 18/19 objects, 14/15 clones, 22478527/25385282 bytes.

Added by Sage Weil over 11 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: OSD
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

1975/summary.yaml:failure_reason: '"2012-08-15 22:53:04.478924 osd.1 10.214.131.18:6804/4703 49 : [ERR]
1975/summary.yaml-  0.d scrub stat mismatch, got 18/19 objects, 14/15 clones, 22478527/25385282 bytes." 
1975/summary.yaml-  in cluster log'

Job 1976 also failed with a similar error:

1976/summary.yaml:failure_reason: '"2012-08-15 22:55:22.294216 osd.3 10.214.132.4:6803/3396 47 : [ERR]
1976/summary.yaml-  0.17 scrub stat mismatch, got 13/12 objects, 10/10 clones, 19988294/17774057 bytes." 
1976/summary.yaml-  in cluster log'
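
To compare failures like these two across jobs, it can help to pull the counts out of the message and look at the deltas. The following is a minimal sketch, not part of teuthology or Ceph: the function name `parse_mismatch` and the regular expression are assumptions based only on the message text quoted in this report. The log message does not label the two sides of each `a/b` pair, so the sketch keeps them neutral and reports first minus second.

```python
import re

# Matches the "scrub stat mismatch" text quoted in this report. The log
# message does not label the two sides of each "a/b" pair, so the group
# names stay neutral (1/2) rather than guessing which side is which.
MISMATCH_RE = re.compile(
    r"(?P<pg>[0-9a-f]+\.[0-9a-f]+) scrub stat mismatch, "
    r"got (?P<obj1>\d+)/(?P<obj2>\d+) objects, "
    r"(?P<clone1>\d+)/(?P<clone2>\d+) clones, "
    r"(?P<bytes1>\d+)/(?P<bytes2>\d+) bytes"
)


def parse_mismatch(text):
    """Return the PG id and first-minus-second deltas, or None if no match."""
    # summary.yaml folds long strings across lines, so collapse whitespace first.
    flat = " ".join(text.split())
    m = MISMATCH_RE.search(flat)
    if m is None:
        return None
    g = m.groupdict()
    return {
        "pg": g["pg"],
        "objects": int(g["obj1"]) - int(g["obj2"]),
        "clones": int(g["clone1"]) - int(g["clone2"]),
        "bytes": int(g["bytes1"]) - int(g["bytes2"]),
    }


# Message text copied from the two failure_reason entries above.
job_1975 = """[ERR]
  0.d scrub stat mismatch, got 18/19 objects, 14/15 clones, 22478527/25385282 bytes."""
job_1976 = """[ERR]
  0.17 scrub stat mismatch, got 13/12 objects, 10/10 clones, 19988294/17774057 bytes."""

print(parse_mismatch(job_1975))
print(parse_mismatch(job_1976))
```

For the two jobs above the object deltas have opposite signs (-1 in job 1975, +1 in job 1976), so the mismatch is not consistently in one direction.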

This is on the bug-2947 branch (OSD ordering fixes), which does not appear to be 100% fixed yet (I hit one ordering failure on this test), so this may be a false alarm.

ubuntu@teuthology:/a/sage-a3/1975$ cat config.yaml 
kernel: &id001
  branch: foo1
  kdb: true
nuke-on-error: true
overrides:
  ceph:
    branch: bug-2947
    conf:
      client:
        debug ms: 20
        debug objecter: 20
      osd:
        debug filestore: 20
        debug ms: 20
        debug osd: 20
    fs: xfs
    log-whitelist:
    - slow request
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana02.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCtjMpSkaJhFqFtpo5AEe3KHygR+ueaWU+gYrrRzPa8YvmR0TCapw0kz77y1Fjcfh8rkTapnevpaYgQSMrMs0Yc34kF5XtNRuQXkpTwrhS8isZJBeNSc1W5XeKjj4KB/UuzBywJq0h/0KbH1DrMy72cGISOzdiP9CMA5KUvJo0m31wv1+MPcPn/5AhZgoWPStfaZdb4TaJUrNLrws0oRXa0yQbUa6WmUBsYhHsw4K1ukJAcJwVjcgAAv1N+GnyuWLVs+pvknBO3Whv1RhjY6EDGjun1MDPw+OE3wJsJX7BRr8eZv2Avi7pRlseWeWJwgsHMJ/j0yhf+SCy1+oSPrD2b
  ubuntu@plana22.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC6ZmsmnHcSY7O9viGtUzt5WebiPbwcXo9tg5qgWsaqn46DeegKbdsQ55ajysSUVVhvQA5hW6J9IYyZ5MjtlY2G/whyHYG85tNpAUiuedaQHmzARtL3URZmy2ZxwXgYyPHW3t1n0cu6KSb4pTv9vBjcaCouV2wgrinHAISzDOVuUeXdIhC8Tr3MB0nD1Gw6Xcak680XsQw6oYP6cM+yGCZ7sF15W8TR9IJGmphMIvtd8aTuBo9yet5rIxUfzpCM9Jiv+XgH2oT9h9WfacuL1uQ2C/dHUWoPynK36Uv2J785bfw/hVVtuSGu9Lb1n4o8p8Z88Ex4i8KaOxMiQAs3zqOx
  ubuntu@plana37.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDrxOb9f5/SfItd83HOnLVyJRnfji0fbdvL+3T82akjV6J4s/nyR8Bu+rpXbyUwu2BRDoxK4pT2dBqw86meq1qbU5Q1ypWBSH41MYGd213fy0g8YibFiYVGmXFCSwtY8X2Pet9vtLDoYvtnsgNI8djy5GPkQyZFKSszJHznZvQU10NWfM6RfxxtsBKXC/aot4QXb3GIym2/EmeuTAAef6p98dd15P9l9HQkpwXZLwiDZ53IbU79CTINo5HTD/6+1XHUcjb1OUKzQMx1jU485gW6IlsR0G0jJKSv+YEu4zSxxva7gWt1AYxGo2jhNDffEGLsNurzXFf9yeYshCTAszLf
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- rados:
    clients:
    - client.0
    objects: 50
    op_weights:
      delete: 50
      read: 100
      snap_create: 50
      snap_remove: 50
      snap_rollback: 50
      write: 100
    ops: 4000

History

#1 Updated by Sage Weil over 11 years ago

Several more failures in /a/sage-a3 to look at.

#2 Updated by Tamilarasi muthamizhan over 11 years ago

logs: ubuntu@teuthology:/a/teuthology-2012-08-17_00:00:25-regression-next-testing-basic/2877

#3 Updated by Sage Weil over 11 years ago

  • Target version set to 83

Another occurrence. Messenger (ms) failure injection may have contributed.

ubuntu@teuthology:/a/sage-gfoo2/5925

#4 Updated by Sage Weil over 11 years ago

  • Target version changed from 83 to v0.52a

#5 Updated by Sage Weil over 11 years ago

  • Priority changed from High to Urgent

#6 Updated by Samuel Just over 11 years ago

  • Status changed from New to Resolved

Most likely fixed in b273c376ca6455f1e36be82cbc91606debd5fb1e.

#7 Updated by Tamilarasi muthamizhan over 11 years ago

  • Status changed from Resolved to In Progress
  • Assignee set to Samuel Just
  • Target version deleted (v0.52a)
  • Source changed from Development to Q/A

ubuntu@teuthology:/a/teuthology-2012-11-18_19:00:03-regression-master-testing-gcov/1220

2012-11-18 22:02:46.290679 osd.5 10.214.132.16:6806/5685 34 : [ERR] 0.4 scrub stat mismatch, got 22/23 objects, 8/9 clones, 35958664/38204884 bytes.
2012-11-18 22:02:46.290686 osd.5 10.214.132.16:6806/5685 35 : [ERR] 0.4 scrub 1 errors

ubuntu@teuthology:/a/teuthology-2012-11-18_19:00:03-regression-master-testing-gcov/1220$ cat summary.yaml 
ceph-sha1: 48295a188f8c12cfe8a172b598b96c88ec2e7089
client.0-kernel-sha1: 22cddde104d715600a4c218bf9224923208afe90
description: collection:rados-thrash clusters:6-osd-3-machine.yaml fs:ext4.yaml msgr-failures:few.yaml
  thrashers:default.yaml workloads:snaps-many-objects.yaml
duration: 2217.822732925415
failure_reason: '"2012-11-18 22:02:46.290679 osd.5 10.214.132.16:6806/5685 34 : [ERR]
  0.4 scrub stat mismatch, got 22/23 objects, 8/9 clones, 35958664/38204884 bytes." 
  in cluster log'
flavor: gcov
mds.a-kernel-sha1: 22cddde104d715600a4c218bf9224923208afe90
mon.a-kernel-sha1: 22cddde104d715600a4c218bf9224923208afe90
owner: scheduled_teuthology@teuthology
success: false
ubuntu@teuthology:/a/teuthology-2012-11-18_19:00:03-regression-master-testing-gcov/1220$ cat config.yaml 
kernel: &id001
  kdb: true
  sha1: 22cddde104d715600a4c218bf9224923208afe90
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
    coverage: true
    fs: ext4
    log-whitelist:
    - slow request
    sha1: 48295a188f8c12cfe8a172b598b96c88ec2e7089
  s3tests:
    branch: master
  workunit:
    sha1: 48295a188f8c12cfe8a172b598b96c88ec2e7089
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana62.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC+nI5/l38Kdw2W/qbEKrVMcnVdIxJG7hNnD7nnS3+Zx/uPiWrds26ZPrM5IY7D8Mf7sjBzUYbqsX9xGYMLLTQaeDwsZn/7RjjSg8zOS1aMP5F/AJzSQx4Nt37eLUsRHX3yA30/OQcl6sBgDjHyhSPcSuHWSnMmoy4pkDo3xpQMQMtxDG8gWq+to1hZwJbsiK9FdutEgPJg3inWM1WVc5L6NmRN2WQNEGT8HvtlBCWqX6/H/hLujQlbgyJAbeG4BriMV3gCIccJE833f/fN9KIzaMlD7qHTgWcaGk+LY84nUdNlTkNoX+L4m6WRY8/Pt9om2dOocsXyCwYLIS4heIDT
  ubuntu@plana80.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCjbUncT43IZIcpSlFXnWDaagYNdnaLTfgsN8TKi3J2QYOB1BlMpaSRBByXr5MW2ZGuWrB77USjJLdRW/feuwtAjDbwc7y5woxcxn1u9eqPh4RyQWqyCmKWvi3GpdM4/4vh+L+7X5rNzJPStTjiPXCxHK+stPGCk9pg+J+KKg/GlJ1Lwx1DeWetdForGqEVJrTSSWj8RM+Nyw/V+c+t2d6gW1SzjY2NvaDdVHfduM+8X9F9aTf3KLitnIvNQWXzfqoEDNPSwrrO2uvTDyqpFkW6HLpZDnWm+io8Y9P7XMrEhWCTlQX2SHeQv/BhaA1btYAw7qfLC9n8RbeXLlQnc5yn
  ubuntu@plana81.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCzwyoEUguCYhu0TNdXrk7aWVUvePF6F6coBynLZ73Y/7eqTKAzxnNCiwUQx3kGjK33kliZDk7g/x4FdsiwzknDuGGCXP1pZyIGVtU5wNJ6ZM29XyH2SZvyU0MNfmoMzygxHR53TGcsK3hzwSbaW1woEpJmoqgIFQJ6BJr3nc2foKl79wBQdCNumqJ7sbh26xYVI29vYsJHUTYAdmyE4QrLaLOkZKU5Q/OvUnbKQbURcs7ArxFooObu9h0ENRPK4MKuxBFfgpAYTr/rMeWfVQxvSWsuOMpOdzLaNLt5UBYVVU+wxIFdDwcHb+2Er0rEV49W9xUD6JGXnaFjrxDDocfj
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 50
      read: 100
      snap_create: 50
      snap_remove: 50
      snap_rollback: 50
      write: 100
    ops: 4000

#8 Updated by Sage Weil over 11 years ago

also:

ubuntu@teuthology:/a/sage-2012-11-18_08:58:10-regression-wip-mon-leaks-fix-testing-basic/1100$ cat config.yaml 
kernel: &id001
  kdb: true
  sha1: 22cddde104d715600a4c218bf9224923208afe90
nuke-on-error: true
overrides:
  ceph:
    conf:
      client:
        rbd cache: true
        rbd cache max dirty: 0
      global:
        ms inject socket failures: 5000
    fs: xfs
    log-whitelist:
    - slow request
    sha1: 45c652d7721ed088b13359d09962cb61400ddff4
  s3tests:
    branch: master
  workunit:
    sha1: 45c652d7721ed088b13359d09962cb61400ddff4
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana26.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDx0US96hot7gygZ69W4nxJQ9myYnn3I22YtOaSPe+yFWOPJVQUuOST+aw5K6JDcjdO2Gq0aS6s01mgoWpZlO/FVDKss7vZ2KjMp3uPkGMpDZarNbR3QTe5YZYrl7Wfw4pMu4jh92hCWJEzy5nH0H3X2YJhOd5BdOYz0P97qsMSPQGxhlvDBYBhDl9MLgsS3lKm/Js/OPLO+Uf3/SZceCjUqO2m3WsrJSiQJKh8XUWUu3z+6C1Wg6TXSSlA/jdVCiokDg7WYwPN9zMwzzGkGv+GUGHKMZaPGRZb9LQJLTBf/OjwRSgclAVdDc3vnZeYAS5+sDnt2grnJnlBd1rBUj3n
  ubuntu@plana46.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCrD6s9otJ5xCNH4nyv0iJu6AoqmlNTFd8D0X9RfFBnmOMrMBWU9kwsFzPIOsuJGbSYbA8LCtjWUwaWoXmbEFtTMitxaDXp47gbVNXknHq7TGZHkWWOwKKu+tlSQBpCVzO/rzBbvJ9fcG7tewq5XcIHz0IUXsUFuEuXR1HaTUJKic2twBpaeAGNvdd6IZ9Sz9TMkfiRV/aVdcHJ/yF8bsXi3pfRPR3puMK/Nyfq5Hz/aabQo1TSyK2o0weoWV7D8vD6S8f3D7p5/5ScBhL3zUcP85SsV47W+/hTFbU8kN1Grlv2sx0fVMB/TUB/UNVdsHKGn5Nv6zb/qMqBEx9nSeZ9
  ubuntu@plana47.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCUMS+/Rfo92n0pY5cDrv+M9lss9i6+Zum4aa4aE54KsOcKkl+6yooZcZL8bllGLVL1W7BkaBOJ59dQwTVIo/UAgiKyA4J5IVwBPjwNNp4/mXzKJtKQPj0UrTCKsQrKasWPC+FVRzqJRK70cgC5D40znuopmfmENoPwCniOJALFCw3q8XLkcq1SH0jzDXJdsrnTVGxwRHYq9cF9J7fr6XZQXuAk7XO3jG1eqlF8xljmkvI0Ftux50TkOsDzpkscD5jHkxiFj/gkO2KR5GNbybdnxllHBAYuv2hoxrsW2oyIxbeforwZFV0DcDhRReRTx8BhXZ0o5erZgPgzS+ZbfWol
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- rbd_fsx:
    clients:
    - client.0
    ops: 2000
ubuntu@teuthology:/a/sage-2012-11-18_08:58:10-regression-wip-mon-leaks-fix-testing-basic/1100$ cat summary.yaml 
ceph-sha1: 45c652d7721ed088b13359d09962cb61400ddff4
client.0-kernel-sha1: 22cddde104d715600a4c218bf9224923208afe90
description: collection:rbd-thrash clusters:6-osd-3-machine.yaml fs:xfs.yaml msgr-failures:few.yaml
  thrashers:default.yaml workloads:rbd_fsx_cache_writethrough.yaml
duration: 2116.9060339927673
failure_reason: '"2012-11-18 12:01:23.839225 osd.0 10.214.131.14:6800/11897 33 : [ERR]
  3.7 osd.2: soid 3ff93a97/rbd_header.1016180d79cd/head//3 attr value mismatch _" 
  in cluster log'
flavor: basic
mds.a-kernel-sha1: 22cddde104d715600a4c218bf9224923208afe90
mon.a-kernel-sha1: 22cddde104d715600a4c218bf9224923208afe90
owner: scheduled_sage@nine
success: false

#9 Updated by Tamilarasi muthamizhan over 11 years ago

Logs: ubuntu@teuthology:/a/sage-2012-11-25_20:49:20-regression-next-master-basic/4280

#10 Updated by Sage Weil over 11 years ago

  • Status changed from In Progress to 7

#11 Updated by Samuel Just over 11 years ago

This may have been fixed by 5c8cbd28207195b094799a7bdbad0019669682a8.

#12 Updated by Sage Weil over 11 years ago

  • Status changed from 7 to Resolved

#13 Updated by Tamilarasi muthamizhan over 11 years ago

  • Status changed from Resolved to In Progress

recent log: ubuntu@teuthology:/a/teuthology-2012-12-13_19:00:03-regression-next-testing-basic/13809

ubuntu@teuthology:/a/teuthology-2012-12-13_19:00:03-regression-next-testing-basic/13809$ cat config.yaml 
kernel: &id001
  kdb: true
  sha1: 2978257c56935878f8a756c6cb169b569e99bb91
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
    fs: xfs
    log-whitelist:
    - slow request
    sha1: 8cf367cb79046b08cc593b14f77526eef2758ee6
  s3tests:
    branch: next
  workunit:
    sha1: 8cf367cb79046b08cc593b14f77526eef2758ee6
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana43.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC/sM+tk86Dl9EeTk49zN9mfwYFfwJcYKVIdEnfFpNie9+cM+X1SSorAP+J08fzKa5P4S44izIiAG8bVHzWpWg667ks1FUBNXaUuJQrd2gUU3VFCBgx/sZeWrc7ShUtCTKYImrfIvXEemAc65bQKga10StnfOtZy+NgfwOJb7S05RBnGzfzqtAU7Ny+SEjZcu+80/uIOHWlPwxU4/nkOUEVKzGg77a9e5vrg49MuKXRr8aF03+gTEc2WBKXkpMCHlIU0tB96QN+vdCHjz9gIcZ7aq+3SN6KRSVEoWd2CwYwOTpRHrmYpFQG7zkZkJxRDQ17QXNfg4v/CRfQzEs0MpJt
  ubuntu@plana73.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCxoJnvRI1V0OJuQI9SosOedC7mj9O627LjoQWPKilJiBbHduPe1byBaKrgwTeEghl43VNf+EBs1+MwVH7zlDolnwN4tAlW9bRpC2SzURJfhZskp2CSQY3l8ca7a5f0J3hdOhx47oSSapN7O2cqmPzwlL/+MrFKGi+ITT613nUtzCjduZRPdhjyqZ0cQWeb0p1neDw5hbDBKd+HAH+ek/E6DK2PaqN6YAtmIgP76q0fQ85Omd0oDlmGXpKe3jlxlPT0W/5KD1+mpobPsh/EF2qar7IG/WqHHJ6NZAcXbdZ4KiMf9erP+Pk4KkD5SJ+e3GF7OEOwXtahKIIR1An4P2GD
  ubuntu@plana75.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC/sBKIbaWlkUEbStD0wYVUj2aEuiP8WB0B4h4oyzOJaWaKSTPAK2hzAxEDVOkG1JhpR2JrfXitDtA7MW48NvP77Ov/EvOnTHBeTE7mvWL0D2d4/YUoqhF+RLojHgFNOE0FsVEc/2rhARYX9/4VL5YQ1kaE4dKeRqLxn/eA6BoW5+NDbdQ1Bt6qWNSTXYC2qs09do6wUXHbB+KE1Obay4QTGf77QA+ueVnAnKmYym5c5kGMqb7DD+I/OZyUcOWTCQ4sDpo2nh0GpHATqAAWXeFMSpJ0sVQmR5ByTpKsoRV3QxmxlNHBJVDrBoGbw7O0z8AisuwOfqzrOO5M3Q+16Gen
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- radosbench:
    clients:
    - client.0
    time: 1800
ubuntu@teuthology:/a/teuthology-2012-12-13_19:00:03-regression-next-testing-basic/13809$ cat summary.yaml 
ceph-sha1: 8cf367cb79046b08cc593b14f77526eef2758ee6
client.0-kernel-sha1: 2978257c56935878f8a756c6cb169b569e99bb91
description: collection:rados-thrash clusters:6-osd-3-machine.yaml fs:xfs.yaml msgr-failures:few.yaml
  thrashers:default.yaml workloads:radosbench.yaml
duration: 4775.77400302887
failure_reason: '"2012-12-14 04:12:49.462277 osd.0 10.214.132.35:6800/16930 472 :
  [ERR] 0.c scrub stat mismatch, got 42/44 objects, 0/0 clones, 176160768/184549376
  bytes." in cluster log'
flavor: basic
mds.a-kernel-sha1: 2978257c56935878f8a756c6cb169b569e99bb91
mon.a-kernel-sha1: 2978257c56935878f8a756c6cb169b569e99bb91
owner: scheduled_teuthology@teuthology
success: false
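
The numbers in this radosbench failure lend themselves to a quick arithmetic check. The sketch below assumes every object in the PG has the same size (an assumption made here for illustration, not something stated in the report) and tests whether the byte gap is a whole multiple of that size. The helper name `per_object_size` is hypothetical, and "first"/"second" follow the order in the log message, which does not label the two sides.

```python
# Values copied from the failure_reason above.
first_objects, second_objects = 42, 44
first_bytes, second_bytes = 176160768, 184549376


def per_object_size(total_bytes, n_objects):
    """Return bytes per object if the total divides evenly, else None."""
    if n_objects == 0 or total_bytes % n_objects:
        return None
    return total_bytes // n_objects


size_first = per_object_size(first_bytes, first_objects)
size_second = per_object_size(second_bytes, second_objects)
object_gap = second_objects - first_objects
byte_gap = second_bytes - first_bytes

print(size_first, size_second)   # 4194304 4194304
print(object_gap, byte_gap)      # 2 8388608
print(byte_gap == object_gap * size_first)  # True
```

Both sides work out to exactly 4194304 bytes (4 MiB) per object, and the byte gap equals two such objects. Under the uniform-size assumption, the byte discrepancy here is fully accounted for by the two-object discrepancy rather than being a separate accounting error.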

#14 Updated by Sage Weil over 11 years ago

  • Status changed from In Progress to 7

#15 Updated by Sage Weil over 11 years ago

  • Status changed from 7 to Resolved
