Bug #2080

osd: scrub on disk size does not match object info size

Added by Sage Weil about 12 years ago. Updated about 12 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: Q/A

Description

2012-02-18 00:35:53.295447 osd.1 10.3.14.135:6800/5984 22 : [ERR] scrub 0.2 8ea1d16a/100000000f3.00000002/head on disk size (632594) does not match object info size (8386523)
2012-02-18 00:35:53.295609 osd.1 10.3.14.135:6800/5984 23 : [ERR] 0.2 scrub stat mismatch, got 76/76 objects, 0/0 clones, 232766220/240520149 bytes.
2012-02-18 00:35:53.295627 osd.1 10.3.14.135:6800/5984 24 : [ERR] 0.2 scrub 2 errors
2012-02-18 00:36:23.490098 osd.1 10.3.14.135:6800/5984 26 : [ERR] scrub 0.6 c3f8966e/10000000048.00000001/head on disk size (153512) does not match object info size (737337)
2012-02-18 00:36:23.490272 osd.1 10.3.14.135:6800/5984 27 : [ERR] 0.6 scrub stat mismatch, got 73/73 objects, 0/0 clones, 206814172/207397997 bytes.
2012-02-18 00:36:23.490291 osd.1 10.3.14.135:6800/5984 28 : [ERR] 0.6 scrub 2 errors

ubuntu@teuthology:/a/nightly_coverage_2012-02-18-a/12448
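Conceptually, scrub is comparing the object's size as recorded on disk (via stat) against the size stored in the object's metadata. A minimal illustrative sketch of that check (Python, with hypothetical names; Ceph's actual scrub code lives in the C++ OSD):

```python
# Illustrative sketch of the scrub-time size check; the function name
# and structure are hypothetical, not Ceph's real implementation.

def scrub_object(on_disk_size, object_info_size, oid):
    """Return a list of scrub errors for one object."""
    errors = []
    if on_disk_size != object_info_size:
        errors.append(
            f"scrub {oid} on disk size ({on_disk_size}) does not "
            f"match object info size ({object_info_size})")
    return errors

# The first object flagged in the log above:
errs = scrub_object(632594, 8386523, "8ea1d16a/100000000f3.00000002/head")
```

A matching size yields no errors; a mismatch produces one [ERR] line per object, plus the per-PG stat-mismatch summary.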

  kernel:
    sha1: 07fd42934a53b8486709f7f866346a9e4bb6d5ce
  nuke-on-error: true
  overrides:
    ceph:
      conf:
        osd:
          osd op complaint time: 120
      coverage: true
      fs: btrfs
      log-whitelist:
      - clocks not synchronized
      - old request
      sha1: c1db9009c2cde9dc7ab8857b0d28a1b6d931e98a
  roles:
  - - mon.a
    - mon.c
    - osd.0
  - - mon.b
    - mds.a
    - osd.1
  - - client.0
  tasks:
  - chef: null
  - ceph: null
  - kclient: null
  - workunit:
      all:
      - suites/fsstress.sh

Actions #1

Updated by Sage Weil about 12 years ago


ubuntu@teuthology:/a/nightly_coverage_2012-02-19-b/12708$ cat config.yaml 
kernel: &id001
  sha1: cc050a5dab7beb893d72dbd41e8aa5c07f1d0427
nuke-on-error: true
overrides:
  ceph:
    conf:
      osd:
        osd op complaint time: 120
    coverage: true
    fs: btrfs
    log-whitelist:
    - clocks not synchronized
    - old request
    sha1: ff5178c86a043a26ea8523dd5ac24462a8472fd4
roles:
- - mon.a
  - mon.c
  - osd.0
- - mon.b
  - mds.a
  - osd.1
- - client.0
targets:
  ubuntu@sepia47.ceph.dreamhost.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPS2bMq42vKt+yPhbfxq6UiMbfaYts2bX6+VCjymOITUyHUBm6UGU0jFf0hQiCV7FNW2j1DPNSsExkVDoZe7a87gKjOIIO2TE3hCXWezdmTW29NE5v04zYR0H7r3jt9SDA/WN+4n/TIZh4ENuNuhZL5xO/ni3OR7hVRaMM9POZ8crUenXYpovASq5FXzNHrizA7Pyt/QADqCG6RzWAP7yFhl6AaeqQCksPHCBIP8f6PVrMpqJBSbFODVn8uKOTiKi2z0RleUMGtf4vk03eizie7InRUddnE0K7vrcvy0N4Vv7BWpDGffjvO8E5ocLKMkG0zl/Ckx5OKkBxYAoUr9OB
  ubuntu@sepia48.ceph.dreamhost.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCvjFqmoGruY92UzwTdgFqMchSs1Ted0MXgO1SZqYhg6ib/+NSIXysfRQDgrUI5SUffxkLEDYQ6hDxgmTPYmGWcGXeKuCob3e+Tc39dHfdO8L3kPQLqCQ9zQlmFHSvbWzukojnfbm/EMbnnylBddfZxG8Wp0upXHj9AwQgpfzm5U6G/R8IXmpxp/3hRHNxTf8fIukoYCsauF2dqKhcUdhorvS3DrNH6sH3ibjzpcGtuOx4Lm6bN40mCrkjTh4gLhO1weNVJmgEvk8pIQgxdjquIHgFXG+wfkjEp1MCMRbjVzvUgm5LtXg5CmwIs7Rzg+MwexbtE1JLTANa5YbXZKykL
  ubuntu@sepia8.ceph.dreamhost.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7v/n2qEF1rouwm/h0O7siO2+P5uPiEHVeCKdCYSubxFjCBfJKn8qUSsPP4PaeIPvaaWUbe3kxy87LXnhUNmojio4b3bGRnGksPsDne6Zc/olR0Ivzn0wBVU0TUs0VU8Rj+MS1HSCKNm7gGQhN1BqKSCNPNsfH0Dx15dKZfQyRVBpaFMuUAhvKaG2p2NudpX0xVLm/Cwt6u8GcgPMyykNLGDj/q6qVm71InykGBnwaMW8AEHi4S7zKUuOBxVeZLhB9BhrvP06GRc3kCjHAxIX+6zhTL+vLE11uVJbovV0pnjbgw6bcyzJfhcQWyhZTEaTXnpoD7mvvpbp2dPJ+DZHf
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- chef: null
- ceph: null
- kclient: null
- workunit:
    all:
    - suites/fsstress.sh

2012-02-19 14:08:12.268187 osd.1 10.3.14.135:6800/13953 22 : [ERR] scrub 0.2 baf62b7a/10000000196.00000002/head on disk size (67716) does not match object info size (3971586)
2012-02-19 14:08:12.268302 osd.1 10.3.14.135:6800/13953 23 : [ERR] 0.2 scrub stat mismatch, got 55/55 objects, 0/0 clones, 148066528/151970398 bytes.
2012-02-19 14:08:12.268320 osd.1 10.3.14.135:6800/13953 24 : [ERR] 0.2 scrub 2 errors
2012-02-19 14:08:13.378401 osd.0 10.3.14.175:6800/31342 18 : [INF] 0.3 scrub ok
2012-02-19 14:08:15.059119 osd.1 10.3.14.135:6800/13953 25 : [INF] 0.4 scrub ok
2012-02-19 14:08:17.678341 osd.1 10.3.14.135:6800/13953 26 : [ERR] scrub 0.6 7ca25366/1000000026b.00000001/head on disk size (313526) does not match object info size (3850701)
2012-02-19 14:08:17.678412 osd.1 10.3.14.135:6800/13953 27 : [ERR] 0.6 scrub stat mismatch, got 74/74 objects, 0/0 clones, 190300185/193837360 bytes.
2012-02-19 14:08:17.678430 osd.1 10.3.14.135:6800/13953 28 : [ERR] 0.6 scrub 2 errors
2012-02-19 14:08:29.963729 osd.1 10.3.14.135:6800/13953 29 : [INF] 0.7 scrub ok
2012-02-19 14:08:44.832317 osd.1 10.3.14.135:6800/13953 30 : [INF] 0.0p1 scrub ok
2012-02-19 14:08:40.387294 osd.0 10.3.14.175:6800/31342 19 : [ERR] scrub 0.5 3e7d6035/1000000007d.00000001/head on disk size (977444) does not match object info size (3449808)
2012-02-19 14:08:40.387326 osd.0 10.3.14.175:6800/31342 20 : [ERR] 0.5 scrub stat mismatch, got 63/63 objects, 0/0 clones, 195084488/197556852 bytes.
Actions #2

Updated by Sage Weil about 12 years ago

ubuntu@teuthology:/a/master-2012-02-19_19:50:05/12884

Actions #3

Updated by Sage Weil about 12 years ago

Reproduced with log: metropolis:~sage/bug-2080

Actions #4

Updated by Sage Weil about 12 years ago

  • Status changed from New to 7
Actions #5

Updated by Sage Weil about 12 years ago

  • Status changed from 7 to Resolved
Actions #6

Updated by Sage Weil about 12 years ago

  • Status changed from Resolved to 12
  • Target version deleted (v0.43)
  • Source set to Q/A

Hit this again:

ubuntu@teuthology:/a/nightly_coverage_2012-02-29-a/14342$ grep ERR ceph.log 
2012-02-29 10:51:01.787384 osd.0 10.3.14.153:6800/17832 1 : [ERR] scrub 0.0 57ab7610/10000000012.00000001/head on disk size (703432) does not match object info size (3814197)
2012-02-29 10:51:01.787406 osd.0 10.3.14.153:6800/17832 2 : [ERR] 0.0 scrub stat mismatch, got 9/9 objects, 0/0 clones, 22528608/25639373 bytes.
2012-02-29 10:51:01.787430 osd.0 10.3.14.153:6800/17832 3 : [ERR] 0.0 scrub 2 errors
2012-02-29 10:51:09.122947 osd.0 10.3.14.153:6800/17832 5 : [ERR] scrub 0.3 efa1f9d3/1000000000b.00000001/head on disk size (117755) does not match object info size (2447690)
2012-02-29 10:51:09.122989 osd.0 10.3.14.153:6800/17832 6 : [ERR] 0.3 scrub stat mismatch, got 15/15 objects, 0/0 clones, 32470639/34800574 bytes.
2012-02-29 10:51:09.123007 osd.0 10.3.14.153:6800/17832 7 : [ERR] 0.3 scrub 2 errors
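An observation about these logs (not stated in the ticket): in each PG the byte-count discrepancy in the stat-mismatch summary equals the size delta (object info size minus on-disk size) of the single flagged object, which can be checked directly:

```python
# Check that the stat-mismatch byte delta equals the flagged object's
# size delta for both PGs in the 2012-02-29 log above. This is an
# observation about the log numbers, not a claim about Ceph internals.

cases = [
    # (on_disk, object_info, bytes_got, bytes_expected) for pg 0.0, 0.3
    (703432, 3814197, 22528608, 25639373),
    (117755, 2447690, 32470639, 34800574),
]

deltas_match = all(
    info - disk == expected - got
    for disk, info, got, expected in cases
)
print(deltas_match)  # True for both PGs
```

The same relationship holds for the occurrences in the description, consistent with exactly one short object per PG accounting for the whole byte discrepancy.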

kernel: &id001
  sha1: cc050a5dab7beb893d72dbd41e8aa5c07f1d0427
nuke-on-error: true
overrides:
  ceph:
    conf:
      osd:
        osd op complaint time: 120
    coverage: true
    fs: btrfs
    log-whitelist:
    - clocks not synchronized
    - old request
    sha1: fe94c0414e43d87095272544d021f5ee52e1aa5c
roles:
- - mon.a
  - mon.c
  - osd.0
- - mon.b
  - mds.a
  - osd.1
- - client.0
targets:
  ubuntu@sepia19.ceph.dreamhost.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7Dw7FUqWo1A6Se/AYEnyytTHxyfe5z3g/LTm0mrLqDJU0VjfNgpRwtbSXAxzAFqci/WjZLg1TYSVL81mnXqjClcLqQmR6TpYVAKUdL+ob0j11lNtfjHeO4h0YFE2FzxWIUU3fzPw++LbwoU3B9SuTICHtpCwhv/begOTwc0T4mEkF4QxBunsgV6slBajb7TgtCcclcESuF8P4AUZVozm0Mpzcj+OtCk2thLMEqhk4287JYcXkwfQrsJHSWj5mjm0gVdig5Il81ghGll+j+TFbPuOzurWstH7F2GXfQ98tNqyru1FV5GQ3LOzICofzkqERqPFFgmvw74RMDA1iV1R5
  ubuntu@sepia25.ceph.dreamhost.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDp0ZfvGMkEvpyrHDzChOjAfAyrzILwMZU9rOYCkGAbNvYkjbdnBi7IHk8wNO7uZjSnPeFI1GD7kJGUqwhrRJX+/Kbep9U/OFgvtiKKyK09pnyqs/4Ovivudv4XtgOgPd9jOIUTKRpHHkNH339TihCjp5Hvqvd+YmjKM5noEe/iMuSxzjdYDVNos0O5Y1qTG+7TXrCHRhZ6quVcieabFtPo1WV1woc84fG7yKoiVtAoGWYnZFS6ao+ZQy7Kns5JXB7zJ5znb8oCC28G5/9g13Kk3tHl6fT96s7YrBjYPIvUNCu06Wg8wJJV5jRGrlHuZnc+J1Cw+x529k9w0RZYEmXt
  ubuntu@sepia26.ceph.dreamhost.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC3m5OkzR6abEizkIEVzDfeeYiQeNFGxEMKFyQ7o8RGAXVzx9iGhGP5l+7QAkbIYbQ04WGmIhJAt4ElroLHiUgLGt/ES/9zx0wMOt57emNW4twVYl98G/Ug4tC8dZMeSsSUzUgQR9QWR7UTBTx1yjr8NTy5/sXDkeIujTckQPo6wZUZgJw7K+a22AygFFLgqXB1NmsM8ZgzyBj22HjI4kGpbDsabByBnfqrG1OJFAImEtbfoTOPq4zKGNMpp1oWPbHbyC7uv4QU9eOX3HR9NVaUVTNMSFrrvTl33umtbHSai12/BVXX0fuvGUgG8/vkZ1MtO5kkvHYWeeEZ0zRtY45j
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- ceph: null
- kclient: null
- workunit:
    all:
    - suites/fsstress.sh


It's a consistent workload (kclient fsstress) that triggers it!

Actions #7

Updated by Sage Weil about 12 years ago

  • Status changed from 12 to 7
  • Assignee set to Sage Weil

wip-2080

Actions #8

Updated by Sage Weil about 12 years ago

  • Status changed from 7 to Resolved