Bug #1928

closed

osd: scrub stat mismatch after fsstress on kernel client

Added by Josh Durgin over 12 years ago. Updated over 12 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
OSD
Target version:
-
% Done:

0%


Description

6952: collection:basic btrfs:with-btrfs.yaml clusters:fixed-3.yaml tasks:kclient_workunit_suites_fsstress.yaml
    "2012-01-11 14:10:40.988122 osd.1 10.3.14.183:6800/6839 4 : [ERR] 0.7 scrub stat mismatch, got 2/2 objects, 0/0 clones, 4719427/8388608 bytes, 4609/8192 kb." in cluster log
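For comparing these cluster-log entries across runs, the pairs in the message can be pulled apart mechanically. This is a minimal sketch, not part of Ceph or teuthology: the helper names `parse_scrub_mismatch` and `mismatched_fields` are hypothetical, and the regular expression assumes the message layout seen in this report. Each pair is kept in log order; the report does not say which number is the scrubbed count and which is the recorded statistic.

```python
import re

# Matches the numeric part of a "scrub stat mismatch" entry as it appears
# in this report. Negative values are allowed in every position.
MISMATCH_RE = re.compile(
    r"scrub stat mismatch, got "
    r"(-?\d+)/(-?\d+) objects, "
    r"(-?\d+)/(-?\d+) clones, "
    r"(-?\d+)/(-?\d+) bytes, "
    r"(-?\d+)/(-?\d+) kb"
)

FIELDS = ("objects", "clones", "bytes", "kb")


def parse_scrub_mismatch(line):
    """Return {field: (left, right)} in log order, or None if no match."""
    m = MISMATCH_RE.search(line)
    if m is None:
        return None
    values = [int(g) for g in m.groups()]
    return {name: (values[2 * i], values[2 * i + 1])
            for i, name in enumerate(FIELDS)}


def mismatched_fields(parsed):
    """List the fields whose two numbers disagree."""
    return [name for name, (left, right) in parsed.items() if left != right]


line = ('2012-01-11 14:10:40.988122 osd.1 10.3.14.183:6800/6839 4 : [ERR] '
        '0.7 scrub stat mismatch, got 2/2 objects, 0/0 clones, '
        '4719427/8388608 bytes, 4609/8192 kb.')
parsed = parse_scrub_mismatch(line)
print(mismatched_fields(parsed))
```

For the entry above, only the byte and kb pairs disagree; the object and clone counts match.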
Actions #1

Updated by Samuel Just over 12 years ago

One possibility: in the CEPH_OSD_OP_WRITE case of ReplicatedPG::do_op we pass op.extent.offset and op.extent.length to write_update_size_and_usage unconditionally. However, if op.extent.length is 0 (e.g. a zero-length write at offset 4 MB), we touch the file rather than writing to it. In that case oi.size may no longer match the on-disk size. I am checking whether this workload could have issued such a write.
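To make the suspected mechanism concrete, here is a toy model in Python. It is not the OSD code. It assumes that the size update extends the tracked size to offset + length whenever that is larger, which is a guess at what write_update_size_and_usage does, and that a zero-length write only touches the file. The class `ObjectModel` and both method names are hypothetical, and the guarded variant is only one way to avoid the drift, not necessarily what the actual fix does.

```python
class ObjectModel:
    """Toy object: tracked_size stands in for oi.size, disk_size for the file."""

    def __init__(self):
        self.tracked_size = 0
        self.disk_size = 0

    def _apply_to_disk(self, offset, length):
        # Assumption: a zero-length write only touches the file,
        # so the on-disk size does not change.
        if length == 0:
            return
        self.disk_size = max(self.disk_size, offset + length)

    def write_unconditional(self, offset, length):
        # Size accounting runs even when nothing was written.
        self._apply_to_disk(offset, length)
        self.tracked_size = max(self.tracked_size, offset + length)

    def write_guarded(self, offset, length):
        # Size accounting is skipped for zero-length writes.
        self._apply_to_disk(offset, length)
        if length > 0:
            self.tracked_size = max(self.tracked_size, offset + length)


FOUR_MB = 4 * 1024 * 1024

buggy = ObjectModel()
buggy.write_unconditional(FOUR_MB, 0)
print(buggy.tracked_size, buggy.disk_size)

guarded = ObjectModel()
guarded.write_guarded(FOUR_MB, 0)
print(guarded.tracked_size, guarded.disk_size)
```

Under these assumptions, the unconditional path leaves the tracked size at 4194304 while the on-disk size stays 0, which is the kind of disagreement a scrub would report. The guarded path keeps the two equal.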

Actions #2

Updated by Samuel Just over 12 years ago

It seems that fsstress will do that: 2012-01-11T14:30:04.867 INFO:teuthology.task.workunit.client.0.out:8/17: dwrite f7 [4194304,4194304] 0

Actions #3

Updated by Sage Weil over 12 years ago

Samuel Just wrote:

It seems that fsstress will do that: 2012-01-11T14:30:04.867 INFO:teuthology.task.workunit.client.0.out:8/17: dwrite f7 [4194304,4194304] 0

I'm surprised a 0 byte write made it over the wire...

Actions #4

Updated by Samuel Just over 12 years ago

  • Status changed from New to Closed

Fixed in commit 4815cafddf46e968501ac3b96e593c5e8db6218b

Actions #5

Updated by Josh Durgin over 12 years ago

  • Status changed from Closed to 12

The fixes for this appear to have introduced a new bug: three runs so far today failed with a similar scrub stat mismatch:

7410 FAIL scheduled_teuthology@teuthology collection:basic btrfs:with-btrfs.yaml clusters:fixed-3.yaml tasks:kclient_workunit_kernel_untar_build.yaml
    "2012-01-13 13:08:26.814703 osd.1 10.3.14.201:6800/4411 4 : [ERR] 0.7 scrub stat mismatch, got 1/1 objects, 0/0 clones, 331776/-1 bytes, 324/0 kb." in cluster log
7412 FAIL scheduled_teuthology@teuthology collection:basic btrfs:with-btrfs.yaml clusters:fixed-3.yaml tasks:kclient_workunit_suites_ffsb.yaml
    "2012-01-13 13:37:52.571962 osd.0 10.3.14.151:6800/13888 16 : [ERR] 0.0 scrub stat mismatch, got 427/427 objects, 0/0 clones, 1749094674/-427 bytes, 1708105/0 kb." in cluster log
7414 FAIL scheduled_teuthology@teuthology collection:basic btrfs:with-btrfs.yaml clusters:fixed-3.yaml tasks:kclient_workunit_suites_iozone.yaml
    "2012-01-13 13:30:20.602537 osd.0 10.3.14.132:6800/9539 4 : [ERR] 0.5 scrub stat mismatch, got 25/25 objects, 0/0 clones, 104857600/-25 bytes, 102400/0 kb." in cluster log

Actions #6

Updated by Sage Weil over 12 years ago

  • Status changed from 12 to Resolved