Bug #16913

closed

multimds: OSD deep scrub failure

Added by Patrick Donnelly over 7 years ago. Updated almost 7 years ago.

Status:
Can't reproduce
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

http://pulpito.ceph.com/pdonnell-2016-07-29_08:28:00-multimds-master---basic-mira/339923/

Failure: "2016-07-30 03:35:51.759582 osd.3 172.21.9.130:6800/31765 2 : cluster [ERR] deep-scrub 2.8 2:1e20cbdb:::10000000016.00000000:head on disk size (785731) does not match object info size (2142056) adjusted for ondisk to (2142056)" in cluster log
1 jobs: ['339923']
suites: ['clusters/9-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/yes.yaml', 'mount/cfuse.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/suites_fsstress.yaml}']

http://pulpito.ceph.com/pdonnell-2016-07-21_13:20:27-multimds-master---basic-mira/327535/

Failure: "2016-07-24 03:54:38.649923 osd.1 172.21.4.128:6804/13269 4 : cluster [ERR] deep-scrub 2.5 2:a2f9f58c:::20000000005.00000001:head on disk size (0) does not match object info size (2058217) adjusted for ondisk to (2058217)" in cluster log
1 jobs: ['327535']
suites: ['clusters/9-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/cfuse.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/suites_fsstress.yaml}']
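For context, the reported errors come from deep scrub, which reads each replica's object data and compares the observed on-disk size against the size recorded in the object's metadata, flagging any mismatch. A minimal illustrative sketch of that size check (not Ceph's actual implementation; the function and parameter names here are hypothetical):

```python
import os

def deep_scrub_size_check(path: str, recorded_size: int) -> list:
    """Compare an object's on-disk size with the size recorded in its
    object info, mimicking the consistency check behind the errors above.

    Returns a list of scrub error strings (empty if consistent).
    """
    on_disk = os.path.getsize(path)  # size actually stored on disk
    if on_disk != recorded_size:
        return [
            f"on disk size ({on_disk}) does not match "
            f"object info size ({recorded_size})"
        ]
    return []
```

In the second failure above, the on-disk size was 0 while the object info recorded 2058217 bytes, which is the shape of mismatch this check would report.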

Actions #1

Updated by John Spray over 7 years ago

  • Project changed from CephFS to Ceph
  • Category deleted (93)
  • Assignee set to Samuel Just

Sam, any known bugs like this in the OSD on master?

Actions #2

Updated by Samuel Just over 7 years ago

  • Assignee deleted (Samuel Just)
  • Priority changed from Normal to High

Not that I know of; how often is it happening?

Actions #3

Updated by Samuel Just over 7 years ago

  • Priority changed from High to Urgent
Actions #4

Updated by Patrick Donnelly over 7 years ago

I've only seen this happen in this run of the new multimds suite [1]. I'm in the process of making a bunch of changes to that PR so I won't be able to do another run of multimds for a week or two. Then maybe we'll know if this can be consistently reproduced.

[1] https://github.com/ceph/ceph-qa-suite/pull/1114

Actions #5

Updated by Sage Weil over 7 years ago

  • Status changed from New to Need More Info
Actions #6

Updated by Samuel Just over 7 years ago

  • Priority changed from Urgent to High

Demoting to High until it's reproducible.

Actions #7

Updated by Josh Durgin almost 7 years ago

Patrick, has this occurred again?

Actions #8

Updated by Patrick Donnelly almost 7 years ago

No, I haven't seen it since.

Actions #9

Updated by Josh Durgin almost 7 years ago

  • Status changed from Need More Info to Can't reproduce

Please reopen if it happens again.
