Bug #16913

closed

multimds: OSD deep scrub failure

Added by Patrick Donnelly over 7 years ago. Updated almost 7 years ago.

Status:
Can't reproduce
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

http://pulpito.ceph.com/pdonnell-2016-07-29_08:28:00-multimds-master---basic-mira/339923/

Failure: "2016-07-30 03:35:51.759582 osd.3 172.21.9.130:6800/31765 2 : cluster [ERR] deep-scrub 2.8 2:1e20cbdb:::10000000016.00000000:head on disk size (785731) does not match object info size (2142056) adjusted for ondisk to (2142056)" in cluster log
1 jobs: ['339923']
suites: ['clusters/9-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/yes.yaml', 'mount/cfuse.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/suites_fsstress.yaml}']

http://pulpito.ceph.com/pdonnell-2016-07-21_13:20:27-multimds-master---basic-mira/327535/

Failure: "2016-07-24 03:54:38.649923 osd.1 172.21.4.128:6804/13269 4 : cluster [ERR] deep-scrub 2.5 2:a2f9f58c:::20000000005.00000001:head on disk size (0) does not match object info size (2058217) adjusted for ondisk to (2058217)" in cluster log
1 jobs: ['327535']
suites: ['clusters/9-mds.yaml', 'debug/mds_client.yaml', 'fs/btrfs.yaml', 'inline/no.yaml', 'mount/cfuse.yaml', 'multimds/basic/{ceph/base.yaml', 'overrides/whitelist_wrongly_marked_down.yaml', 'tasks/suites_fsstress.yaml}']
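For context, the reported errors come from deep scrub, which reads each replica's object data and compares the observed on-disk size against the size recorded in the object's metadata, flagging any mismatch. A minimal illustrative sketch of that size check (not Ceph's actual implementation; the function and parameter names here are hypothetical):

```python
import os

def deep_scrub_size_check(path: str, recorded_size: int) -> list:
    """Compare an object's on-disk size with the size recorded in its
    object info, mimicking the consistency check behind the errors above.

    Returns a list of scrub error strings (empty if consistent).
    """
    on_disk = os.path.getsize(path)  # size actually stored on disk
    if on_disk != recorded_size:
        return [
            f"on disk size ({on_disk}) does not match "
            f"object info size ({recorded_size})"
        ]
    return []
```

In the second failure above, the on-disk size was 0 while the object info recorded 2058217 bytes, which is the shape of mismatch this check would report.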

Actions #1

Updated by John Spray over 7 years ago

  • Project changed from CephFS to Ceph
  • Category deleted (93)
  • Assignee set to Samuel Just

Sam, any known bugs like this in the OSD on master?

Actions #2

Updated by Samuel Just over 7 years ago

  • Assignee deleted (Samuel Just)
  • Priority changed from Normal to High

Not that I know of; how often is it happening?

Actions #3

Updated by Samuel Just over 7 years ago

  • Priority changed from High to Urgent
Actions #4

Updated by Patrick Donnelly over 7 years ago

I've only seen this happen in this run of the new multimds suite [1]. I'm in the process of making a bunch of changes to that PR so I won't be able to do another run of multimds for a week or two. Then maybe we'll know if this can be consistently reproduced.

[1] https://github.com/ceph/ceph-qa-suite/pull/1114

Actions #5

Updated by Sage Weil over 7 years ago

  • Status changed from New to Need More Info
Actions #6

Updated by Samuel Just over 7 years ago

  • Priority changed from Urgent to High

Demoting to High until it's reproducible.

Actions #7

Updated by Josh Durgin almost 7 years ago

Patrick, has this occurred again?

Actions #8

Updated by Patrick Donnelly almost 7 years ago

No, I haven't seen it since.

Actions #9

Updated by Josh Durgin almost 7 years ago

  • Status changed from Need More Info to Can't reproduce

Please reopen if it happens again.
