Bug #57657

closed

mds: scrub locates mismatch between child accounted_rstats and self rstats

Added by Patrick Donnelly over 1 year ago. Updated 11 months ago.

Status:
Resolved
Priority:
High
Category:
Correctness/Safety
Target version:
% Done:
0%

Source:
Q/A
Tags:
backport_processed
Backport:
quincy,pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
qa-failure, scrub
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2022-09-22T13:11:47.201+0000 7f83d76d9700 20 mds.0.scrubstack dequeue [inode 0x10000000001 [...2,head] /parent/flushed/ auth v14 ap=1 f(v0 m2022-09-22T13:11:00.272150+0000 3=3+0) n(v0 rc2022-09-22T13:11:00.274637+0000 b17 4=3+1) (ifile excl) (iversion lock) | request=0 lock=0 dirfrag=1 caps=0 authpin=1 scrubqueue=1 0x55fcb96ab180] from ScrubStack
2022-09-22T13:11:47.201+0000 7f83d76d9700 10 mds.0.scrubstack scrub_dirfrag [dir 0x10000000001 /parent/flushed/ [2,head] auth v=14 cv=13/13 ap=1+0 state=1610612737|complete f(v0 m2022-09-22T13:11:00.272150+0000 3=3+0) n(v0 rc2022-09-22T13:11:00.274637+0000 b17 3=3+0) hs=2+0,ss=0+0 | child=1 dirty=1 waiter=0 authpin=1 scrubqueue=1 0x55fcb96bd180]
2022-09-22T13:11:47.201+0000 7f83d76d9700 20 mds.0.cache.den(0x10000000001 charlie) scrubbing [dentry #0x1/parent/flushed/charlie [2,head] auth (dversion lock) pv=0 v=13 ino=0x100000001f8 state=1073741824 0x55fcb96aea00] next_seq = 2
2022-09-22T13:11:47.201+0000 7f83d76d9700 10  mds.0.cache.snaprealm(0x1 seq 1 0x55fcb969a400) get_snaps  (seq 1 cached_seq 1)
2022-09-22T13:11:47.201+0000 7f83d76d9700 10 mds.0.scrubstack _enqueue with {[inode 0x100000001f8 [2,head] /parent/flushed/charlie auth v12 s=7 n(v0 rc2022-09-22T13:11:00.274637+0000 b7 1=1+0) (iversion lock) 0x55fcb96ab700]}, top=0
2022-09-22T13:11:47.201+0000 7f83d76d9700 20 mds.0.cache.ino(0x100000001f8) scrub_initialize with scrub_version 12
2022-09-22T13:11:47.201+0000 7f83d76d9700 20 mds.0.scrubstack enqueue [inode 0x100000001f8 [2,head] /parent/flushed/charlie auth v12 s=7 n(v0 rc2022-09-22T13:11:00.274637+0000 b7 1=1+0) (iversion lock) 0x55fcb96ab700] to bottom of ScrubStack
2022-09-22T13:11:47.201+0000 7f83d76d9700 20 mds.0.cache.den(0x10000000001 alpha) scrubbing [dentry #0x1/parent/flushed/alpha [2,head] auth (dversion lock) pv=0 v=13 ino=0x10000000002 state=1073741824 0x55fcb96aec80] next_seq = 2
2022-09-22T13:11:47.201+0000 7f83d76d9700 10  mds.0.cache.snaprealm(0x1 seq 1 0x55fcb969a400) get_snaps  (seq 1 cached_seq 1)
2022-09-22T13:11:47.201+0000 7f83d76d9700 10 mds.0.scrubstack _enqueue with {[inode 0x10000000002 [2,head] /parent/flushed/alpha auth v8 s=5 n(v0 rc2022-09-22T13:11:00.267473+0000 b5 1=1+0) (iversion lock) 0x55fcb96bf600]}, top=0
2022-09-22T13:11:47.201+0000 7f83d76d9700 20 mds.0.cache.ino(0x10000000002) scrub_initialize with scrub_version 8
2022-09-22T13:11:47.201+0000 7f83d76d9700 20 mds.0.scrubstack enqueue [inode 0x10000000002 [2,head] /parent/flushed/alpha auth v8 s=5 n(v0 rc2022-09-22T13:11:00.267473+0000 b5 1=1+0) (iversion lock) 0x55fcb96bf600] to bottom of ScrubStack
2022-09-22T13:11:47.201+0000 7f83d76d9700  1 mds.0.cache.dir(0x10000000001) mismatch between head items and fnode.fragstat! printing dentries
2022-09-22T13:11:47.201+0000 7f83d76d9700  1 mds.0.cache.dir(0x10000000001) get_num_head_items() = 2; fnode.fragstat.nfiles=3 fnode.fragstat.nsubdirs=0
2022-09-22T13:11:47.201+0000 7f83d76d9700  1 mds.0.cache.dir(0x10000000001) mismatch between child accounted_rstats and my rstats!
2022-09-22T13:11:47.201+0000 7f83d76d9700  1 mds.0.cache.dir(0x10000000001) total of child dentries: n(v0 rc2022-09-22T13:11:00.274637+0000 b12 2=2+0)
2022-09-22T13:11:47.201+0000 7f83d76d9700  1 mds.0.cache.dir(0x10000000001) my rstats:              n(v0 rc2022-09-22T13:11:00.274637+0000 b17 3=3+0)
2022-09-22T13:11:47.201+0000 7f83d76d9700 10 mds.0.cache.dir(0x10000000001) check_rstats complete on 0x55fcb96bd180
2022-09-22T13:11:47.201+0000 7f83d76d9700  0 log_channel(cluster) log [WRN] : Scrub error on dir 0x10000000001 (/parent/flushed) see mds.a log and `damage ls` output for details
2022-09-22T13:11:47.201+0000 7f83d76d9700 20 mds.0.cache.dir(0x10000000001) scrub_finished

From: /ceph/teuthology-archive/pdonnell-2022-09-22_12:22:37-fs-wip-pdonnell-testing-20220920.234701-distro-default-smithi/7041083/remote/smithi110/log/ceph-mds.a.log.gz

This occurs with the new postgres/scrub/snap changes to fs:workload. What is particularly relevant to this change is that scrub now properly reports damage on dirfrags where check_rstats finds a mismatch. Previously it did not.
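The arithmetic behind the "mismatch between child accounted_rstats and my rstats" message can be reproduced from the log excerpt. The following is a minimal sketch, not the MDS implementation: `RStat`, `parse_rstat`, and the regular expression are hypothetical helpers, and it assumes the `n(... bN T=F+S)` summary prints recursive bytes followed by total = files + subdirs. It also uses the children's printed rstats as a stand-in for their accounted rstats, which the log does not show separately.

```python
import re
from dataclasses import dataclass

# Hypothetical helper: matches the tail of an rstat summary such as "b17 3=3+0)".
# Assumed layout: b<rbytes> <total>=<rfiles>+<rsubdirs>
RSTAT_RE = re.compile(r"\bb(\d+) (\d+)=(\d+)\+(\d+)\)")


@dataclass(frozen=True)
class RStat:
    rbytes: int
    rfiles: int
    rsubdirs: int

    def __add__(self, other):
        return RStat(
            self.rbytes + other.rbytes,
            self.rfiles + other.rfiles,
            self.rsubdirs + other.rsubdirs,
        )


def parse_rstat(text):
    """Extract (rbytes, rfiles, rsubdirs) from an n(...) summary string."""
    m = RSTAT_RE.search(text)
    if m is None:
        raise ValueError(f"no rstat summary found in: {text!r}")
    rbytes, total, rfiles, rsubdirs = map(int, m.groups())
    if total != rfiles + rsubdirs:
        raise ValueError(f"inconsistent rstat summary in: {text!r}")
    return RStat(rbytes, rfiles, rsubdirs)


# Values copied from the log excerpt above.
charlie = parse_rstat("n(v0 rc2022-09-22T13:11:00.274637+0000 b7 1=1+0)")
alpha = parse_rstat("n(v0 rc2022-09-22T13:11:00.267473+0000 b5 1=1+0)")
dir_rstat = parse_rstat("n(v0 rc2022-09-22T13:11:00.274637+0000 b17 3=3+0)")

# Sum the children that are still present and compare with the dirfrag's own rstats.
children_total = sum([charlie, alpha], RStat(0, 0, 0))
mismatch = children_total != dir_rstat

print("total of child dentries:", children_total)
print("my rstats:              ", dir_rstat)
print("mismatch:", mismatch)
```

Under these assumptions the two remaining dentries sum to 12 bytes across 2 files, while the dirfrag still accounts for 17 bytes across 3 files. The difference of 5 bytes and one file is consistent with a single dentry having gone missing from the directory.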


Related issues 2 (0 open, 2 closed)

Copied to CephFS - Backport #57714: pacific: mds: scrub locates mismatch between child accounted_rstats and self rstats (Resolved, Patrick Donnelly)
Copied to CephFS - Backport #57715: quincy: mds: scrub locates mismatch between child accounted_rstats and self rstats (Resolved, Patrick Donnelly)
Actions #1

Updated by Patrick Donnelly over 1 year ago

  • Description updated (diff)
Actions #2

Updated by Venky Shankar over 1 year ago

  • Category set to Correctness/Safety
  • Status changed from New to Triaged
  • Assignee set to Patrick Donnelly
Actions #3

Updated by Patrick Donnelly over 1 year ago

During standup I was thinking of something else. This test deliberately creates this kind of damage by manually deleting dentries. I think we can just ignorelist this error for this test.
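To illustrate what ignorelisting means here, the sketch below checks a cluster log warning against a list of patterns. It assumes the test harness treats ignorelist entries as regular expressions searched against cluster log lines. The pattern `Scrub error on dir` and the helper `is_ignored` are hypothetical; the entry actually added by the fix is not shown in this ticket.

```python
import re

# Hypothetical ignorelist entry; the real pattern used by the fix is not
# recorded in this ticket.
IGNORELIST = [r"Scrub error on dir"]


def is_ignored(line, patterns=IGNORELIST):
    """Return True if any ignorelist pattern matches the log line."""
    return any(re.search(p, line) for p in patterns)


# The warning emitted in the log excerpt from this ticket.
scrub_warning = (
    "log_channel(cluster) log [WRN] : Scrub error on dir 0x10000000001 "
    "(/parent/flushed) see mds.a log and `damage ls` output for details"
)

# A made-up warning, for contrast only, that the pattern should not match.
other_warning = "log_channel(cluster) log [WRN] : some unrelated warning"

print(is_ignored(scrub_warning))
print(is_ignored(other_warning))
```

Scoping such an entry to the one test that deliberately deletes dentries keeps the same warning fatal in every other test, where it would indicate real damage.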

Actions #4

Updated by Patrick Donnelly over 1 year ago

  • Status changed from Triaged to Fix Under Review
  • Pull request ID set to 48257
Actions #5

Updated by Venky Shankar over 1 year ago

Patrick Donnelly wrote:

During standup I was thinking of something else. This test deliberately creates this kind of damage by manually deleting dentries. I think we can just ignorelist this error for this test.

Makes sense.

Actions #6

Updated by Venky Shankar over 1 year ago

  • Status changed from Fix Under Review to Pending Backport
Actions #7

Updated by Backport Bot over 1 year ago

  • Copied to Backport #57714: pacific: mds: scrub locates mismatch between child accounted_rstats and self rstats added
Actions #8

Updated by Backport Bot over 1 year ago

  • Copied to Backport #57715: quincy: mds: scrub locates mismatch between child accounted_rstats and self rstats added
Actions #9

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed
Actions #10

Updated by Patrick Donnelly 11 months ago

  • Status changed from Pending Backport to Resolved