Bug #43068

on disk size (81292) does not match object info size (81237)

Added by Xiaoxi Chen about 1 year ago. Updated 4 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
% Done:
0%

Source:
Community (dev)
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

2019-11-30 18:19:18.041453 osd.40 [ERR] 2.483 shard 40 soid 2:c12d31e6:::10002dececc.00000000:head : candidate size 81292 info size 81237 mismatch
2019-11-30 18:19:18.041456 osd.40 [ERR] 2.483 shard 408 soid 2:c12d31e6:::10002dececc.00000000:head : candidate size 81292 info size 81237 mismatch
2019-11-30 18:19:18.041459 osd.40 [ERR] 2.483 shard 989 soid 2:c12d31e6:::10002dececc.00000000:head : candidate size 81292 info size 81237 mismatch
2019-11-30 18:19:18.041460 osd.40 [ERR] 2.483 soid 2:c12d31e6:::10002dececc.00000000:head : failed to pick suitable object info
2019-11-30 18:19:18.041511 osd.40 [ERR] repair 2.483 2:c12d31e6:::10002dececc.00000000:head : on disk size (81292) does not match object info size (81237) adjusted for ondisk to (81237)
2019-11-30 18:20:44.520814 osd.40 [ERR] 2.483 repair 4 errors, 0 fixed

However, rados get returns an object of 81237 bytes, and since this object happens to be a CephFS file/object, the inode also reports 81237.

I believe we can work around this by putting the object back, but I want to dig deeper into why BlueStore returns the wrong on-disk size.
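For illustration, here is a minimal Python sketch (hypothetical, not the actual OSD scrub code) of the comparison that produces the errors above: each shard's on-disk size is checked against the size recorded in the object info, and when no shard agrees, repair gives up with "failed to pick suitable object info".

```python
# Minimal sketch (hypothetical, not the real OSD implementation) of the
# scrub size check that produces the log lines in the description.
def check_shards(info_size, shard_sizes):
    """Compare each shard's on-disk size against the object info size."""
    errors = [
        f"shard {osd} : candidate size {size} info size {info_size} mismatch"
        for osd, size in shard_sizes.items()
        if size != info_size
    ]
    if errors and len(errors) == len(shard_sizes):
        # No shard matched the recorded size, so there is no "good" copy
        # to pick -- the same situation as in the log above.
        errors.append("failed to pick suitable object info")
    return errors

# Sizes from this ticket: all three replicas report 81292, object info says 81237.
msgs = check_shards(81237, {40: 81292, 408: 81292, 989: 81292})
for m in msgs:
    print(m)
```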

root@HOST:~# ceph daemon /var/run/ceph/HOST-mds.`hostname -s`.asok dump inode 0x10002dececc
{
    "path": "MASK_OUT",
    "ino": 1099559784140,
    "rdev": 0,
    "ctime": "2019-11-30 08:05:30.682443",
    "btime": "2019-07-23 17:56:02.934971",
    "mode": 33152,
    "uid": 741,
    "gid": 747,
    "nlink": 1,
    "dir_layout": {
        "dir_hash": 0
    },
    "layout": {
        "stripe_unit": 4194304,
        "stripe_count": 1,
        "object_size": 4194304,
        "pool_id": 2,
        "pool_ns": "" 
    },
    "old_pools": [],
    "size": 81237,
root@rnoaz01cephmon01:~# stat ./10002dececc.00000000
  File: './10002dececc.00000000'
  Size: 81237         Blocks: 160        IO Block: 4096   regular file
Device: fd01h/64769d    Inode: 10340       Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2019-11-30 18:25:17.436609879 -0700
Modify: 2019-11-30 18:25:17.868611056 -0700
Change: 2019-11-30 18:25:17.868611056 -0700
 Birth: -

hexdiff.png (647 KB) Xiaoxi Chen, 12/01/2019 04:23 PM

History

#2 Updated by Xiaoxi Chen about 1 year ago

I dumped the on-disk object via the objectstore tool; it seems the last 55 bytes are all zeros.

This file is the bash_history of one account, so it shouldn't have trailing zeros.

https://github.com/ceph/ceph/pull/30085 cannot explain this case, as the object info is right but the content seems wrong.
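The observation above can be checked mechanically. A small sketch, simulating the dumped bytes (real data would come from the objectstore tool dump), that verifies the 55 extra bytes (81292 - 81237) past the object-info size are all NUL padding:

```python
def trailing_padding_is_zero(data, info_size):
    """Return True if every byte past info_size is a NUL byte."""
    return all(b == 0 for b in data[info_size:])

# Simulated stand-in for the dumped object: 81237 bytes of real content
# followed by zero padding up to the on-disk size of 81292.
info_size = 81237
on_disk = b"x" * info_size + b"\x00" * (81292 - info_size)

extra = len(on_disk) - info_size
print(extra)                                      # 55 extra bytes
print(trailing_padding_is_zero(on_disk, info_size))
```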

#3 Updated by Nathan Cutler about 1 year ago

  • Project changed from CephFS to bluestore
  • Subject changed from on disk size (81292) does not match object info size (81237) to on disk size (81292) does not match object info size (81237)
  • Category deleted (Correctness/Safety)

#4 Updated by Sage Weil 10 months ago

  • Status changed from New to Need More Info

was this bluestore or filestore?

#5 Updated by Xiaoxi Chen 8 months ago

bluestore

#6 Updated by Xiaoxi Chen 8 months ago

  • Status changed from Need More Info to New

We got one more hit:

2020-04-16 14:21:30.139493 osd.869 (osd.869) 10192 : cluster [ERR] 6.68b shard 869 soid 6:d16f3dcf:::7000a37a4ae.00000000:head : candidate size 1222 info size 1204 mismatch
2020-04-16 14:21:30.139504 osd.869 (osd.869) 10193 : cluster [ERR] 6.68b shard 560 soid 6:d16f3dcf:::7000a37a4ae.00000000:head : candidate size 1222 info size 1204 mismatch
2020-04-16 14:21:30.139507 osd.869 (osd.869) 10194 : cluster [ERR] 6.68b shard 671 soid 6:d16f3dcf:::7000a37a4ae.00000000:head : candidate size 1222 info size 1204 mismatch
2020-04-16 14:21:30.139514 osd.869 (osd.869) 10195 : cluster [ERR] 6.68b soid 6:d16f3dcf:::7000a37a4ae.00000000:head : failed to pick suitable object info
2020-04-16 14:21:30.139639 osd.869 (osd.869) 10196 : cluster [ERR] repair 6.68b 6:d16f3dcf:::7000a37a4ae.00000000:head : on disk size (1222) does not match object info size (1204) adjusted for ondisk to (1204)

#7 Updated by Igor Fedotov 8 months ago

It looks to me like this bug has a scope beyond BlueStore. Am I right that it applies to CephFS files/objects only? Is there anything specific or common about the broken files (e.g. particular usage patterns, or does it apply to bash history files only)?
And is it correct that the RADOS/BlueStore size is valid (and in line with the content) but the CephFS size is higher?

If so, maybe we should make the CephFS team aware by moving the ticket to the corresponding project for further troubleshooting?

Ideally we would find a way to reproduce this and collect more verbose logging.

#8 Updated by Patrick Donnelly 7 months ago

Igor Fedotov wrote:

And is this correct that Rados/BlueStore is valid (and is in line with the content) but CephFS size is higher?

From my reading, it looks like the bluestore object info and the MDS agree on the object size but the "on-disk" size is larger.

If so may be make CephFS team aware by moving the ticket to the corresponding project for further troubleshooting...?

I've never seen this before but seriously doubt it has anything to do with CephFS. This file size is not even large enough to span two objects.
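Patrick's point can be sanity-checked against the layout dumped in the description (object_size 4194304, stripe_count 1): with that layout, a file's data occupies ceil(size / object_size) RADOS objects, so every affected file in this ticket fits entirely within its first object:

```python
import math

# object_size taken from the inode layout dumped in the description.
object_size = 4194304  # 4 MiB

# Affected file sizes from the three scrub reports in this ticket.
for size in (81237, 1204, 10133):
    objects = math.ceil(size / object_size)
    print(f"{size} bytes -> {objects} object(s)")  # each fits in one object
```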

#9 Updated by Xiaoxi Chen 4 months ago

2020-08-01 07:27:49.233 7f28c3d5a700 -1 log_channel(cluster) log [ERR] : 2.717 shard 396 soid 2:e8e1dbf2:::10002def43a.00000000:head : candidate size 25925 info size 25909 mismatch
2020-08-01 07:27:49.233 7f28c3d5a700 -1 log_channel(cluster) log [ERR] : 2.717 shard 952 soid 2:e8e1dbf2:::10002def43a.00000000:head : candidate size 25925 info size 25909 mismatch
2020-08-01 07:27:49.233 7f28c3d5a700 -1 log_channel(cluster) log [ERR] : 2.717 shard 1084 soid 2:e8e1dbf2:::10002def43a.00000000:head : candidate size 25925 info size 25909 mismatch
2020-08-01 07:27:49.233 7f28c3d5a700 -1 log_channel(cluster) log [ERR] : 2.717 soid 2:e8e1dbf2:::10002def43a.00000000:head : failed to pick suitable object info
2020-08-01 07:27:49.233 7f28c3d5a700 -1 log_channel(cluster) log [ERR] : repair 2.717 2:e8e1dbf2:::10002def43a.00000000:head : on disk size (25925) does not match object info size (25909) adjusted for ondisk to (25909)
2020-08-01 03:35:03.233 7fc2d00fa700 -1 log_channel(cluster) log [ERR] : 6.1c3 shard 40 soid 6:c39b6b57:::70010d80286.00000000:head : candidate size 10303 info size 10133 mismatch
2020-08-01 03:35:03.233 7fc2d00fa700 -1 log_channel(cluster) log [ERR] : 6.1c3 shard 78 soid 6:c39b6b57:::70010d80286.00000000:head : candidate size 10303 info size 10133 mismatch
2020-08-01 03:35:03.233 7fc2d00fa700 -1 log_channel(cluster) log [ERR] : 6.1c3 shard 212 soid 6:c39b6b57:::70010d80286.00000000:head : candidate size 10303 info size 10133 mismatch
2020-08-01 03:35:03.233 7fc2d00fa700 -1 log_channel(cluster) log [ERR] : 6.1c3 soid 6:c39b6b57:::70010d80286.00000000:head : failed to pick suitable object info
2020-08-01 03:35:03.233 7fc2d00fa700 -1 log_channel(cluster) log [ERR] : repair 6.1c3 6:c39b6b57:::70010d80286.00000000:head : on disk size (10303) does not match object info size (10133) adjusted for ondisk to (10133)

I tend to believe it is somehow related to CephFS. This never happens in other scenarios, and most importantly, most of the victims are .bash_history files (in /home directories on CephFS), so it may be related to some consistency issue?
