Bug #65735

open

OSDs failed to restart when doing crimson-osd:thrash tests

Added by Xuehan Xu 14 days ago. Updated 13 days ago.

Status: New
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: stray shard 0x0
ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: 0x7F800000000000000359A763D5'!scephqa03.cpp.bjat.qianxin-inc.cn758234-100!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F0000000078 is unexpected
DEBUG 2024-04-30 12:29:20,156 [shard 0:main] bluefs - bluefs _read_random h 0x4fd6000 0xaa844~b0d from file(ino 25 size 0x1e11d6 mtime 2024-04-30T12:29:05.444206+0000 allocated 1f0000 alloc_commit 0 extents [1:0x332b5000~1f0000])
DEBUG 2024-04-30 12:29:20,156 [shard 0:main] bluefs - bluefs _read_random left 0x187bc 0xaa844~b0d
DEBUG 2024-04-30 12:29:20,156 [shard 0:main] bluefs - bluefs _read_random got 0xb0d
ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: stray shard 0xbc000
ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: 0x7F800000000000000359A763D5'!scephqa03.cpp.bjat.qianxin-inc.cn758234-100!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F000BC00078 is unexpected
ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: stray shard 0x190000
ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: 0x7F800000000000000359A763D5'!scephqa03.cpp.bjat.qianxin-inc.cn758234-100!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F0019000078 is unexpected
ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: stray shard 0x1dc000
ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: 0x7F800000000000000359A763D5'!scephqa03.cpp.bjat.qianxin-inc.cn758234-100!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F001DC00078 is unexpected
ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: stray shard 0x290000
ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: 0x7F800000000000000359A763D5'!scephqa03.cpp.bjat.qianxin-inc.cn758234-100!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F0029000078 is unexpected
ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: stray shard 0x2dc000
ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: 0x7F800000000000000359A763D5'!scephqa03.cpp.bjat.qianxin-inc.cn758234-100!='0xFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFF6F002DC00078 is unexpected
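
For triage, the stray-shard offsets can be pulled out of a log like the one above with a few lines of Python. This is only a sketch: the regular expression is inferred from the log lines quoted in this report, not from the BlueStore source, and the function name `stray_shards` is made up for illustration.

```python
import re

# Pattern inferred from the fsck lines quoted in this report (an assumption,
# not taken from the BlueStore source), e.g.:
#   ... bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: stray shard 0xbc000
STRAY_SHARD_RE = re.compile(
    r"bluestore\((?P<path>[^)]+)\) fsck error: stray shard (?P<offset>0x[0-9a-fA-F]+)"
)


def stray_shards(lines):
    """Group stray shard offsets by OSD data path, in order of appearance.

    Hypothetical helper: returns {osd_path: [offset, ...]} with offsets as ints.
    """
    result = {}
    for line in lines:
        match = STRAY_SHARD_RE.search(line)
        if match:
            offset = int(match.group("offset"), 16)
            result.setdefault(match.group("path"), []).append(offset)
    return result


# Three lines copied from the log in this report.
sample = [
    "ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - "
    "bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: stray shard 0x0",
    "DEBUG 2024-04-30 12:29:20,156 [shard 0:main] bluefs - "
    "bluefs _read_random got 0xb0d",
    "ERROR 2024-04-30 12:29:20,156 [shard 0:main] none - "
    "bluestore(/da1/var/lib/ceph/osd/ceph-2) fsck error: stray shard 0xbc000",
]

for path, offsets in stray_shards(sample).items():
    print(path, [hex(o) for o in offsets])
```

Run against the full OSD log, this gives a quick list of which shard offsets fsck flagged for each OSD.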
#1

Updated by Xuehan Xu 13 days ago

This was caused by mishandling of CEPH_OSD_OP_CREATE in crimson; a new issue has been created for it: https://tracker.ceph.com/issues/65773.

On the other hand, I don't think bluestore should crash simply because its user behaves unexpectedly, so I'll leave it to the bluestore team to decide whether this issue should be closed.
