Bug #65532

open

osd crashes due to invalid clone_range ops

Added by Xuehan Xu 14 days ago. Updated 9 days ago.

Status: Fix Under Review
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1) _txc_create osr 0x4d34b00 = 0x4f78900 seq 9
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1).collection(3.0_head 0x4d092c0) get_onode oid #3:005a1fee:::scephqa02.cpp.bjat94996-4:28# key 0x7F8000000000000003005A1FEE'!scephqa02.cpp.bjat94996-4!='0x0000000000000028FFFFFFFFFFFFFFFF6F
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1).collection(3.0_head 0x4d092c0)  r 0 v.len 470
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore.OnodeSpace(0x4d09400 in 0x300000b1c800) add_onode #3:005a1fee:::scephqa02.cpp.bjat94996-4:28# 0x4f9cb00
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - _add 0x300000b1c800 #3:005a1fee:::scephqa02.cpp.bjat94996-4:28# added, num=73
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1) _rmattrs 3.0_head #3:005a1fee:::scephqa02.cpp.bjat94996-4:28#
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1) _rmattrs 3.0_head #3:005a1fee:::scephqa02.cpp.bjat94996-4:28# = 0
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1) _truncate 3.0_head #3:005a1fee:::scephqa02.cpp.bjat94996-4:28# 0x3c85d3
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1) _do_truncate 3.0_head #3:005a1fee:::scephqa02.cpp.bjat94996-4:28# 0x3c85d3
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1) _truncate 3.0_head #3:005a1fee:::scephqa02.cpp.bjat94996-4:28# 0x3c85d3 = 0
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1) _setattrs 3.0_head #3:005a1fee:::scephqa02.cpp.bjat94996-4:28# 2 keys
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1) _setattrs 3.0_head #3:005a1fee:::scephqa02.cpp.bjat94996-4:28# 2 keys = 0
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1).collection(3.0_head 0x4d092c0) get_onode oid #3:005a1fee:::scephqa02.cpp.bjat94996-4:2d# key 0x7F8000000000000003005A1FEE'!scephqa02.cpp.bjat94996-4!='0x000000000000002DFFFFFFFFFFFFFFFF6F
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluefs - bluefs _read_random h 0x4d2e900 0x2c8d~a05 from file(ino 25 size 0x54bb1 mtime 2024-04-16T10:34:32.964782+0000 allocated 60000 alloc_commit 60000 extents [1:0x5b8000~60000])
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluefs - bluefs _read_random read random 0x2c8d~a05 of 1:0x5b8000~60000
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluefs - bluefs _read_random got 0xa05
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1).collection(3.0_head 0x4d092c0)  r 0 v.len 451
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore.OnodeSpace(0x4d09400 in 0x300000b1c800) add_onode #3:005a1fee:::scephqa02.cpp.bjat94996-4:2d# 0x4f9cdc0
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - _add 0x300000b1c800 #3:005a1fee:::scephqa02.cpp.bjat94996-4:2d# added, num=74
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1) _clone_range 3.0_head #3:005a1fee:::scephqa02.cpp.bjat94996-4:2d# -> #3:005a1fee:::scephqa02.cpp.bjat94996-4:28# from 0x0~3c85d3 to offset 0x0
DEBUG 2024-04-16 10:34:35,354 [shard 0:main] bluestore - bluestore(/da1/var/lib/ceph/osd/ceph-1) _clone_range 3.0_head #3:005a1fee:::scephqa02.cpp.bjat94996-4:2d# -> #3:005a1fee:::scephqa02.cpp.bjat94996-4:28# from 0x0~3c85d3 to offset 0x0 = -22
ERROR 2024-04-16 10:34:35,354 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-1) _txc_add_transaction error (22) Invalid argument not handled on operation 30 (op 3, counting from 0)
ERROR 2024-04-16 10:34:35,354 [shard 0:main] none - bluestore(/da1/var/lib/ceph/osd/ceph-1) unexpected error code
WARN  2024-04-16 10:34:35,354 [shard 0:main] bluestore - _dump_transaction transaction dump:
{
    "ops": [
        {
            "op_num": 0,
            "op_name": "rmattrs",
            "collection": "3.0_head",
            "oid": "#3:005a1fee:::scephqa02.cpp.bjat94996-4:28#" 
        },
        {
            "op_num": 1,
            "op_name": "truncate",
            "collection": "3.0_head",
            "oid": "#3:005a1fee:::scephqa02.cpp.bjat94996-4:28#",
            "offset": 3966419
        },
        {
            "op_num": 2,
            "op_name": "setattrs",
            "collection": "3.0_head",
            "oid": "#3:005a1fee:::scephqa02.cpp.bjat94996-4:28#",
            "attr_lens": {
                "_": 232,
                "__header": 80
            }
        },
        {
            "op_num": 3,
            "op_name": "clonerange2",
            "collection": "3.0_head",
            "src_oid": "#3:005a1fee:::scephqa02.cpp.bjat94996-4:2d#",
            "dst_oid": "#3:005a1fee:::scephqa02.cpp.bjat94996-4:28#",
            "src_offset": 0,
            "len": 3966419,
            "dst_offset": 0
        },
        {
            "op_num": 4,
            "op_name": "omap_rmkeys",
            "collection": "3.0_head",
            "oid": "#3:00000000:::snapmapper:0#",
            "attrs": [
                "OBJ_0000000000000003.00A58F77.28.scephqa02%ecpp%ebjat94996-4.." 
            ]
        },
        {
            "op_num": 5,
            "op_name": "omap_rmkeys",
            "collection": "3.0_head",
            "oid": "#3:00000000:::snapmapper:0#",
            "attrs": [
                "SNA_3_0000000000000021_0000000000000003.00A58F77.28.scephqa02%ecpp%ebjat94996-4..",
                "SNA_3_0000000000000026_0000000000000003.00A58F77.28.scephqa02%ecpp%ebjat94996-4..",
                "SNA_3_0000000000000027_0000000000000003.00A58F77.28.scephqa02%ecpp%ebjat94996-4..",
                "SNA_3_0000000000000028_0000000000000003.00A58F77.28.scephqa02%ecpp%ebjat94996-4.." 
            ]
        },
        {
            "op_num": 6,
            "op_name": "omap_setkeys",
            "collection": "3.0_head",
            "oid": "#3:00000000:::snapmapper:0#",
            "attr_lens": {
                "OBJ_0000000000000003.00A58F77.28.scephqa02%ecpp%ebjat94996-4..": 113
            }
        },
        {
            "op_num": 7,
            "op_name": "omap_setkeys",
            "collection": "3.0_head",
            "oid": "#3:00000000:::snapmapper:0#",
            "attr_lens": {
                "SNA_3_0000000000000021_0000000000000003.00A58F77.28.scephqa02%ecpp%ebjat94996-4..": 93,
                "SNA_3_0000000000000027_0000000000000003.00A58F77.28.scephqa02%ecpp%ebjat94996-4..": 93,
                "SNA_3_0000000000000028_0000000000000003.00A58F77.28.scephqa02%ecpp%ebjat94996-4..": 93
            }
        },
        {
            "op_num": 8,
            "op_name": "omap_rmkeys",
            "collection": "3.0_head",
            "oid": "#3:00000000::::head#",
            "attrs": [
                "missing/0000000000000003.00A58F77.28.scephqa02%ecpp%ebjat94996-4.." 
            ]
        },
        {
            "op_num": 9,
            "op_name": "omap_setkeys",
            "collection": "3.0_head",
            "oid": "#3:00000000::::head#",
            "attr_lens": {
                "_info": 1017
            }
        },
        {
            "op_num": 10,
            "op_name": "omap_rmkeys",
            "collection": "3.0_head",
            "oid": "#3:00000000::::head#",
            "attrs": [
                "_fastinfo" 
            ]
        }
    ]
}

ERROR 2024-04-16 10:34:35,354 none - /da1/xxh/rpmbuild/BUILD/ceph-19.0.0-2545-g6d9d2613e90/src/os/bluestore/BlueStore.cc:15141 : In function 'void BlueStore::_txc_add_transaction(TransContext*, ObjectStore::Transaction*)', abort(%s)
unexpected error
Aborting.
Backtrace:
 0# gsignal in /lib64/libc.so.6
 1# abort in /lib64/libc.so.6
 2# ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) in ceph-osd
 3# BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*) in ceph-osd
 4# BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*) in ceph-osd
 5# 0x0000000001FBF74E in ceph-osd
 6# crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) in ceph-osd
 7# 0x00002B1C4DD3BBA3 in /lib64/libstdc++.so.6
 8# 0x00002B1C50FC91CF in /lib64/libpthread.so.0
 9# clone in /lib64/libc.so.6
Content of /proc/self/maps:
00400000-0118f000 r--p 00000000 fd:00 5919297                            /usr/bin/ceph-osd
0118f000-03100000 r-xp 00d8f000 fd:00 5919297                            /usr/bin/ceph-osd
03100000-03f43000 r--p 02d00000 fd:00 5919297                            /usr/bin/ceph-osd
03f43000-0400e000 r--p 03b42000 fd:00 5919297                            /usr/bin/ceph-osd
0400e000-04048000 rw-p 03c0d000 fd:00 5919297                            /usr/bin/ceph-osd
04048000-04097000 rw-p 00000000 00:00 0
044dc000-0dfa0000 rw-p 00000000 00:00 0                                  [heap]
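
The error line in the log reports "error (22) Invalid argument" on "op 3, counting from 0", and the _clone_range line gives the range as 0x0~3c85d3. As a reader's aid, and not part of the original report, the following minimal sketch cross-checks those three values against the transaction dump. The DUMP excerpt is trimmed to the fields used here, and the helper name failing_op is hypothetical.

```python
import errno
import json

# Excerpt of the transaction dump from the log, reduced to the fields used here.
DUMP = json.loads("""
{"ops": [
  {"op_num": 0, "op_name": "rmattrs"},
  {"op_num": 1, "op_name": "truncate", "offset": 3966419},
  {"op_num": 2, "op_name": "setattrs"},
  {"op_num": 3, "op_name": "clonerange2",
   "src_offset": 0, "len": 3966419, "dst_offset": 0}
]}
""")


def failing_op(dump, op_index):
    """Return the op whose op_num matches the index reported in the error line."""
    return next(op for op in dump["ops"] if op["op_num"] == op_index)


op = failing_op(DUMP, 3)          # "op 3, counting from 0"
err_name = errno.errorcode[22]    # "error (22)"
log_len = int("3c85d3", 16)       # length from "0x0~3c85d3" in the _clone_range line

print(op["op_name"], err_name, log_len == op["len"])
```

The failing op is the clonerange2 from snapshot clone 2d into clone 28, and its length equals the offset of the truncate issued on the destination in op 1 of the same transaction.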

#1

Updated by Xuehan Xu 14 days ago

This seems to be caused by incorrect clone_overlap calculations; I will look into it.
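
One plausible reading of the -22 result, sketched here as an assumption and not taken from the BlueStore source or the pending fix, is that the object store rejects a clone range that extends past the end of the source object. If the clone_overlap bookkeeping yields a range larger than the source clone actually holds, such a check would fail. The function name check_clone_range and the source sizes below are hypothetical; the log does not show the real size of the source object.

```python
import errno


def check_clone_range(src_size, src_offset, length):
    """Hypothetical model of a bounds check: 0 if the range fits, -EINVAL if not."""
    if src_offset < 0 or length < 0:
        return -errno.EINVAL
    if src_offset + length > src_size:
        # The requested range runs past the end of the source object.
        return -errno.EINVAL
    return 0


LEN = 0x3C85D3  # length requested by the failing clonerange2 op

# Source sizes are made up for illustration; the report does not state them.
fits = check_clone_range(src_size=LEN, src_offset=0, length=LEN)
too_long = check_clone_range(src_size=LEN - 1, src_offset=0, length=LEN)

print(fits, too_long)
```

Under this assumption, a source clone that is even one byte shorter than the requested length would make the op fail with the same error code seen in the log.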

#2

Updated by Xuehan Xu 14 days ago

  • Pull request ID set to 56938

#3

Updated by Matan Breizman 9 days ago

  • Status changed from New to Fix Under Review
  • Pull request ID changed from 56938 to 56606