Bug #57530

closed

crimson/seastore: crash in --mkfs with vstart

Added by Samuel Just over 1 year ago. Updated over 1 year ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

main branch, commit 2bdccfd5eab5a18a4b6a69ef7ee31916a3f1968e

/home/sam/git-checkouts/ceph2/build/bin/crimson-osd -i 0 -c /home/sam/git-checkouts/ceph2/build/ceph.conf --mkfs --key AQCzDSFj2E69EhAAz3vt4veXN6u8ACpReqGBTw== --osd-uuid 2ae27dd0-dfc7-419f-97cf-b97acbb50b9b --smp 1 --cpuset 0 --debug 
WARNING: debug mode. Not for benchmarking or production
WARN  2022-09-13 23:09:40,092 [shard 0] seastar - Creation of perf_event based stall detector failed, falling back to posix timer: std::system_error (error system:13, perf_event_open() failed: Permission denied)
INFO  2022-09-13 23:09:40,095 [shard 0] seastar - Created fair group io-queue-0, capacity rate 2147483:2147483, limit 12582912, rate 16777216 (factor 1), threshold 2000
INFO  2022-09-13 23:09:40,095 [shard 0] seastar - Created io group dev(0), length limit 4194304:4194304, rate 2147483647:2147483647
INFO  2022-09-13 23:09:40,095 [shard 0] seastar - Created io queue dev(0) capacities: 512:2000/2000 1024:3000/3000 2048:5000/5000 4096:9000/9000 8192:17000/17000 16384:33000/33000 32768:65000/65000 65536:129000/129000 131072:257000/257000
==16678==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
==16678==WARNING: ASan is ignoring requested __asan_handle_no_return: stack type: default top: 0x7fffa1cf2000; bottom 0x7efdec1fb000; size: 0x0101b5af7000 (1106854768640)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
/home/sam/git-checkouts/ceph2/src/include/denc.h:300:11: runtime error: reference binding to misaligned address 0x61000000878b for type 'device_type_t', which requires 4 byte alignment
0x61000000878b: note: pointer points here
 00  00 00 00 be be be be be  be be be be be be be be  be be be be be be be be  be be be be be be be
              ^ 
/home/sam/git-checkouts/ceph2/src/crimson/os/seastore/seastore_types.h:2047:58: runtime error: store to misaligned address 0x61000000878b for type 'device_type_t', which requires 4 byte alignment
0x61000000878b: note: pointer points here
 00  00 00 00 be be be be be  be be be be be be be be  be be be be be be be be  be be be be be be be
              ^ 
/home/sam/git-checkouts/ceph2/src/include/denc.h:294:11: runtime error: reference binding to misaligned address 0x62500002b04b for type 'const device_type_t', which requires 4 byte alignment
0x62500002b04b: note: pointer points here
 00  00 00 00 01 00 01 01 10  00 00 00 2a e2 7d d0 df  c7 41 9f 97 cf b9 7a cb  b5 0b 9b 00 00 00 00
              ^ 
/home/sam/git-checkouts/ceph2/src/crimson/os/seastore/seastore_types.h:2054:7: runtime error: load of misaligned address 0x62500002b04b for type 'const device_type_t', which requires 4 byte alignment
0x62500002b04b: note: pointer points here
 00  00 00 00 01 00 01 01 10  00 00 00 2a e2 7d d0 df  c7 41 9f 97 cf b9 7a cb  b5 0b 9b 00 00 00 00
              ^ 
/home/sam/git-checkouts/ceph2/src/crimson/os/seastore/onode_manager/staged-fltree/node.cc:1964:34: runtime error: member access within null pointer of type 'struct LeafNodeImpl'
Segmentation fault on shard 0.
Backtrace:
Reactor stalled for 129698 ms on shard 0. Backtrace: 0x53bf0 0xa77ddc5 0xa63c101 0xa63efb9 0xa63f1cc 0xa63f45c 0xa63f749 0x42abf 0x5ba0907 0x42abf 0x2cc0b 0x2cf1d 0x299db 0xbb154 0x23c1ce7 0x23c1ebb 0x2401fe5 0x240335c 0x24036d4 0x5ba76b1 0x5ba7bc4 0x5ba825a 0x5ba0154 0x5ba04bb 0x5ba095a 0x42abf 0x9c2a1d0 0x9b02939 0x9b9fb6c 0x9b9d4bf 0x9b9d6d0 0x9b9da2a 0x95ff3ac 0x964e3e3 0x964e6db 0x96525bd 0x9652794 0x9652d41 0x96532d8 0x96534c4 0x96539ae 0x9653f63 0x966b5bb 0x9671858 0x96874a8 0xa621861 0xa6557d1 0xa7012e0 0xa702d9c 0xa40b1d2 0xa40bc05 0x25a0bac 0x2d58f 0x2d648 0x238b8f4
Reactor stalled for 130174 ms on shard 0. Backtrace: 0x53bf0 0xa77ddc5 0xa63c101 0xa63efb9 0xa63f1cc 0xa63f45c 0xa63f749 0x42abf 0x5ba0907 0x42abf 0x2cc0b 0x2cf1d 0x299db 0xbb154 0x23c1ce7 0x23c1ebb 0x2401fe5 0x240335c 0x24036d4 0x5ba76b1 0x5ba7bc4 0x5ba825a 0x5ba0154 0x5ba04bb 0x5ba095a 0x42abf 0x9c2a1d0 0x9b02939 0x9b9fb6c 0x9b9d4bf 0x9b9d6d0 0x9b9da2a 0x95ff3ac 0x964e3e3 0x964e6db 0x96525bd 0x9652794 0x9652d41 0x96532d8 0x96534c4 0x96539ae 0x9653f63 0x966b5bb 0x9671858 0x96874a8 0xa621861 0xa6557d1 0xa7012e0 0xa702d9c 0xa40b1d2 0xa40bc05 0x25a0bac 0x2d58f 0x2d648 0x238b8f4
Reactor stalled for 131062 ms on shard 0. Backtrace: 0x53bf0 0xa77ddc5 0xa63c101 0xa63efb9 0xa63f1cc 0xa63f45c 0xa63f749 0x42abf 0x5ba0907 0x42abf 0x2cc0b 0x2cf1d 0x299db 0xbb154 0x23c1ce7 0x23c1ebb 0x2401fe5 0x240335c 0x24036d4 0x5ba76b1 0x5ba7bc4 0x5ba825a 0x5ba0154 0x5ba04bb 0x5ba095a 0x42abf 0x9c2a1d0 0x9b02939 0x9b9fb6c 0x9b9d4bf 0x9b9d6d0 0x9b9da2a 0x95ff3ac 0x964e3e3 0x964e6db 0x96525bd 0x9652794 0x9652d41 0x96532d8 0x96534c4 0x96539ae 0x9653f63 0x966b5bb 0x9671858 0x96874a8 0xa621861 0xa6557d1 0xa7012e0 0xa702d9c 0xa40b1d2 0xa40bc05 0x25a0bac 0x2d58f 0x2d648 0x238b8f4
Reactor stalled for 132777 ms on shard 0. Backtrace: 0x53bf0 0xa77ddc5 0xa63c101 0xa63efb9 0xa63f1cc 0xa63f45c 0xa63f749 0x42abf 0x2cc0b 0x2cf1d 0x299db 0xbb154 0x23c1ce7 0x23c1ebb 0x2401fe5 0x240335c 0x24036d4 0x5ba76b1 0x5ba7bc4 0x5ba825a 0x5ba0154 0x5ba04bb 0x5ba095a 0x42abf 0x9c2a1d0 0x9b02939 0x9b9fb6c 0x9b9d4bf 0x9b9d6d0 0x9b9da2a 0x95ff3ac 0x964e3e3 0x964e6db 0x96525bd 0x9652794 0x9652d41 0x96532d8 0x96534c4 0x96539ae 0x9653f63 0x966b5bb 0x9671858 0x96874a8 0xa621861 0xa6557d1 0xa7012e0 0xa702d9c 0xa40b1d2 0xa40bc05 0x25a0bac 0x2d58f 0x2d648 0x238b8f4
Reactor stalled for 136146 ms on shard 0. Backtrace: 0x53bf0 0xa77ddc5 0xa63c101 0xa63efb9 0xa63f1cc 0xa63f45c 0xa63f749 0x42abf 0x5ba0907 0x42abf 0x2cc0b 0x2cf1d 0x299db 0xbb154 0x23c1ce7 0x23c1ebb 0x2401fe5 0x240335c 0x24036d4 0x5ba76b1 0x5ba7bc4 0x5ba825a 0x5ba0154 0x5ba04bb 0x5ba095a 0x42abf 0x9c2a1d0 0x9b02939 0x9b9fb6c 0x9b9d4bf 0x9b9d6d0 0x9b9da2a 0x95ff3ac 0x964e3e3 0x964e6db 0x96525bd 0x9652794 0x9652d41 0x96532d8 0x96534c4 0x96539ae 0x9653f63 0x966b5bb 0x9671858 0x96874a8 0xa621861 0xa6557d1 0xa7012e0 0xa702d9c 0xa40b1d2 0xa40bc05 0x25a0bac 0x2d58f 0x2d648 0x238b8f4
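
Note on the UBSan reports above: both flagged addresses (0x61000000878b and 0x62500002b04b) sit 3 bytes past a 4-byte boundary, so binding a reference to a 4-byte-aligned type there is undefined behaviour. The following is a minimal sketch, not taken from the Ceph source, of reading and writing such a field through std::memcpy instead of a reference. The names device_type_sketch_t, is_aligned, load_unaligned and store_unaligned are hypothetical and only illustrate the technique; whether these warnings are related to the later null-pointer crash is not established here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a 4-byte enum; NOT the real seastore device_type_t.
enum class device_type_sketch_t : std::uint32_t { NONE = 0, SEGMENTED = 1 };

// True when addr is a multiple of align (align must be a power of two).
constexpr bool is_aligned(std::uintptr_t addr, std::size_t align) {
  return (addr & (align - 1)) == 0;
}

// Read a T from a possibly misaligned byte position without forming a T& to it.
template <typename T>
T load_unaligned(const unsigned char* p) {
  static_assert(std::is_trivially_copyable_v<T>, "memcpy requires a trivially copyable type");
  T out;
  std::memcpy(&out, p, sizeof(T));
  return out;
}

// Write a T to a possibly misaligned byte position.
template <typename T>
void store_unaligned(unsigned char* p, const T& value) {
  static_assert(std::is_trivially_copyable_v<T>, "memcpy requires a trivially copyable type");
  std::memcpy(p, &value, sizeof(T));
}
```

For example, store_unaligned(buf + 3, device_type_sketch_t::SEGMENTED) followed by load_unaligned<device_type_sketch_t>(buf + 3) round-trips the value at an odd offset without triggering the alignment check.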

Command to reproduce (reproduces every time for me):

MDS=0 MGR=1 OSD=3 MON=1 ../src/vstart.sh --without-dashboard -X --crimson --redirect-output --debug -n --no-restart --seastore

Last few log lines:

DEBUG 2022-09-13 23:09:41,280 [shard 0] seastore - SeaStore::create_new_collection: meta
DEBUG 2022-09-13 23:09:41,280 [shard 0] seastore - SeaStore::read: oid #-1:7b3f43c4:::osd_superblock:0# offset 0 len 0
DEBUG 2022-09-13 23:09:41,280 [shard 0] seastore_t - 0x616000022280 Cache::create_transaction: created name=read_obj, source=READ, is_weak=false
DEBUG 2022-09-13 23:09:41,281 [shard 0] seastore_cache - 0x616000022280 Cache::get_root: root not on t -- CachedExtent(addr=0x61a000015c80, type=ROOT, version=0, dirty_from_or_retired_at=jseq(sseq(1), paddr<Seg[Dev(0),6],32768>), modify_time=tp(2022-09-13 23:09:40), paddr=paddr<Dev(ROOT),0>, length=0, state=DIRTY, last_committed_crc=0, refcount=3, user_hint=Hint(NULL), reclaim_gen=NULL_GEN)
DEBUG 2022-09-13 23:09:41,282 [shard 0] seastore_cache - Cache::get_extent: ONODE_BLOCK_STAGED paddr<Seg[Dev(0),6],8192>~16384 is absent, add extent and reading ... -- CachedExtent(addr=0x61100000d880, type=ONODE_BLOCK_STAGED, version=0, dirty_from_or_retired_at=JOURNAL_SEQ_NULL, modify_time=tp(NULL), paddr=paddr<Seg[Dev(0),6],8192>, length=16384, state=CLEAN_PENDING, last_committed_crc=0, refcount=1, user_hint=Hint(NULL), reclaim_gen=NULL_GEN)
DEBUG 2022-09-13 23:09:41,282 [shard 0] seastore_lba - add_pin: parent has ref 0x61300000c1a0
DEBUG 2022-09-13 23:09:41,282 [shard 0] seastore_device - BlockSegmentManager::read: Seg[Dev(0),6] offset=8192~16384 poffset=402669568 ...
DEBUG 2022-09-13 23:09:41,289 [shard 0] seastore_journal - SegmentedJournal::do_submit_record: H0x616000021eb0 finish with record_locator_t(block_base=paddr<Seg[Dev(0),12],8192>, write_result_t(start=jseq(sseq(2), paddr<Seg[Dev(0),12],4096>), length=4096))
DEBUG 2022-09-13 23:09:41,289 [shard 0] seastore_t - 0x616000021c80 TransactionManager::submit_transaction_direct: committed with record_locator_t(block_base=paddr<Seg[Dev(0),12],8192>, write_result_t(start=jseq(sseq(2), paddr<Seg[Dev(0),12],4096>), length=4096))
DEBUG 2022-09-13 23:09:41,289 [shard 0] seastore_cleaner - JournalTrimmerImpl::set_journal_head: journal_head jseq(sseq(2), paddr<Seg[Dev(0),12],4096>) => jseq(sseq(2), paddr<Seg[Dev(0),12],4096>), JournalTrimmer(should_block_on_trim=0, should_(trim_dirty=1, trim_alloc=1))
DEBUG 2022-09-13 23:09:41,289 [shard 0] seastore_cache - 0x616000021c80 Cache::complete_commit: commit extent done, become dirty -- CachedExtent(addr=0x61300000d940, type=BACKREF_LEAF, version=1, dirty_from_or_retired_at=jseq(sseq(2), paddr<Seg[Dev(0),12],4096>), modify_time=tp(2022-09-13 23:09:41), paddr=paddr<Seg[Dev(0),0],12288>, length=4096, state=DIRTY, last_committed_crc=977753860, refcount=2, user_hint=Hint(REWRITE), reclaim_gen=GEN(1), size=1, meta=btree_node_meta_t(begin=paddr<MIN>, end=paddr<NULL>, depth=1))
DEBUG 2022-09-13 23:09:41,289 [shard 0] seastore_cache - 0x616000021c80 Cache::complete_commit: total existing blocks num: 0, exist clean num: 0, exist mutation pending num: 0
DEBUG 2022-09-13 23:09:41,289 [shard 0] seastore_lba - 0x616000021c80 BtreeLBAManager::complete_transaction: start
DEBUG 2022-09-13 23:09:41,289 [shard 0] seastore_backref - 0x616000021c80 BtreeBackrefManager::complete_transaction: start
DEBUG 2022-09-13 23:09:41,289 [shard 0] seastore_cache - Cache::get_oldest_backref_dirty_from: backref_oldest: jseq(sseq(1), paddr<Seg[Dev(0),6],4096>)
DEBUG 2022-09-13 23:09:41,289 [shard 0] seastore_cache - Cache::get_oldest_dirty_from: dirty_oldest: jseq(sseq(1), paddr<Seg[Dev(0),6],4096>)
INFO  2022-09-13 23:09:41,289 [shard 0] seastore_cleaner - JournalTrimmerImpl::update_journal_tails: journal_dirty_tail JOURNAL_SEQ_MIN => jseq(sseq(1), paddr<Seg[Dev(0),6],4096>), JournalTrimmer(should_block_on_trim=0, should_(trim_dirty=0, trim_alloc=1))
INFO  2022-09-13 23:09:41,289 [shard 0] seastore_cleaner - JournalTrimmerImpl::update_journal_tails: journal_alloc_tail JOURNAL_SEQ_MIN => jseq(sseq(1), paddr<Seg[Dev(0),6],4096>), JournalTrimmer(should_block_on_trim=0, should_(trim_dirty=0, trim_alloc=0))
DEBUG 2022-09-13 23:09:41,289 [shard 0] seastore_cache - Cache::read_extent: read extent done -- CachedExtent(addr=0x61100000d880, type=ONODE_BLOCK_STAGED, version=0, dirty_from_or_retired_at=JOURNAL_SEQ_NULL, modify_time=tp(NULL), paddr=paddr<Seg[Dev(0),6],8192>, length=16384, state=CLEAN, last_committed_crc=2983097757, refcount=3, user_hint=Hint(NULL), reclaim_gen=NULL_GEN, laddr=0, pin=LBAPin(0~16384->paddr<Seg[Dev(0),6],8192>, fltree_header=headerL0(is_level_tail=1, level=0))
DEBUG 2022-09-13 23:09:41,289 [shard 0] seastore_cleaner - JournalTrimmerImpl::trim_alloc: finish, alloc_tail=jseq(sseq(1), paddr<Seg[Dev(0),6],4096>)

Backtrace:

0x0000000000053bf0
??:0
0x000000000a77ddc5
/home/sam/git-checkouts/ceph2/src/seastar/include/seastar/util/backtrace.hh:59
0x000000000a63c101
/home/sam/git-checkouts/ceph2/src/seastar/src/core/reactor.cc:772
0x000000000a63efb9
/home/sam/git-checkouts/ceph2/src/seastar/src/core/reactor.cc:1366
0x000000000a63f1cc
/home/sam/git-checkouts/ceph2/src/seastar/src/core/reactor.cc:1108
0x000000000a63f45c
/home/sam/git-checkouts/ceph2/src/seastar/src/core/reactor.cc:1125
0x000000000a63f749
/home/sam/git-checkouts/ceph2/src/seastar/src/core/reactor.cc:1349
0x0000000000042abf
??:0
0x0000000005ba0907
/home/sam/git-checkouts/ceph2/src/crimson/common/fatal_signal.cc:172
0x0000000000042abf
??:0
0x000000000002cc0b
??:0
0x000000000002cf1d
??:0
0x00000000000299db
??:0
0x00000000000bb154
??:0
0x00000000023c1ce7
/usr/include/c++/12/bits/new_allocator.h:137
0x00000000023c1ebb
/usr/include/c++/12/bits/allocator.h:188
0x0000000002401fe5
/usr/include/c++/12/bits/basic_string.tcc:328
0x000000000240335c
/usr/include/c++/12/bits/basic_string.tcc:420
0x00000000024036d4
/usr/include/c++/12/bits/basic_string.h:1422
0x0000000005ba76b1
/usr/include/c++/12/bits/basic_string.h:1388 (discriminator 2)
0x0000000005ba7bc4
/home/sam/git-checkouts/ceph2/build/boost/include/boost/stacktrace/stacktrace.hpp:404
0x0000000005ba825a
/home/sam/git-checkouts/ceph2/build/boost/include/boost/stacktrace/stacktrace.hpp:410
0x0000000005ba0154
/home/sam/git-checkouts/ceph2/src/crimson/common/fatal_signal.cc:95
0x0000000005ba04bb
/home/sam/git-checkouts/ceph2/src/crimson/common/fatal_signal.cc:161
0x0000000005ba095a
/home/sam/git-checkouts/ceph2/src/crimson/common/fatal_signal.cc:61
0x0000000000042abf
??:0
0x0000000009c2a1d0
/home/sam/git-checkouts/ceph2/src/crimson/os/seastore/onode_manager/staged-fltree/node.cc:1964
0x0000000009b02939
/home/sam/git-checkouts/ceph2/src/crimson/os/seastore/onode_manager/staged-fltree/node.cc:317
0x0000000009b9fb6c
/usr/include/c++/12/bits/invoke.h:61
0x0000000009b9d4bf
/home/sam/git-checkouts/ceph2/src/crimson/common/interruptible_future.h:1525
0x0000000009b9d6d0
/home/sam/git-checkouts/ceph2/src/seastar/include/seastar/core/future.hh:2172
0x0000000009b9da2a
/home/sam/git-checkouts/ceph2/src/seastar/include/seastar/core/do_with.hh:131
0x00000000095ff3ac
/home/sam/git-checkouts/ceph2/src/crimson/os/seastore/onode_manager/staged-fltree/tree.h:204
0x000000000964e3e3
/home/sam/git-checkouts/ceph2/src/crimson/common/interruptible_future.h:1525
0x000000000964e6db
/home/sam/git-checkouts/ceph2/src/seastar/include/seastar/core/future.hh:2172
0x00000000096525bd
/home/sam/git-checkouts/ceph2/src/crimson/common/interruptible_future.h:188
0x0000000009652794
/home/sam/git-checkouts/ceph2/src/crimson/common/interruptible_future.h:253
...
#1

Updated by Samuel Just over 1 year ago

  • Description updated (diff)
#3

Updated by Samuel Just over 1 year ago

  • Status changed from New to Resolved