Bug #64562

open

Occasional segmentation faults in ScrubQueue::collect_ripe_jobs

Added by Pablo Higueras 2 months ago. Updated 2 months ago.

Status:
New
Priority:
Normal
Category:
Scrub/Repair
Target version:
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hello!

Igor Fedotov suggested that I open a new ticket under the RADOS subproject; the original ticket is [[https://tracker.ceph.com/issues/64444]]

For over a year now we have been suffering from high start-up times on our OSDs (more than an hour).
It does not happen on every start, but it is frequent.
Here is an excerpt from the OSD log when it happens:

2024-02-12T10:32:22.378+0000 7f3f0e837540  1 freelist init
2024-02-12T10:32:22.378+0000 7f3f0e837540  1 freelist _read_cfg
2024-02-12T10:32:22.498+0000 7f3f0e837540 -1 bluestore::NCB::__restore_allocator::No Valid allocation info on disk (empty file)
2024-02-12T10:32:22.498+0000 7f3f0e837540  0 bluestore(/var/lib/ceph/osd/ceph-54) _init_alloc::NCB::restore_allocator() failed! Run Full Recovery from ONodes (might take a while) ...
2024-02-12T11:55:42.607+0000 7f3f0e837540  1 HybridAllocator _spillover_range constructing fallback allocator
2024-02-12T11:56:00.555+0000 7f3f0e837540  1 bluestore::NCB::read_allocation_from_drive_on_startup::::Allocation Recovery was completed in 5018.058711 seconds, extent_count=585636576
2024-02-12T11:56:00.555+0000 7f3f0e837540  1 bluestore(/var/lib/ceph/osd/ceph-54) _init_alloc loaded 0 B in 0 extents, allocator type hybrid, capacity 0x6fc7c000000, block size 0x1000, free 0x30e6e1f6000, fragmentation 0.45
2024-02-12T11:56:00.559+0000 7f3f0e837540  4 rocksdb: [db/db_impl/db_impl.cc:446] Shutdown: canceling all background work
2024-02-12T11:56:00.807+0000 7f3f0e837540  4 rocksdb: [db/db_impl/db_impl.cc:625] Shutdown complete
2024-02-12T11:56:00.895+0000 7f3f0e837540  1 bluefs umount
2024-02-12T11:56:00.911+0000 7f3f0e837540  1 bdev(0x55f93a1b9800 /var/lib/ceph/osd/ceph-54/block) close
2024-02-12T11:56:01.131+0000 7f3f0e837540  1 bdev(0x55f93a1b9800 /var/lib/ceph/osd/ceph-54/block) open path /var/lib/ceph/osd/ceph-54/block
2024-02-12T11:56:01.135+0000 7f3f0e837540  1 bdev(0x55f93a1b9800 /var/lib/ceph/osd/ceph-54/block) open size 7681481900032 (0x6fc7c000000, 7.0 TiB) block_size 4096 (4 KiB) non-rotational device, discard supported
2024-02-12T11:56:01.135+0000 7f3f0e837540  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-54/block size 7.0 TiB
2024-02-12T11:56:01.135+0000 7f3f0e837540  1 bluefs mount
2024-02-12T11:56:01.135+0000 7f3f0e837540  1 bluefs _init_alloc shared, id 1, capacity 0x6fc7c000000, block size 0x10000
2024-02-12T11:56:04.311+0000 7f3f0e837540  1 bluefs mount shared_bdev_used = 1407900450816
2024-02-12T11:56:04.311+0000 7f3f0e837540  1 bluestore(/var/lib/ceph/osd/ceph-54) _prepare_db_environment set db_paths to db,7297407805030 db.slow,7297407805030
2024-02-12T11:56:04.427+0000 7f3f0e837540  4 rocksdb: RocksDB version: 6.15.5
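
For clarity, this is our reading of what the log above shows, as a minimal sketch (the function names below are our own stand-ins, not the actual BlueStore code): on mount the OSD first tries to restore the allocator state saved at the previous clean shutdown, and only when that fails does it fall back to the slow full recovery that scans every onode.

// Minimal sketch of the start-up path we infer from the log above.
// restore_allocator_from_file() and rebuild_allocations_from_onodes()
// are hypothetical stand-ins, not the real BlueStore functions.
#include <iostream>

bool restore_allocator_from_file() {
  // corresponds to "No Valid allocation info on disk (empty file)" in our log
  return false;
}

void rebuild_allocations_from_onodes() {
  // corresponds to the full recovery that took ~5018 seconds on this OSD
}

int main() {
  if (!restore_allocator_from_file()) {
    std::cout << "restore failed, running full recovery from onodes\n";
    rebuild_allocations_from_onodes();
  }
  return 0;
}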

We see this slow start-up after an OSD crash; as far as we understand, the allocation map is only persisted on a clean shutdown, so after a crash the OSD has to rebuild it from the onodes. Here are the logs from the crash:

** Compaction Stats [P] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0    0.00 KB   0.0      0.0     0.0      0.0       6.1      6.1       0.0   1.0      0.0     76.2     81.96             77.32       288    0.285       0      0
  L1      2/0   87.51 MB   0.3     28.7     6.1     22.6      22.6     -0.0       0.0   3.7    149.5    117.7    196.25            127.45       263    0.746    145M    72M
 Sum      2/0   87.51 MB   0.0     28.7     6.1     22.6      28.7      6.1       0.0   4.7    105.5    105.5 
2024-02-12T10:31:40.566+0000 7f042dd91700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f042dd91700 thread_name:safe_timer

 ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
 1: /lib64/libpthread.so.0(+0x12cf0) [0x7f043b1b4cf0]
 2: /usr/bin/ceph-osd(+0x90f29b) [0x55b23d30029b]
 3: /usr/bin/ceph-osd(+0x90f346) [0x55b23d300346]
 4: (ScrubQueue::collect_ripe_jobs(std::vector<boost::intrusive_ptr<ScrubQueue::ScrubJob>, std::allocator<boost::intrusive_ptr<ScrubQueue::ScrubJob> > >&, utime_t)+0x193) [0x55b23d301e03]
 5: (ScrubQueue::select_pg_and_scrub(Scrub::ScrubPreconds&)+0x11d) [0x55b23d305a5d]
 6: (OSD::sched_scrub()+0x21f) [0x55b23d08669f]
 7: (OSD::tick_without_osd_lock()+0x578) [0x55b23d0bb928]
 8: (Context::complete(int)+0xd) [0x55b23d0d6ebd]
 9: (CommonSafeTimer<std::mutex>::timer_thread()+0x12f) [0x55b23d80553f]
 10: (CommonSafeTimerThread<std::mutex>::entry()+0x11) [0x55b23d806631]
 11: /lib64/libpthread.so.0(+0x81ca) [0x7f043b1aa1ca]
 12: clone()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
 -9999> 2024-02-12T10:31:30.034+0000 7f0418d67700  1 get compressor zstd = 0x55b29e5a80c0
 -9998> 2024-02-12T10:31:30.034+0000 7f0418d67700  1 get compressor zstd = 0x55b29e5a80c0
 -9997> 2024-02-12T10:31:30.034+0000 7f0419d69700  2 osd.54 pg_epoch: 82022 pg[6.349( v 82022'155550050 (81531'155544902,82022'155550050] local-lis/les=81363/81364 n=10342068 ec=43333/180 lis/c=81363/81363 les/c/f=81364/81364/79096 sis=81363) [54,43,14] r=0 lpr=81363 crt=82022'155550050 lcod 82022'155550049 mlcod 82022'155550049 active+clean+scrubbing [ 6.349:  ] ] scrubber <WaitReplicas/>: scrub_compare_maps replica 14 has 25 items
 -9996> 2024-02-12T10:31:30.034+0000 7f0419d69700  2 osd.54 pg_epoch: 82022 pg[6.349( v 82022'155550050 (81531'155544902,82022'155550050] local-lis/les=81363/81364 n=10342068 ec=43333/180 lis/c=81363/81363 les/c/f=81364/81364/79096 sis=81363) [54,43,14] r=0 lpr=81363 crt=82022'155550050 lcod 82022'155550049 mlcod 82022'155550049 active+clean+scrubbing [ 6.349:  ] ] scrubber <WaitReplicas/>: scrub_compare_maps replica 43 has 25 items

There is not much information about this problem. We have been looking into [[https://tracker.ceph.com/issues/48276]], but it does not seem to apply to our case.
These crashes happen sporadically and we do not know their root cause. Perhaps you can guide us on how to track the problem down; we would appreciate any kind of help.
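
To illustrate where the fault seems to land, here is a minimal sketch of the kind of sweep we understand ScrubQueue::collect_ripe_jobs to perform (the types and names below are our own guess for illustration, not the actual Ceph code); a null or already-freed entry in the job container would segfault in exactly such a loop.

// Illustrative sketch only -- not the actual Ceph implementation.
// It assumes the scrub queue holds boost::intrusive_ptr<ScrubJob> entries
// and that collect_ripe_jobs() copies out every job whose scheduled time
// has passed.
#include <boost/intrusive_ptr.hpp>
#include <vector>

struct ScrubJob {
  int ref = 0;                // refcount used by the intrusive_ptr
  double scheduled_at = 0.0;  // stand-in for utime_t
  bool registered = true;
};

inline void intrusive_ptr_add_ref(ScrubJob* j) { ++j->ref; }
inline void intrusive_ptr_release(ScrubJob* j) { if (--j->ref == 0) delete j; }

using ScrubJobRef = boost::intrusive_ptr<ScrubJob>;

// Hypothetical sweep: collect every job whose deadline has passed.
std::vector<ScrubJobRef> collect_ripe_jobs_sketch(std::vector<ScrubJobRef>& queue,
                                                  double now)
{
  std::vector<ScrubJobRef> ripe;
  for (auto& job : queue) {
    // If 'job' is null, or points at a ScrubJob already freed by another
    // thread (e.g. a PG being removed while the safe_timer tick runs),
    // this dereference is where a SIGSEGV like ours would land.
    if (job->registered && job->scheduled_at <= now) {
      ripe.push_back(job);
    }
  }
  return ripe;
}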

Thank you!

Actions #1

Updated by Igor Fedotov 2 months ago

Hi Pablo,
do you mind if I rename the ticket to something like "Occasional segmentation faults in ScrubQueue::collect_ripe_jobs"?
Just to give better visibility into the actual issue.

Actions #2

Updated by Pablo Higueras 2 months ago

Igor Fedotov wrote:

Hi Pablo,
do you mind if I rename the ticket to something like "Occasional segmentation faults in ScrubQueue::collect_ripe_jobs"?
Just to give better visibility into the actual issue.

Hi Igor, go ahead and change the title, no problem.

Actions #3

Updated by Pablo Higueras 2 months ago

No problem with the rename, Igor, thank you!

Igor Fedotov wrote:

Hi Pablo,
do you mind if I rename the ticket to something like "Occasional segmentation faults in ScrubQueue::collect_ripe_jobs"?
Just to give better visibility into the actual issue.

Actions #4

Updated by Pablo Higueras 2 months ago

.

Actions #5

Updated by Igor Fedotov 2 months ago

  • Subject changed from No Valid allocation info on disk (empty file) to Occasional segmentation faults in ScrubQueue::collect_ripe_jobs
Actions #6

Updated by Radoslaw Zarzynski 2 months ago

  • Assignee set to Ronen Friedman