Bug #23595

closed

osd: recovery/backfill is extremely slow

Added by Niklas Hambuechen about 6 years ago. Updated about 6 years ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I made a Ceph 12.2.4 (luminous stable) cluster of 3 machines with 10-Gigabit networking on Ubuntu 16.04, using pretty much only default settings.

I put CephFS on it, filling it with 6 large files (1 GB each) and 270k empty files (just `touch`ed).

I removed one OSD, wiped its data, created a new one, and added it back to the cluster.

After doing so, I observed that the recovery speed is extremely slow.

In particular, I observe quite precisely 10 objects being recovered per second:

2018-04-08 19:48:41.871692 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 258294/830889 objects degraded (31.086%), 133 pgs degraded, 133 pgs undersized (PG_DEGRADED)
2018-04-08 19:48:46.872108 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 258247/830889 objects degraded (31.081%), 133 pgs degraded, 133 pgs undersized (PG_DEGRADED)
2018-04-08 19:48:51.872489 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 258200/830889 objects degraded (31.075%), 133 pgs degraded, 133 pgs undersized (PG_DEGRADED)
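A quick back-of-the-envelope check on those three health updates (my own arithmetic, not Ceph output) confirms the rate:

```python
# The three health-check updates above are 5 s apart; the degraded-object
# counts come straight from the log lines.
samples = [
    (41, 258294),  # (seconds field of the timestamp, objects degraded)
    (46, 258247),
    (51, 258200),
]

# Objects recovered per second in each 5 s interval.
rates = [
    (n0 - n1) / (t1 - t0)
    for (t0, n0), (t1, n1) in zip(samples, samples[1:])
]
print(rates)  # [9.4, 9.4] -- just under 10 objects/s in both intervals
```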

At this speed, my recovery will take years.

What is going on here? Why does Ceph recover so slowly?



Actions #1

Updated by Niklas Hambuechen about 6 years ago

I have read https://www.spinics.net/lists/ceph-devel/msg38331.html, which suggests that some throttling is going on, but it was never established what exactly the problem is or how to fix it.
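For reference, these are the Luminous options that implement that throttling, with what I believe are the upstream 12.2.x defaults (worth double-checking against `ceph daemon osd.N config show`):

```ini
# Recovery/backfill throttles in the [osd] section (Luminous 12.2.x
# defaults, to the best of my knowledge -- verify on your own cluster).
[osd]
osd_max_backfills = 1             # concurrent backfills per OSD
osd_recovery_max_active = 3       # concurrent recovery ops per OSD
osd_recovery_sleep_hdd = 0.1      # sleep (s) between recovery ops on HDDs
osd_recovery_sleep_ssd = 0        # no sleep on SSDs
osd_recovery_sleep_hybrid = 0.025 # HDD data with SSD journal/DB
```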

As in this thread, my machines are very idle (CPU/network/SSDs).

In strace I can observe that there are large time gaps between the `pread64` syscalls that read the data for the rebalance from one of the source nodes (newlines inserted by me to make the time jumps more obvious):

root@ceph2 ~ # strace -fyp $(pidof ceph-osd) -e pread64 -ttt
strace: Process 4945 attached with 54 threads
[pid  5004] 1523210918.877879 pread64(24</dev/loop0p2>, "\0\0\0\0%\0\0\0\0\0\0\0\0\0\0\0\310<\213Z\211e~!\2\2\25\0\0\0\0\0"..., 8192, 7187480576) = 8192
... 60 more lines ...
[pid  5004] 1523210918.931971 pread64(24</dev/loop0p2>, "\0\0\0\0\0\0\0\0\7\0\0\0_layout\36\0\0\0\2\2\30\0\0\0\0\0@"..., 8192, 7188688896) = 8192
[pid  5004] 1523210918.932128 pread64(24</dev/loop0p2>, "X\0\0\0\f+\3\0\0\1\0\0\2\0\0\0\2\2\32\0\0\0\1\0\0\0\0\1\0\0\6\0"..., 8192, 7188692992) = 8192
[pid  5004] 1523210918.932289 pread64(24</dev/loop0p2>, "bigdir\256\235\7\0\0\0\0\0\6\0\0\0\0\0\0\0\0\0\0\0\7\0\0\0sn"..., 8192, 7188697088) = 8192

[pid  5004] 1523210923.314165 pread64(24</dev/loop0p2>, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 7108546560) = 8192
[pid  5004] 1523210923.315633 pread64(24</dev/loop0p2>, "\330\16-\1\0\0\0\0\0\0\0\3\2(\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 7108550656) = 8192
... 60 more lines ...
[pid  5004] 1523210923.329968 pread64(24</dev/loop0p2>, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\0028\0\0\0\0\0\0\0"..., 8192, 7108804608) = 8192
[pid  5004] 1523210923.330167 pread64(24</dev/loop0p2>, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 7108808704) = 8192

[pid  5004] 1523210924.313469 pread64(24</dev/loop0p2>, "\0\0\0\0\0\0\377\377\377\377\377\377\377\377\0\0\0\0\1\1\20\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 7080423424) = 8192
[pid  5004] 1523210924.314639 pread64(24</dev/loop0p2>, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 7080427520) = 8192
... 1600 more lines ...
[pid  5004] 1523210924.748849 pread64(24</dev/loop0p2>, "\0>\224\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0T\201\0\0\0\0\0"..., 8192, 7078330368) = 8192
[pid  5004] 1523210924.749169 pread64(24</dev/loop0p2>, "\0\1\0\0\0\0\0\0\0\0\0\0\0\0\2\2\30\0\0\0\0\0@\0\1\0\0\0\0\0@\0"..., 8192, 7078334464) = 8192

[pid  5005] 1523210926.867356 pread64(24</dev/loop0p2>, "\377\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\7\0\0\0_layout"..., 8192, 6902288384) = 8192
[pid  5005] 1523210926.868117 pread64(24</dev/loop0p2>, "\0\7\0\0\0_parent^\0\0\0\5\4X\0\0\0\322\303\2\0\0\1\0\0\2\0"..., 8192, 6902292480) = 8192
... 300 more lines ...
[pid  5005] 1523210926.921343 pread64(24</dev/loop0p2>, "\0\0N\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\2\2\25\0\0\0\2\0\0\0\0\0\0\0"..., 8192, 6903517184) = 8192
[pid  5005] 1523210926.921496 pread64(24</dev/loop0p2>, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 6903521280) = 8192

[pid  5005] 1523210929.700298 pread64(24</dev/loop0p2>, "roF\17\0\0\0\0\0\6\0\0\0\0\0\0\0\0\0\0\0\7\0\0\0snapset"..., 8192, 7128502272) = 8192
[pid  5005] 1523210929.701314 pread64(24</dev/loop0p2>, "\0\0\0\0\0\0\0\0\0\0\2\0\2\0\0\0\2\0\r2\206\4\207\205!1000003"..., 8192, 7128506368) = 8192
... 300 more lines ...
[pid  5005] 1523210929.755438 pread64(24</dev/loop0p2>, "\0\0004\0\0\0\355@\213Z\354\216\331\"\377\377\377\377\377\377\377\377\0\0\0\0\0\0\0\0\0\0"..., 8192, 7129739264) = 8192
[pid  5005] 1523210929.755565 pread64(24</dev/loop0p2>, "@\0\1\0\0\0\0\0@\0\6\0\0\0\0\0\0\0\0\0\0\0\7\0\0\0_paren"..., 8192, 7129743360) = 8192

In between, there are lots of `futex()` calls.
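Those gaps would be consistent with a per-object throttle sleep. A toy model (my assumption about the mechanism, not Ceph code): if recovery is effectively serialized and each object pays a fixed sleep, throughput is capped at 1/sleep no matter how idle the hardware is:

```python
def max_recovery_rate(sleep_per_object_s, op_time_s=0.0):
    """Objects per second when recovery ops run one after another,
    each followed by a fixed throttle sleep (toy model, not Ceph code)."""
    return 1.0 / (sleep_per_object_s + op_time_s)

# osd_recovery_sleep_hdd defaults to 0.1 s in Luminous, which would cap
# recovery at almost exactly the 10 objects/s observed above:
print(max_recovery_rate(0.1))  # 10.0
```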

GDB stack traces:

(gdb) info threads
  Id   Target Id         Frame 
* 1    Thread 0x7f52594dfe00 (LWP 7659) "ceph-osd" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  2    Thread 0x7f5255683700 (LWP 7663) "log" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  3    Thread 0x7f5254642700 (LWP 7664) "msgr-worker-0" 0x00007f5256a2ea13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
  4    Thread 0x7f5253e41700 (LWP 7665) "msgr-worker-1" 0x00007f5256a2ea13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
  5    Thread 0x7f5253640700 (LWP 7666) "msgr-worker-2" 0x00007f5256a2ea13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
  6    Thread 0x7f52526a1700 (LWP 7667) "service" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  7    Thread 0x7f5251ea0700 (LWP 7668) "admin_socket" 0x00007f5256a2274d in poll () at ../sysdeps/unix/syscall-template.S:84
  8    Thread 0x7f5250e52700 (LWP 7673) "ceph-osd" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  9    Thread 0x7f5250651700 (LWP 7674) "safe_timer" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  10   Thread 0x7f524fe50700 (LWP 7675) "safe_timer" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  11   Thread 0x7f524f64f700 (LWP 7676) "safe_timer" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  12   Thread 0x7f524ee4e700 (LWP 7677) "safe_timer" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  13   Thread 0x7f524e64d700 (LWP 7678) "bstore_aio" 0x00007f5258a4164a in ?? () from target:/lib/x86_64-linux-gnu/libaio.so.1
  14   Thread 0x7f524de4c700 (LWP 7679) "bstore_aio" 0x00007f5258a4164a in ?? () from target:/lib/x86_64-linux-gnu/libaio.so.1
  15   Thread 0x7f5245e3c700 (LWP 7697) "dfin" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  16   Thread 0x7f524663d700 (LWP 7698) "finisher" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  17   Thread 0x7f5246e3e700 (LWP 7699) "bstore_kv_sync" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  18   Thread 0x7f524763f700 (LWP 7700) "bstore_kv_final" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  19   Thread 0x7f524d64b700 (LWP 7701) "bstore_mempool" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  20   Thread 0x7f524aec0700 (LWP 7702) "ms_dispatch" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  21   Thread 0x7f524a6bf700 (LWP 7703) "ms_local" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  22   Thread 0x7f5249ebe700 (LWP 7704) "ms_dispatch" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  23   Thread 0x7f52496bd700 (LWP 7705) "ms_local" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  24   Thread 0x7f5248ebc700 (LWP 7706) "ms_dispatch" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  25   Thread 0x7f52486bb700 (LWP 7707) "ms_local" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  26   Thread 0x7f5247eba700 (LWP 7708) "ms_dispatch" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  27   Thread 0x7f524563b700 (LWP 7709) "ms_local" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  28   Thread 0x7f5244e3a700 (LWP 7710) "ms_dispatch" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  29   Thread 0x7f5244639700 (LWP 7711) "ms_local" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  30   Thread 0x7f5243e38700 (LWP 7712) "ms_dispatch" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  31   Thread 0x7f5243637700 (LWP 7713) "ms_local" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  32   Thread 0x7f5242e36700 (LWP 7714) "ms_dispatch" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  33   Thread 0x7f5242635700 (LWP 7715) "ms_local" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  34   Thread 0x7f5241e34700 (LWP 7716) "safe_timer" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  35   Thread 0x7f5241633700 (LWP 7717) "fn_anonymous" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  36   Thread 0x7f5240e32700 (LWP 7718) "safe_timer" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  37   Thread 0x7f5240631700 (LWP 7719) "tp_peering" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  38   Thread 0x7f523fe30700 (LWP 7720) "tp_peering" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  39   Thread 0x7f523f62f700 (LWP 7721) "tp_osd_tp" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  40   Thread 0x7f523ee2e700 (LWP 7722) "tp_osd_tp" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  41   Thread 0x7f523e62d700 (LWP 7723) "tp_osd_tp" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  42   Thread 0x7f523de2c700 (LWP 7724) "tp_osd_tp" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  43   Thread 0x7f523d62b700 (LWP 7725) "tp_osd_tp" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  44   Thread 0x7f523ce2a700 (LWP 7726) "tp_osd_disk" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  45   Thread 0x7f523c629700 (LWP 7727) "tp_osd_cmd" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  46   Thread 0x7f523be28700 (LWP 7728) "osd_srv_heartbt" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  47   Thread 0x7f523b627700 (LWP 7729) "fn_anonymous" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  48   Thread 0x7f523ae26700 (LWP 7730) "fn_anonymous" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  49   Thread 0x7f523a625700 (LWP 7731) "safe_timer" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
  50   Thread 0x7f5239e24700 (LWP 7732) "safe_timer" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  51   Thread 0x7f5239623700 (LWP 7733) "safe_timer" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  52   Thread 0x7f5238e22700 (LWP 7734) "safe_timer" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  53   Thread 0x7f5238621700 (LWP 7735) "osd_srv_agent" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  54   Thread 0x7f5237e20700 (LWP 7736) "signal_handler" 0x00007f5256a2274d in poll () at ../sysdeps/unix/syscall-template.S:84
(gdb) thread apply all bt

Thread 54 (Thread 0x7f5237e20700 (LWP 7736)):
#0  0x00007f5256a2274d in poll () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007f5256a400be in __poll_chk (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>, fdslen=<optimized out>) at poll_chk.c:27
#2  0x0000561eb39b9631 in poll (__timeout=-1, __nfds=4, __fds=0x7f5237e1e5f0) at /usr/include/x86_64-linux-gnu/bits/poll2.h:41
#3  SignalHandler::entry (this=0x561ed1788840) at /build/ceph-12.2.4/src/global/signal_handler.cc:281
#4  0x00007f52579b76ba in start_thread (arg=0x7f5237e20700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 53 (Thread 0x7f5238621700 (LWP 7735)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3442a52 in Cond::Wait (this=0x561ebed5c600, mutex=...) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  0x0000561eb33c6d42 in OSDService::agent_entry (this=0x561ebed5bdc8) at /build/ceph-12.2.4/src/osd/OSD.cc:629
#3  0x0000561eb3449e0d in OSDService::AgentThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/osd/OSD.h:665
#4  0x00007f52579b76ba in start_thread (arg=0x7f5238621700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 52 (Thread 0x7f5238e22700 (LWP 7734)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb39f7c55 in Cond::Wait (mutex=..., this=0x561ebed5d460) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  SafeTimer::timer_thread (this=0x561ebed5d450) at /build/ceph-12.2.4/src/common/Timer.cc:108
#3  0x0000561eb39f8d9d in SafeTimerThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/Timer.cc:30
#4  0x00007f52579b76ba in start_thread (arg=0x7f5238e22700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 51 (Thread 0x7f5239623700 (LWP 7733)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb39f7c55 in Cond::Wait (mutex=..., this=0x561ebed5d328) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  SafeTimer::timer_thread (this=0x561ebed5d318) at /build/ceph-12.2.4/src/common/Timer.cc:108
#3  0x0000561eb39f8d9d in SafeTimerThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/Timer.cc:30
#4  0x00007f52579b76ba in start_thread (arg=0x7f5239623700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 50 (Thread 0x7f5239e24700 (LWP 7732)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb39f7c55 in Cond::Wait (mutex=..., this=0x561ebed5c778) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  SafeTimer::timer_thread (this=0x561ebed5c768) at /build/ceph-12.2.4/src/common/Timer.cc:108
#3  0x0000561eb39f8d9d in SafeTimerThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/Timer.cc:30
#4  0x00007f52579b76ba in start_thread (arg=0x7f5239e24700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 49 (Thread 0x7f523a625700 (LWP 7731)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb39f801f in Cond::WaitUntil (when=..., mutex=..., this=0x561ebed5ca78) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  SafeTimer::timer_thread (this=0x561ebed5ca68) at /build/ceph-12.2.4/src/common/Timer.cc:110
#3  0x0000561eb39f8d9d in SafeTimerThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/Timer.cc:30
#4  0x00007f52579b76ba in start_thread (arg=0x7f523a625700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 48 (Thread 0x7f523ae26700 (LWP 7730)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb39fa479 in Cond::Wait (mutex=..., this=0x561ebed5c8e8) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  Finisher::finisher_thread_entry (this=0x561ebed5c870) at /build/ceph-12.2.4/src/common/Finisher.cc:101
#3  0x00007f52579b76ba in start_thread (arg=0x7f523ae26700) at pthread_create.c:333
#4  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 47 (Thread 0x7f523b627700 (LWP 7729)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb39fa479 in Cond::Wait (mutex=..., this=0x561ebed5ce38) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  Finisher::finisher_thread_entry (this=0x561ebed5cdc0) at /build/ceph-12.2.4/src/common/Finisher.cc:101
#3  0x00007f52579b76ba in start_thread (arg=0x7f523b627700) at pthread_create.c:333
#4  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 46 (Thread 0x7f523be28700 (LWP 7728)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb33e8b98 in Cond::WaitUntil (when=..., mutex=..., this=0x561ebed5b340) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  Cond::WaitInterval (interval=..., mutex=..., this=0x561ebed5b340) at /build/ceph-12.2.4/src/common/Cond.h:73
#3  OSD::heartbeat_entry (this=0x561ebed5a000) at /build/ceph-12.2.4/src/osd/OSD.cc:5087
#4  0x0000561eb3467e5d in OSD::T_Heartbeat::entry (this=<optimized out>) at /build/ceph-12.2.4/src/osd/OSD.h:1536
#5  0x00007f52579b76ba in start_thread (arg=0x7f523be28700) at pthread_create.c:333
#6  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 45 (Thread 0x7f523c629700 (LWP 7727)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb3a021db in Cond::WaitUntil (when=..., mutex=..., this=0x561ebed5b050) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  Cond::WaitInterval (interval=..., mutex=..., this=0x561ebed5b050) at /build/ceph-12.2.4/src/common/Cond.h:73
#3  ThreadPool::worker (this=0x561ebed5af70, wt=0x561ed10fb1d0) at /build/ceph-12.2.4/src/common/WorkQueue.cc:143
#4  0x0000561eb3a03880 in ThreadPool::WorkThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/WorkQueue.h:448
#5  0x00007f52579b76ba in start_thread (arg=0x7f523c629700) at pthread_create.c:333
#6  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 44 (Thread 0x7f523ce2a700 (LWP 7726)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb3a021db in Cond::WaitUntil (when=..., mutex=..., this=0x561ebed5ae38) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  Cond::WaitInterval (interval=..., mutex=..., this=0x561ebed5ae38) at /build/ceph-12.2.4/src/common/Cond.h:73
#3  ThreadPool::worker (this=0x561ebed5ad58, wt=0x561ed10fb170) at /build/ceph-12.2.4/src/common/WorkQueue.cc:143
#4  0x0000561eb3a03880 in ThreadPool::WorkThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/WorkQueue.h:448
#5  0x00007f52579b76ba in start_thread (arg=0x7f523ce2a700) at pthread_create.c:333
#6  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 43 (Thread 0x7f523d62b700 (LWP 7725)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb341e9ac in Cond::WaitUntil (when=..., mutex=..., this=0x561ebe8e9570) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  Cond::WaitInterval (interval=..., mutex=..., this=0x561ebe8e9570) at /build/ceph-12.2.4/src/common/Cond.h:73
#3  OSD::ShardedOpWQ::_process (this=0x561ebed5b620, thread_index=<optimized out>, hb=0x561ed17ac1e0) at /build/ceph-12.2.4/src/osd/OSD.cc:10370
#4  0x0000561eb3a00664 in ShardedThreadPool::shardedthreadpool_worker (this=0x561ebed5abd0, thread_index=4) at /build/ceph-12.2.4/src/common/WorkQueue.cc:339
#5  0x0000561eb3a036a0 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/WorkQueue.h:689
#6  0x00007f52579b76ba in start_thread (arg=0x7f523d62b700) at pthread_create.c:333
#7  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 42 (Thread 0x7f523de2c700 (LWP 7724)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb341e9ac in Cond::WaitUntil (when=..., mutex=..., this=0x561ebe8e93f0) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  Cond::WaitInterval (interval=..., mutex=..., this=0x561ebe8e93f0) at /build/ceph-12.2.4/src/common/Cond.h:73
#3  OSD::ShardedOpWQ::_process (this=0x561ebed5b620, thread_index=<optimized out>, hb=0x561ed17ac190) at /build/ceph-12.2.4/src/osd/OSD.cc:10370
#4  0x0000561eb3a00664 in ShardedThreadPool::shardedthreadpool_worker (this=0x561ebed5abd0, thread_index=3) at /build/ceph-12.2.4/src/common/WorkQueue.cc:339
#5  0x0000561eb3a036a0 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/WorkQueue.h:689
#6  0x00007f52579b76ba in start_thread (arg=0x7f523de2c700) at pthread_create.c:333
#7  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 41 (Thread 0x7f523e62d700 (LWP 7723)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb341e9ac in Cond::WaitUntil (when=..., mutex=..., this=0x561ebe8e9270) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  Cond::WaitInterval (interval=..., mutex=..., this=0x561ebe8e9270) at /build/ceph-12.2.4/src/common/Cond.h:73
#3  OSD::ShardedOpWQ::_process (this=0x561ebed5b620, thread_index=<optimized out>, hb=0x561ed17ac140) at /build/ceph-12.2.4/src/osd/OSD.cc:10370
#4  0x0000561eb3a00664 in ShardedThreadPool::shardedthreadpool_worker (this=0x561ebed5abd0, thread_index=2) at /build/ceph-12.2.4/src/common/WorkQueue.cc:339
#5  0x0000561eb3a036a0 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/WorkQueue.h:689
#6  0x00007f52579b76ba in start_thread (arg=0x7f523e62d700) at pthread_create.c:333
#7  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 40 (Thread 0x7f523ee2e700 (LWP 7722)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb341e9ac in Cond::WaitUntil (when=..., mutex=..., this=0x561ebe8e90f0) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  Cond::WaitInterval (interval=..., mutex=..., this=0x561ebe8e90f0) at /build/ceph-12.2.4/src/common/Cond.h:73
#3  OSD::ShardedOpWQ::_process (this=0x561ebed5b620, thread_index=<optimized out>, hb=0x561ed17ac0f0) at /build/ceph-12.2.4/src/osd/OSD.cc:10370
#4  0x0000561eb3a00664 in ShardedThreadPool::shardedthreadpool_worker (this=0x561ebed5abd0, thread_index=1) at /build/ceph-12.2.4/src/common/WorkQueue.cc:339
#5  0x0000561eb3a036a0 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/WorkQueue.h:689
#6  0x00007f52579b76ba in start_thread (arg=0x7f523ee2e700) at pthread_create.c:333
#7  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 39 (Thread 0x7f523f62f700 (LWP 7721)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb341e9ac in Cond::WaitUntil (when=..., mutex=..., this=0x561ebe8e8f70) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  Cond::WaitInterval (interval=..., mutex=..., this=0x561ebe8e8f70) at /build/ceph-12.2.4/src/common/Cond.h:73
#3  OSD::ShardedOpWQ::_process (this=0x561ebed5b620, thread_index=<optimized out>, hb=0x561ed17ac0a0) at /build/ceph-12.2.4/src/osd/OSD.cc:10370
#4  0x0000561eb3a00664 in ShardedThreadPool::shardedthreadpool_worker (this=0x561ebed5abd0, thread_index=0) at /build/ceph-12.2.4/src/common/WorkQueue.cc:339
#5  0x0000561eb3a036a0 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/WorkQueue.h:689
#6  0x00007f52579b76ba in start_thread (arg=0x7f523f62f700) at pthread_create.c:333
#7  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 38 (Thread 0x7f523fe30700 (LWP 7720)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb3a021db in Cond::WaitUntil (when=..., mutex=..., this=0x561ebed5aa98) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  Cond::WaitInterval (interval=..., mutex=..., this=0x561ebed5aa98) at /build/ceph-12.2.4/src/common/Cond.h:73
#3  ThreadPool::worker (this=0x561ebed5a9b8, wt=0x561ed10fb110) at /build/ceph-12.2.4/src/common/WorkQueue.cc:143
#4  0x0000561eb3a03880 in ThreadPool::WorkThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/WorkQueue.h:448
#5  0x00007f52579b76ba in start_thread (arg=0x7f523fe30700) at pthread_create.c:333
#6  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 37 (Thread 0x7f5240631700 (LWP 7719)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb3a021db in Cond::WaitUntil (when=..., mutex=..., this=0x561ebed5aa98) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  Cond::WaitInterval (interval=..., mutex=..., this=0x561ebed5aa98) at /build/ceph-12.2.4/src/common/Cond.h:73
#3  ThreadPool::worker (this=0x561ebed5a9b8, wt=0x561ed10fb0e0) at /build/ceph-12.2.4/src/common/WorkQueue.cc:143
#4  0x0000561eb3a03880 in ThreadPool::WorkThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/WorkQueue.h:448
#5  0x00007f52579b76ba in start_thread (arg=0x7f5240631700) at pthread_create.c:333
#6  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 36 (Thread 0x7f5240e32700 (LWP 7718)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb39f801f in Cond::WaitUntil (when=..., mutex=..., this=0x561ebed5a480) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  SafeTimer::timer_thread (this=0x561ebed5a470) at /build/ceph-12.2.4/src/common/Timer.cc:110
#3  0x0000561eb39f8d9d in SafeTimerThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/Timer.cc:30
#4  0x00007f52579b76ba in start_thread (arg=0x7f5240e32700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 35 (Thread 0x7f5241633700 (LWP 7717)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb39fa479 in Cond::Wait (mutex=..., this=0x7fffd64a9d70) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  Finisher::finisher_thread_entry (this=0x7fffd64a9cf8) at /build/ceph-12.2.4/src/common/Finisher.cc:101
#3  0x00007f52579b76ba in start_thread (arg=0x7f5241633700) at pthread_create.c:333
#4  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 34 (Thread 0x7f5241e34700 (LWP 7716)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb39f801f in Cond::WaitUntil (when=..., mutex=..., this=0x7fffd64a9c40) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  SafeTimer::timer_thread (this=0x7fffd64a9c30) at /build/ceph-12.2.4/src/common/Timer.cc:110
#3  0x0000561eb39f8d9d in SafeTimerThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/Timer.cc:30
#4  0x00007f52579b76ba in start_thread (arg=0x7f5241e34700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 33 (Thread 0x7f5242635700 (LWP 7715)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc5ef3 in Cond::Wait (mutex=..., this=0x561ebebad3c8) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::run_local_delivery (this=0x561ebebad180) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:113
#3  0x0000561eb3a8e4cd in DispatchQueue::LocalDeliveryThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:115
#4  0x00007f52579b76ba in start_thread (arg=0x7f5242635700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 32 (Thread 0x7f5242e36700 (LWP 7714)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc30ba in Cond::Wait (mutex=..., this=0x561ebebad200) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::entry (this=0x561ebebad180) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:208
#3  0x0000561eb3a8e3ed in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:101
#4  0x00007f52579b76ba in start_thread (arg=0x7f5242e36700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 31 (Thread 0x7f5243637700 (LWP 7713)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc5ef3 in Cond::Wait (mutex=..., this=0x561ebebac3c8) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::run_local_delivery (this=0x561ebebac180) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:113
#3  0x0000561eb3a8e4cd in DispatchQueue::LocalDeliveryThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:115
#4  0x00007f52579b76ba in start_thread (arg=0x7f5243637700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 30 (Thread 0x7f5243e38700 (LWP 7712)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc30ba in Cond::Wait (mutex=..., this=0x561ebebac200) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::entry (this=0x561ebebac180) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:208
#3  0x0000561eb3a8e3ed in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:101
#4  0x00007f52579b76ba in start_thread (arg=0x7f5243e38700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 29 (Thread 0x7f5244639700 (LWP 7711)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc5ef3 in Cond::Wait (mutex=..., this=0x561ebebacbc8) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::run_local_delivery (this=0x561ebebac980) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:113
#3  0x0000561eb3a8e4cd in DispatchQueue::LocalDeliveryThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:115
#4  0x00007f52579b76ba in start_thread (arg=0x7f5244639700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 28 (Thread 0x7f5244e3a700 (LWP 7710)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc30ba in Cond::Wait (mutex=..., this=0x561ebebaca00) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::entry (this=0x561ebebac980) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:208
#3  0x0000561eb3a8e3ed in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:101
#4  0x00007f52579b76ba in start_thread (arg=0x7f5244e3a700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 27 (Thread 0x7f524563b700 (LWP 7709)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc5ef3 in Cond::Wait (mutex=..., this=0x561ebebab3c8) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::run_local_delivery (this=0x561ebebab180) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:113
#3  0x0000561eb3a8e4cd in DispatchQueue::LocalDeliveryThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:115
#4  0x00007f52579b76ba in start_thread (arg=0x7f524563b700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 26 (Thread 0x7f5247eba700 (LWP 7708)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc30ba in Cond::Wait (mutex=..., this=0x561ebebab200) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::entry (this=0x561ebebab180) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:208
#3  0x0000561eb3a8e3ed in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:101
#4  0x00007f52579b76ba in start_thread (arg=0x7f5247eba700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 25 (Thread 0x7f52486bb700 (LWP 7707)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc5ef3 in Cond::Wait (mutex=..., this=0x561ebebabbc8) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::run_local_delivery (this=0x561ebebab980) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:113
#3  0x0000561eb3a8e4cd in DispatchQueue::LocalDeliveryThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:115
#4  0x00007f52579b76ba in start_thread (arg=0x7f52486bb700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 24 (Thread 0x7f5248ebc700 (LWP 7706)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc30ba in Cond::Wait (mutex=..., this=0x561ebebaba00) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::entry (this=0x561ebebab980) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:208
#3  0x0000561eb3a8e3ed in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:101
#4  0x00007f52579b76ba in start_thread (arg=0x7f5248ebc700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 23 (Thread 0x7f52496bd700 (LWP 7705)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc5ef3 in Cond::Wait (mutex=..., this=0x561ebebaabc8) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::run_local_delivery (this=0x561ebebaa980) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:113
#3  0x0000561eb3a8e4cd in DispatchQueue::LocalDeliveryThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:115
#4  0x00007f52579b76ba in start_thread (arg=0x7f52496bd700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 22 (Thread 0x7f5249ebe700 (LWP 7704)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc30ba in Cond::Wait (mutex=..., this=0x561ebebaaa00) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::entry (this=0x561ebebaa980) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:208
#3  0x0000561eb3a8e3ed in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:101
#4  0x00007f52579b76ba in start_thread (arg=0x7f5249ebe700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 21 (Thread 0x7f524a6bf700 (LWP 7703)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc5ef3 in Cond::Wait (mutex=..., this=0x561ebebaa3c8) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::run_local_delivery (this=0x561ebebaa180) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:113
#3  0x0000561eb3a8e4cd in DispatchQueue::LocalDeliveryThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:115
#4  0x00007f52579b76ba in start_thread (arg=0x7f524a6bf700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 20 (Thread 0x7f524aec0700 (LWP 7702)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3cc30ba in Cond::Wait (mutex=..., this=0x561ebebaa200) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  DispatchQueue::entry (this=0x561ebebaa180) at /build/ceph-12.2.4/src/msg/DispatchQueue.cc:208
#3  0x0000561eb3a8e3ed in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/msg/DispatchQueue.h:101
#4  0x00007f52579b76ba in start_thread (arg=0x7f524aec0700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 19 (Thread 0x7f524d64b700 (LWP 7701)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb383df0d in Cond::WaitUntil (when=..., mutex=..., this=0x561ebeb98bf8) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  Cond::WaitInterval (interval=..., mutex=..., this=0x561ebeb98bf8) at /build/ceph-12.2.4/src/common/Cond.h:73
#3  BlueStore::MempoolThread::entry (this=0x561ebeb98bc8) at /build/ceph-12.2.4/src/os/bluestore/BlueStore.cc:3374
#4  0x00007f52579b76ba in start_thread (arg=0x7f524d64b700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 18 (Thread 0x7f524763f700 (LWP 7700)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f52572c391c in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x0000561eb3874ac0 in BlueStore::_kv_finalize_thread (this=0x561ebeb98000) at /build/ceph-12.2.4/src/os/bluestore/BlueStore.cc:8706
#3  0x0000561eb38cd83d in BlueStore::KVFinalizeThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/os/bluestore/BlueStore.h:1790
#4  0x00007f52579b76ba in start_thread (arg=0x7f524763f700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 17 (Thread 0x7f5246e3e700 (LWP 7699)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f52572c391c in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x0000561eb389057a in BlueStore::_kv_sync_thread (this=<optimized out>) at /build/ceph-12.2.4/src/os/bluestore/BlueStore.cc:8457
#3  0x0000561eb38d308d in BlueStore::KVSyncThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/os/bluestore/BlueStore.h:1782
#4  0x00007f52579b76ba in start_thread (arg=0x7f5246e3e700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 16 (Thread 0x7f524663d700 (LWP 7698)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb39fa479 in Cond::Wait (mutex=..., this=0x561ebe8daab8) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  Finisher::finisher_thread_entry (this=0x561ebe8daa40) at /build/ceph-12.2.4/src/common/Finisher.cc:101
#3  0x00007f52579b76ba in start_thread (arg=0x7f524663d700) at pthread_create.c:333
#4  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 15 (Thread 0x7f5245e3c700 (LWP 7697)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb39fa479 in Cond::Wait (mutex=..., this=0x561ebeb98548) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  Finisher::finisher_thread_entry (this=0x561ebeb984d0) at /build/ceph-12.2.4/src/common/Finisher.cc:101
#3  0x00007f52579b76ba in start_thread (arg=0x7f5245e3c700) at pthread_create.c:333
#4  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 14 (Thread 0x7f524de4c700 (LWP 7679)):
#0  0x00007f5258a4164a in ?? () from target:/lib/x86_64-linux-gnu/libaio.so.1
#1  0x0000561eb39b1d57 in aio_queue_t::get_next_completed (this=this@entry=0x561ebe901378, timeout_ms=<optimized out>, paio=paio@entry=0x7f524de4a5d0, max=16)
    at /build/ceph-12.2.4/src/os/bluestore/aio.cc:78
#2  0x0000561eb399a3dc in KernelDevice::_aio_thread (this=0x561ebe901200) at /build/ceph-12.2.4/src/os/bluestore/KernelDevice.cc:350
#3  0x0000561eb39a09cd in KernelDevice::AioCompletionThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/os/bluestore/KernelDevice.h:49
#4  0x00007f52579b76ba in start_thread (arg=0x7f524de4c700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 13 (Thread 0x7f524e64d700 (LWP 7678)):
#0  0x00007f5258a4164a in ?? () from target:/lib/x86_64-linux-gnu/libaio.so.1
#1  0x0000561eb39b1d57 in aio_queue_t::get_next_completed (this=this@entry=0x561ebe901138, timeout_ms=<optimized out>, paio=paio@entry=0x7f524e64b5d0, max=16)
    at /build/ceph-12.2.4/src/os/bluestore/aio.cc:78
#2  0x0000561eb399a3dc in KernelDevice::_aio_thread (this=0x561ebe900fc0) at /build/ceph-12.2.4/src/os/bluestore/KernelDevice.cc:350
#3  0x0000561eb39a09cd in KernelDevice::AioCompletionThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/os/bluestore/KernelDevice.h:49
#4  0x00007f52579b76ba in start_thread (arg=0x7f524e64d700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 12 (Thread 0x7f524ee4e700 (LWP 7677)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb39f7c55 in Cond::Wait (mutex=..., this=0x561ebed5cd00) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  SafeTimer::timer_thread (this=0x561ebed5ccf0) at /build/ceph-12.2.4/src/common/Timer.cc:108
#3  0x0000561eb39f8d9d in SafeTimerThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/Timer.cc:30
#4  0x00007f52579b76ba in start_thread (arg=0x7f524ee4e700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 11 (Thread 0x7f524f64f700 (LWP 7676)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb39f7c55 in Cond::Wait (mutex=..., this=0x561ebed5cbb8) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  SafeTimer::timer_thread (this=0x561ebed5cba8) at /build/ceph-12.2.4/src/common/Timer.cc:108
#3  0x0000561eb39f8d9d in SafeTimerThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/Timer.cc:30
#4  0x00007f52579b76ba in start_thread (arg=0x7f524f64f700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 10 (Thread 0x7f524fe50700 (LWP 7675)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb39f801f in Cond::WaitUntil (when=..., mutex=..., this=0x561ebed5a1d0) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  SafeTimer::timer_thread (this=0x561ebed5a1c0) at /build/ceph-12.2.4/src/common/Timer.cc:110
#3  0x0000561eb39f8d9d in SafeTimerThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/Timer.cc:30
#4  0x00007f52579b76ba in start_thread (arg=0x7f524fe50700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 9 (Thread 0x7f5250651700 (LWP 7674)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb39f801f in Cond::WaitUntil (when=..., mutex=..., this=0x561ebed5a098) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  SafeTimer::timer_thread (this=0x561ebed5a088) at /build/ceph-12.2.4/src/common/Timer.cc:110
#3  0x0000561eb39f8d9d in SafeTimerThread::entry (this=<optimized out>) at /build/ceph-12.2.4/src/common/Timer.cc:30
#4  0x00007f52579b76ba in start_thread (arg=0x7f5250651700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 8 (Thread 0x7f5250e52700 (LWP 7673)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb345cee2 in __gthread_cond_timedwait (__abs_timeout=0x7f5250e507e0, __mutex=<optimized out>, __cond=0x561ebecaba80)
    at /usr/include/x86_64-linux-gnu/c++/5/bits/gthr-default.h:871
#2  std::condition_variable::__wait_until_impl<std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > > (__atime=<synthetic pointer>, __lock=..., 
    this=0x561ebecaba80) at /usr/include/c++/5/condition_variable:165
#3  std::condition_variable::wait_until<ceph::time_detail::mono_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > > (__atime=..., __lock=..., 
    this=0x561ebecaba80) at /usr/include/c++/5/condition_variable:118
#4  ceph::timer_detail::timer<ceph::time_detail::mono_clock>::timer_thread (this=0x561ebecaba18) at /build/ceph-12.2.4/src/common/ceph_timer.h:144
#5  0x00007f52572c8c80 in ?? () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007f52579b76ba in start_thread (arg=0x7f5250e52700) at pthread_create.c:333
#7  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 7 (Thread 0x7f5251ea0700 (LWP 7668)):
#0  0x00007f5256a2274d in poll () at ../sysdeps/unix/syscall-template.S:84
#1  0x0000561eb39e1015 in poll (__timeout=-1, __nfds=2, __fds=0x7f5251e9e6d0) at /usr/include/x86_64-linux-gnu/bits/poll2.h:46
#2  AdminSocket::entry (this=0x561ebe8f8540) at /build/ceph-12.2.4/src/common/admin_socket.cc:250
#3  0x00007f52579b76ba in start_thread (arg=0x7f5251ea0700) at pthread_create.c:333
#4  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 6 (Thread 0x7f52526a1700 (LWP 7667)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x0000561eb3b05210 in Cond::WaitUntil (when=..., mutex=..., this=0x561ebe8f0368) at /build/ceph-12.2.4/src/common/Cond.h:64
#2  Cond::WaitInterval (interval=..., mutex=..., this=0x561ebe8f0368) at /build/ceph-12.2.4/src/common/Cond.h:73
#3  CephContextServiceThread::entry (this=0x561ebe8f02d0) at /build/ceph-12.2.4/src/common/ceph_context.cc:135
#4  0x00007f52579b76ba in start_thread (arg=0x7f52526a1700) at pthread_create.c:333
#5  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 5 (Thread 0x7f5253640700 (LWP 7666)):
#0  0x00007f5256a2ea13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
#1  0x0000561eb3c6d789 in EpollDriver::event_wait (this=0x561ebebe0300, fired_events=..., tvp=<optimized out>) at /build/ceph-12.2.4/src/msg/async/EventEpoll.cc:114
#2  0x0000561eb3a9e591 in EventCenter::process_events (this=this@entry=0x561ebe900740, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000, 
    working_dur=working_dur@entry=0x7f525363e6a0) at /build/ceph-12.2.4/src/msg/async/Event.cc:395
#3  0x0000561eb3aa2c98 in NetworkStack::<lambda()>::operator()(void) const (__closure=0x561ebebaed98) at /build/ceph-12.2.4/src/msg/async/Stack.cc:51
#4  0x00007f52572c8c80 in ?? () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f52579b76ba in start_thread (arg=0x7f5253640700) at pthread_create.c:333
#6  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 4 (Thread 0x7f5253e41700 (LWP 7665)):
#0  0x00007f5256a2ea13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
#1  0x0000561eb3c6d789 in EpollDriver::event_wait (this=0x561ebebe0120, fired_events=..., tvp=<optimized out>) at /build/ceph-12.2.4/src/msg/async/EventEpoll.cc:114
#2  0x0000561eb3a9e591 in EventCenter::process_events (this=this@entry=0x561ebe900bc0, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000, 
    working_dur=working_dur@entry=0x7f5253e3f6a0) at /build/ceph-12.2.4/src/msg/async/Event.cc:395
#3  0x0000561eb3aa2c98 in NetworkStack::<lambda()>::operator()(void) const (__closure=0x561ebebaed48) at /build/ceph-12.2.4/src/msg/async/Stack.cc:51
#4  0x00007f52572c8c80 in ?? () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f52579b76ba in start_thread (arg=0x7f5253e41700) at pthread_create.c:333
#6  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 3 (Thread 0x7f5254642700 (LWP 7664)):
#0  0x00007f5256a2ea13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84
#1  0x0000561eb3c6d789 in EpollDriver::event_wait (this=0x561ebeb34090, fired_events=..., tvp=<optimized out>) at /build/ceph-12.2.4/src/msg/async/EventEpoll.cc:114
#2  0x0000561eb3a9e591 in EventCenter::process_events (this=this@entry=0x561ebe900980, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000, 
    working_dur=working_dur@entry=0x7f52546406a0) at /build/ceph-12.2.4/src/msg/async/Event.cc:395
#3  0x0000561eb3aa2c98 in NetworkStack::<lambda()>::operator()(void) const (__closure=0x561ebebaecf8) at /build/ceph-12.2.4/src/msg/async/Stack.cc:51
#4  0x00007f52572c8c80 in ?? () from target:/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f52579b76ba in start_thread (arg=0x7f5254642700) at pthread_create.c:333
#6  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 2 (Thread 0x7f5255683700 (LWP 7663)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3a0a6db in ceph::logging::Log::entry (this=0x561ebe8f8000) at /build/ceph-12.2.4/src/log/Log.cc:459
#2  0x00007f52579b76ba in start_thread (arg=0x7f5255683700) at pthread_create.c:333
#3  0x00007f5256a2e41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 1 (Thread 0x7f52594dfe00 (LWP 7659)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000561eb3a9a067 in Cond::Wait (mutex=..., this=0x561ebebaa760) at /build/ceph-12.2.4/src/common/Cond.h:48
#2  AsyncMessenger::wait (this=0x561ebebaa000) at /build/ceph-12.2.4/src/msg/async/AsyncMessenger.cc:481
#3  0x0000561eb3336f9a in main (argc=<optimized out>, argv=<optimized out>) at /build/ceph-12.2.4/src/ceph_osd.cc:656
Actions #2

Updated by Niklas Hambuechen about 6 years ago

For the record, I installed the following debugging packages for gdb stack traces:

apt-get install libc6-dbg ceph-base-dbg ceph-common-dbg ceph-mds-dbg ceph-mgr-dbg ceph-mon-dbg ceph-osd-dbg libcephfs2-dbg python-cephfs-dbg
Actions #3

Updated by Niklas Hambuechen about 6 years ago

On https://forum.proxmox.com/threads/increase-ceph-recovery-speed.36728/ people reported the same figure of 10 objects per second that I am seeing:

... but the recovery speed is quite low (5600kb/s ~10 objects/s).

The solution suggested there,

ceph tell 'osd.*' injectargs '--osd-max-backfills 16'
ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'

did not help in my case; recovery speed remained at 10 objects/s.

Neither did setting them to 100x their defaults help, as in

ceph tell 'osd.*' config set osd_max_backfills 100
ceph tell 'osd.*' config set osd_recovery_max_active 300

I further tried changing the device class from "hdd" to "ssd"; this brought no improvement.

Further adding

ceph tell 'osd.*' config set osd_recovery_max_single_start 100

(100x its default) improved the situation to around 600 objects/s, which is still brutally slow.

Another increase by factor 10 using

ceph tell 'osd.*' config set osd_recovery_max_single_start 1000

brought no further improvement. Increasing two other parameters by factor 100x from their default,

ceph tell 'osd.*' config set osd_backfill_scan_max 51200
ceph tell 'osd.*' config set osd_backfill_scan_min 6400

brought no further improvement.

It's unclear to me what the bottleneck is; the machines remain idle.

Actions #4

Updated by Niklas Hambuechen about 6 years ago

Attached are two GDB runs of a sender node.

In the release build many values showed as "<optimized out>", so I rebuilt Ceph with "-O0 -g" on that node; that produced the second GDB output.

Actions #5

Updated by Niklas Hambuechen about 6 years ago

You can find a core dump of the -O0 version created with GDB at http://nh2.me/ceph-issue-23595-osd-O0.core.xz

Actions #6

Updated by Niklas Hambuechen about 6 years ago

I have now tested with only the 6*1GB files, having deleted the 270k empty files from cephfs.

I continue to see extremely low recovery speeds, despite there now being way fewer objects:

2018-04-09 16:52:43.897882 mon.ceph3 [WRN] Health check update: Degraded data redundancy: 1893/6054 objects degraded (31.269%), 134 pgs degraded, 134 pgs undersized (PG_DEGRADED)
2018-04-09 16:52:48.898497 mon.ceph3 [WRN] Health check update: Degraded data redundancy: 1880/6054 objects degraded (31.054%), 133 pgs degraded, 133 pgs undersized (PG_DEGRADED)
2018-04-09 16:52:53.899267 mon.ceph3 [WRN] Health check update: Degraded data redundancy: 1879/6054 objects degraded (31.037%), 133 pgs degraded, 133 pgs undersized (PG_DEGRADED)
2018-04-09 16:52:58.900634 mon.ceph3 [WRN] Health check update: Degraded data redundancy: 1859/6054 objects degraded (30.707%), 133 pgs degraded, 133 pgs undersized (PG_DEGRADED)
2018-04-09 16:53:03.908650 mon.ceph3 [WRN] Health check update: Degraded data redundancy: 1817/6054 objects degraded (30.013%), 132 pgs degraded, 133 pgs undersized (PG_DEGRADED)
2018-04-09 16:53:08.911142 mon.ceph3 [WRN] Health check update: Degraded data redundancy: 1804/6054 objects degraded (29.798%), 131 pgs degraded, 132 pgs undersized (PG_DEGRADED)

Note also that CephFS performs great outside of recovery, writing the large files at 145 MB/s with dd.

Actions #7

Updated by Niklas Hambuechen about 6 years ago

OK, if I only have the 6 large files in the cephfs AND set the options

ceph osd crush rm-device-class osd.0
ceph osd crush set-device-class ssd osd.0

ceph tell 'osd.*' config set osd_max_backfills 100
ceph tell 'osd.*' config set osd_recovery_max_active 300
ceph tell 'osd.*' config set osd_recovery_max_single_start 1000
ceph tell 'osd.*' config set osd_backfill_scan_max 51200
ceph tell 'osd.*' config set osd_backfill_scan_min 6400

then recovery speeds up to a constant 40 MB/s without stops in between.

This suggests that

  • there's indeed some throttling going on in ceph
  • this throttling is not set to a good default given how slow the recovery is even when I have only 6 large files and the systems are totally idle
  • while the currently available configuration options make it possible to get reasonably fast recovery for large files, this does not work for small files (or at least I have not found out how): even increasing all of these limits by 100x did not give good results for small files. Perhaps there are internal limits at which these settings are capped?
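As a side note for anyone reproducing this: the values that `injectargs` / `config set` actually left in effect can be read back from a running OSD's admin socket. A sketch (assuming osd.0 runs on the local host and its admin socket is accessible):

```shell
# Read back the effective recovery/backfill settings from a running OSD;
# injected args show up here, so you can confirm they actually applied.
ceph daemon osd.0 config show | grep -E 'osd_max_backfills|osd_recovery|osd_backfill_scan'
```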
Actions #8

Updated by Niklas Hambuechen about 6 years ago

I have it figured out!

The issue was "osd_recovery_sleep_hdd", which defaults to 0.1 seconds.

After setting

ceph tell 'osd.*' config set osd_recovery_sleep_hdd 0

the recovery of the OSD with 6*1GB files on it sped up to 145 MB/s, and the recovery of the OSD with 6*1GB files plus 450k empty files sped up to ~50 MB/s.
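That default also explains the original rate almost exactly: under a rough model (assuming one object started per recovery op), a 0.1 s sleep after each op caps recovery at about 10 objects/s per OSD, which is precisely the rate I observed at first:

```shell
# Back-of-the-envelope check: one recovery op per 0.1 s sleep, one object
# per op, gives the ~10 objects/s ceiling seen in the health check logs.
awk 'BEGIN { sleep = 0.1; objs_per_op = 1; print objs_per_op / sleep " objects/s" }'
# prints: 10 objects/s
```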

It is not clear yet why this is the case, given that I had done

ceph osd crush rm-device-class osd.0
ceph osd crush set-device-class ssd osd.0

which I assumed would set the device type to SSD, so that "osd_recovery_sleep_ssd" (which defaults to 0 seconds) would apply.

Maybe this is not true?

Does ceph determine SSD vs HDD by some other means than the device class?
Or does setting the device class not correctly switch to "osd_recovery_sleep_ssd" as the sleep limit?
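One possibility (an assumption on my part, not verified against the source): the OSD may pick the hdd/ssd sleep variant from the kernel's rotational flag for the underlying block device rather than from the CRUSH device class. That flag can be checked directly:

```shell
# The kernel reports whether a block device is rotational; "sda" here is
# just an example name, substitute your OSD's actual data device.
cat /sys/block/sda/queue/rotational   # 1 = rotational (HDD), 0 = non-rotational (SSD)
```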

I think it is also worth discussing whether "osd_recovery_sleep_hdd = 0.1" is a good default, and whether inserting a fixed sleep is a good way to limit recovery impact on clients at all; perhaps the throttling could be driven by actual client load instead.

And I think the documentation should point these things out more clearly; they are a big roadblock when getting started with Ceph, since basic functionality does not work well out of the box (the default recovery speed is so slow that it looks much more like a bug than like a bad default).

Below you can find my new recovery log as per "ceph -w":

2018-04-09 18:26:08.619414 mon.ceph2 [INF] osd.0 marked itself down
2018-04-09 18:26:08.671438 mon.ceph2 [WRN] Health check failed: 1 osds down (OSD_DOWN)
2018-04-09 18:26:08.671522 mon.ceph2 [WRN] Health check failed: 1 host (1 osds) down (OSD_HOST_DOWN)
2018-04-09 18:26:11.304147 mon.ceph2 [INF] Health check cleared: OSD_DOWN (was: 1 osds down)
2018-04-09 18:26:11.304190 mon.ceph2 [INF] Health check cleared: OSD_HOST_DOWN (was: 1 host (1 osds) down)
2018-04-09 18:26:11.869287 mon.ceph2 [WRN] Health check failed: 1 osds down (OSD_DOWN)
2018-04-09 18:26:11.980175 mon.ceph2 [WRN] Health check failed: Reduced data availability: 5 pgs inactive, 99 pgs peering (PG_AVAILABILITY)
2018-04-09 18:26:13.000008 mon.ceph2 [WRN] Health check failed: Degraded data redundancy: 188683/1391130 objects degraded (13.563%), 76 pgs degraded (PG_DEGRADED)
2018-04-09 18:26:15.030851 mon.ceph2 [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 2 pgs inactive, 83 pgs peering)
2018-04-09 18:26:20.679828 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 463710/1391130 objects degraded (33.333%), 174 pgs degraded (PG_DEGRADED)
2018-04-09 18:26:21.301934 mon.ceph2 [WRN] Health check failed: 1 host (1 osds) down (OSD_HOST_DOWN)
2018-04-09 18:26:22.324977 mon.ceph2 [INF] Health check cleared: OSD_DOWN (was: 1 osds down)
2018-04-09 18:26:22.325040 mon.ceph2 [INF] Health check cleared: OSD_HOST_DOWN (was: 1 host (1 osds) down)
2018-04-09 18:26:22.339579 mon.ceph2 [INF] osd.0 88.99.127.192:6801/14021 boot
2018-04-09 18:26:25.374914 mon.ceph2 [WRN] Health check failed: Reduced data availability: 3 pgs inactive, 8 pgs peering (PG_AVAILABILITY)
2018-04-09 18:26:25.687636 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 315256/1391130 objects degraded (22.662%), 104 pgs degraded (PG_DEGRADED)
2018-04-09 18:26:28.915497 mon.ceph2 [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 3 pgs inactive, 8 pgs peering)
2018-04-09 18:26:30.695842 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 435428/1391130 objects degraded (31.300%), 151 pgs degraded (PG_DEGRADED)
2018-04-09 18:26:35.698537 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 446081/1391130 objects degraded (32.066%), 151 pgs degraded (PG_DEGRADED)
2018-04-09 18:26:40.702339 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 421209/1391130 objects degraded (30.278%), 145 pgs degraded (PG_DEGRADED)
2018-04-09 18:26:43.710645 mon.ceph2 [WRN] Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)
2018-04-09 18:26:45.702876 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 402752/1391130 objects degraded (28.951%), 139 pgs degraded (PG_DEGRADED)
2018-04-09 18:26:49.743469 mon.ceph2 [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 1 pg peering)
2018-04-09 18:26:50.705600 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 380919/1391130 objects degraded (27.382%), 131 pgs degraded (PG_DEGRADED)
2018-04-09 18:26:55.708053 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 366474/1391130 objects degraded (26.344%), 123 pgs degraded (PG_DEGRADED)
2018-04-09 18:27:00.711025 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 341144/1391130 objects degraded (24.523%), 115 pgs degraded (PG_DEGRADED)
2018-04-09 18:27:05.715300 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 312942/1391130 objects degraded (22.496%), 108 pgs degraded (PG_DEGRADED)
2018-04-09 18:27:10.718042 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 297805/1391130 objects degraded (21.407%), 103 pgs degraded (PG_DEGRADED)
2018-04-09 18:27:15.721038 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 268624/1391130 objects degraded (19.310%), 95 pgs degraded (PG_DEGRADED)
2018-04-09 18:27:20.723297 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 261497/1391130 objects degraded (18.797%), 93 pgs degraded (PG_DEGRADED)
2018-04-09 18:27:25.726002 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 228480/1391130 objects degraded (16.424%), 82 pgs degraded, 76 pgs undersized (PG_DEGRADED)
2018-04-09 18:27:30.740155 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 217528/1391130 objects degraded (15.637%), 77 pgs degraded, 71 pgs undersized (PG_DEGRADED)
2018-04-09 18:27:35.752691 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 185178/1391130 objects degraded (13.311%), 67 pgs degraded, 62 pgs undersized (PG_DEGRADED)
2018-04-09 18:27:40.753107 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 181583/1391130 objects degraded (13.053%), 66 pgs degraded, 60 pgs undersized (PG_DEGRADED)
2018-04-09 18:27:45.755585 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 156377/1391130 objects degraded (11.241%), 57 pgs degraded, 52 pgs undersized (PG_DEGRADED)
2018-04-09 18:27:50.762326 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 145430/1391130 objects degraded (10.454%), 54 pgs degraded, 49 pgs undersized (PG_DEGRADED)
2018-04-09 18:27:55.767372 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 141798/1391130 objects degraded (10.193%), 51 pgs degraded, 47 pgs undersized (PG_DEGRADED)
2018-04-09 18:28:00.770534 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 105671/1391130 objects degraded (7.596%), 38 pgs degraded, 33 pgs undersized (PG_DEGRADED)
2018-04-09 18:28:05.773293 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 76598/1391130 objects degraded (5.506%), 27 pgs degraded, 23 pgs undersized (PG_DEGRADED)
2018-04-09 18:28:10.775854 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 69441/1391130 objects degraded (4.992%), 25 pgs degraded, 21 pgs undersized (PG_DEGRADED)
2018-04-09 18:28:15.776423 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 36634/1391130 objects degraded (2.633%), 15 pgs degraded, 12 pgs undersized (PG_DEGRADED)
2018-04-09 18:28:20.776955 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 22230/1391130 objects degraded (1.598%), 11 pgs degraded, 8 pgs undersized (PG_DEGRADED)
2018-04-09 18:28:25.778219 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 14815/1391130 objects degraded (1.065%), 9 pgs degraded, 6 pgs undersized (PG_DEGRADED)
2018-04-09 18:28:30.778683 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 241/1391130 objects degraded (0.017%), 4 pgs degraded (PG_DEGRADED)
2018-04-09 18:28:35.779084 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 196/1391130 objects degraded (0.014%), 3 pgs degraded (PG_DEGRADED)
2018-04-09 18:28:40.779529 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 152/1391130 objects degraded (0.011%), 3 pgs degraded (PG_DEGRADED)
2018-04-09 18:28:45.780078 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 106/1391130 objects degraded (0.008%), 2 pgs degraded (PG_DEGRADED)
2018-04-09 18:28:50.781230 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 62/1391130 objects degraded (0.004%), 2 pgs degraded (PG_DEGRADED)
2018-04-09 18:28:55.781665 mon.ceph2 [WRN] Health check update: Degraded data redundancy: 17/1391130 objects degraded (0.001%), 1 pg degraded (PG_DEGRADED)
2018-04-09 18:28:59.082001 mon.ceph2 [INF] Health check cleared: PG_DEGRADED (was: Degraded data redundancy: 17/1391130 objects degraded (0.001%), 1 pg degraded)
Actions #9

Updated by Patrick Donnelly about 6 years ago

  • Project changed from Ceph to RADOS
  • Subject changed from Recovery/backfill is extremely slow to osd: recovery/backfill is extremely slow
  • Source set to Community (user)
  • Component(RADOS) OSD added
Actions #10

Updated by jianpeng ma about 6 years ago

The HDD vs. SSD detection is done in code when the OSD starts, and the result is not re-evaluated after startup.

I think we should also raise the log level of the recovery-sleep value, so that users learn about it from the log messages in the first place.

Actions #11

Updated by Greg Farnum about 6 years ago

  • Status changed from New to Duplicate

https://tracker.ceph.com/issues/23141

Sorry you ran into this; it's a bug in BlueStore/BlueFS. The fix will be in the next Luminous release, and there may be more details in the mailing-list thread "[ceph-users] SSD Bluestore Backfills Slow".

Glad you found the workaround! :)
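The workaround itself is not spelled out in this comment; given the option discussed in the next comment, a hedged sketch of the usual approach on a Luminous cluster would be lowering the per-object recovery sleep at runtime (the OSD id and value here are illustrative, and values injected this way do not survive an OSD restart):

```shell
# Illustrative sketch, not confirmed as the reporter's exact workaround.
# Check the current value on one daemon (run on that OSD's host):
ceph daemon osd.0 config get osd_recovery_sleep_hdd

# Lower it cluster-wide at runtime; persist in ceph.conf if it helps:
ceph tell osd.* injectargs '--osd_recovery_sleep_hdd 0.01'
```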

Actions #12

Updated by Niklas Hambuechen about 6 years ago

@Greg Farnum: Ah, great that this part is already handled!

What about my other questions though, like

I think it is also worth discussing whether "osd_recovery_sleep_hdd = 0.1" is a good default, or whether this method of limiting recovery impact on clients is a good one at all; perhaps the throttling could be driven by actual client load instead of a fixed sleep.

Even on HDDs, this limit seems way too strict: I'm quite sure my HDD servers could recover at ~120 MB/s instead of 500 KB/s, even when there are many small files.

So while your fix will improve the situation for SSDs, I think it will not help on HDDs, where there is still a lot to gain.
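The arithmetic behind this complaint can be made explicit. A minimal sketch (the sleep value is Luminous's default for osd_recovery_sleep_hdd, the object count is taken from the original report, and the single-recovering-OSD simplification is an assumption for illustration):

```python
# A fixed per-object recovery sleep caps throughput no matter how fast the
# disks and network are. Assumptions: one OSD doing all the recovery work,
# default Luminous osd_recovery_sleep_hdd of 0.1 s.

sleep_per_object = 0.1        # seconds, osd_recovery_sleep_hdd default
degraded_objects = 258_294    # from the original report

# Hard ceiling imposed by the sleep alone (ignoring actual I/O time):
max_objects_per_sec = 1 / sleep_per_object
print(max_objects_per_sec)    # 10.0 -- matches the observed ~10 objects/s

# Time spent just sleeping to work through the degraded objects:
eta_hours = degraded_objects * sleep_per_object / 3600
print(round(eta_hours, 1))    # 7.2
```

With mostly-empty files, the data rate implied by a 10 objects/s ceiling is tiny, which would explain observed recovery bandwidth far below what the disks can sustain.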
