Bug #24160
closed
Monitor down when large store data needs to compact, triggered by the "ceph tell mon.xx compact" command
Description
I have hit a problem in our production environment where the monitor store grows too large.
The logical volume for the monitor is 100GB; the store can grow to 95GB, and then the monitor goes down. Sometimes the monitor process cannot start again, and we have to rebuild the monitor.
We then set up a crontab task that compacts the monitor every week at a fixed time. The command is "ceph tell mon.xx compact". While the monitor is compacting (which can last for 80 seconds), it is marked down because of a lease timeout. When the compaction completes, the monitors hold an election again, and the monitor comes back up and active.
In the monitor message dispatch code, a command must hold the dispatch lock while it executes, so in effect all commands execute serially. If one command runs for a long time, all other commands must wait.
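
To illustrate the effect, here is a minimal toy model in Python. It is not the Ceph code: ToyMonitor, dispatch and simulate are hypothetical names, and the durations are scaled down. It only assumes what is described above, namely that every message takes the same lock, so a lease message that arrives during a long compaction cannot be handled until the compaction returns.

```python
import threading
import time


class ToyMonitor:
    """Toy model (hypothetical, not Ceph code): one lock guards all message handling."""

    def __init__(self):
        self.dispatch_lock = threading.Lock()

    def dispatch(self, handler):
        # Every message, admin command or lease message alike, takes the same lock.
        queued_at = time.monotonic()
        with self.dispatch_lock:
            waited = time.monotonic() - queued_at
            handler()
        return waited


def simulate(compact_seconds, lease_timeout):
    """Return (seconds the lease message waited, whether that exceeds the timeout)."""
    mon = ToyMonitor()
    compaction_started = threading.Event()

    def compact():
        compaction_started.set()
        time.sleep(compact_seconds)  # stands in for the long store compaction

    worker = threading.Thread(target=mon.dispatch, args=(compact,))
    worker.start()
    compaction_started.wait()  # the compaction now holds the dispatch lock

    # A lease message arrives while the compaction is still running.
    waited = mon.dispatch(lambda: None)
    worker.join()
    return waited, waited > lease_timeout


if __name__ == "__main__":
    waited, expired = simulate(compact_seconds=0.8, lease_timeout=0.1)
    print(f"lease message waited {waited:.2f}s, lease expired: {expired}")
```

In this sketch the lease message waits for roughly the whole compaction, so any lease timeout shorter than the compaction expires. That matches what we observe: the monitor is marked down for the duration of the compaction and rejoins afterwards.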