Bug #24159

Monitor down when large store data needs to compact triggered by ceph tell mon.xx compact command

Added by 相洋 7 months ago. Updated 7 months ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: Correctness/Safety
Target version:
Start date: 05/17/2018
Due date: 05/31/2018
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: upgrade/jewel-x
Component(RADOS): Monitor
Pull request ID:

Description

We have hit a problem in our production environment where the monitor store grows too large.

The logical volume for the monitor is 100GB. The store can grow to 95GB, and then the monitor goes down. Sometimes the monitor process cannot start, and we have to rebuild the monitor.

We then set up a crontab task that compacts the monitor once a week at a fixed time. The command is "ceph tell mon.xx compact". While the monitor is compacting (which can last for 80 seconds), it is marked down because of a lease timeout. When the compaction completes, the monitors vote again and the monitor is up and active.
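To measure how long a compaction really takes, the cron job could call a small wrapper instead of the bare command. The following is only a sketch: the function names and the injectable `runner` parameter are hypothetical, and the only thing taken from this report is the "ceph tell mon.xx compact" command itself.

```python
import subprocess
import time


def build_compact_command(mon_id):
    """Return the argv for `ceph tell mon.<id> compact` (command from this report)."""
    return ["ceph", "tell", f"mon.{mon_id}", "compact"]


def timed_compact(mon_id, runner=subprocess.run):
    """Run the compaction and return (return code, elapsed seconds).

    `runner` is a hypothetical hook so the function can be tried
    without a running cluster; by default it is subprocess.run.
    """
    cmd = build_compact_command(mon_id)
    start = time.monotonic()
    result = runner(cmd)
    elapsed = time.monotonic() - start
    return result.returncode, elapsed
```

Calling `timed_compact("xx")` from the weekly job and logging the elapsed time would show whether the compaction is approaching the lease timeout.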

In the monitor's message dispatch code, a command must hold the dispatch lock while it is executing, so in effect all commands execute serially. If one command runs for a long time, the other commands must wait.
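The effect described above can be reproduced with a toy model. This is not Ceph code: `dispatch_lock`, `dispatch` and `run_demo` are hypothetical names, and the durations are scaled down from the 80 seconds seen in production. It only illustrates how one long handler holding a single lock delays a lease message past its timeout.

```python
import threading
import time

dispatch_lock = threading.Lock()  # stand-in for the monitor's dispatch lock


def dispatch(name, work_seconds, log, holding=None):
    """Handle one message while holding the single dispatch lock."""
    with dispatch_lock:
        if holding is not None:
            holding.set()  # signal that this handler now owns the lock
        time.sleep(work_seconds)  # simulated handler work
        log.append(name)


def run_demo(compact_seconds=0.3, lease_timeout=0.1):
    """Return (handling order, lease wait in seconds, lease timed out?)."""
    log = []
    holding = threading.Event()
    compact = threading.Thread(
        target=dispatch, args=("compact", compact_seconds, log, holding)
    )
    compact.start()
    holding.wait()  # make sure the compaction owns the lock first
    sent = time.monotonic()
    dispatch("lease", 0.0, log)  # blocks until the compaction releases the lock
    waited = time.monotonic() - sent
    compact.join()
    return log, waited, waited > lease_timeout
```

With the default arguments the lease message is handled only after the simulated compaction finishes, so its wait exceeds the simulated lease timeout.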


Related issues

Duplicates RADOS - Bug #24160: Monitor down when large store data needs to compact triggered by ceph tell mon.xx compact command Resolved 05/17/2018 05/31/2018

History

#1 Updated by Nathan Cutler 7 months ago

  • Duplicates Bug #24160: Monitor down when large store data needs to compact triggered by ceph tell mon.xx compact command added

#2 Updated by Nathan Cutler 7 months ago

  • Status changed from New to Duplicate

#3 Updated by Joao Eduardo Luis 7 months ago

  • Project changed from Ceph to RADOS
  • Category changed from Monitor to Correctness/Safety
  • Component(RADOS) Monitor added
