Bug #63229

open

Default values for paxos trimming cause heavy writes to mon store

Added by Eugen Block 7 months ago. Updated 7 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Monitor
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After following the discussion on the ceph-users mailing list [1], I took a closer look and, with Mykola's help, we may have found a tuning option that reduces the amount of writes to the mon store.
In a production cluster this can have a heavy impact. Because one of our customers still runs Octopus, I did the analysis on an Octopus cluster with a single MON in my lab environment. (The reports on the ML suggest that other versions are impacted as well.)
Just for reference, here are some values from iotop with default paxos settings, collected over a period of 5 minutes (repeated several times to check whether the high values are reproducible). To simulate some load I added an rbd snapshot schedule of 1m for around 300 images to be mirrored to a second cluster (but no real IO on the images).

# iotop -ao -bn 2 -d 300 2>&1 | grep -E "TID|ceph-mon" 
2501306 be/4 ceph         0.00 B   292.00 K ?unavailable? ceph-mon -f --cluster ceph --id alsterdorf --setuser ceph --setgroup ceph [log] 
2501310 be/4 ceph         0.00 B   673.29 M ?unavailable? ceph-mon -f --cluster ceph --id alsterdorf --setuser ceph --setgroup ceph [rocksdb:low0] 
2501311 be/4 ceph         0.00 B    22.51 M ?unavailable? ceph-mon -f --cluster ceph --id alsterdorf --setuser ceph --setgroup ceph [rocksdb:high0] 
2501330 be/4 ceph         4.00 K    23.85 M ?unavailable? ceph-mon -f --cluster ceph --id alsterdorf --setuser ceph --setgroup ceph [fn_monstore] 
2501336 be/4 ceph         0.00 B     7.80 M ?unavailable? ceph-mon -f --cluster ceph --id alsterdorf --setuser ceph --setgroup ceph [safe_timer]

Since each paxos trim process initiates a manual compaction (default: mon_compact_on_trim = true), I increased these values:

paxos_service_trim_max = 1000 (default 500)
paxos_service_trim_min = 500 (default 250)
paxos_trim_max = 1000 (default 500)
paxos_trim_min = 500 (default 250)
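
The report does not say how these values were applied. As a minimal sketch, assuming they are set through the centralized config database with `ceph config set` (available on Octopus) and applied to all monitors via the `mon` section, the commands could be generated like this. The helper name `config_commands` is hypothetical and the script only prints the commands, it does not run them:

```python
# Values proposed in the report (the defaults are 500 for *_max and 250 for *_min).
PROPOSED = {
    "paxos_service_trim_max": 1000,
    "paxos_service_trim_min": 500,
    "paxos_trim_max": 1000,
    "paxos_trim_min": 500,
}


def config_commands(settings, who="mon"):
    """Build one 'ceph config set' command line per option.

    Assumption: the options are managed in the centralized config
    database and should apply to every monitor (section 'mon').
    """
    return [
        f"ceph config set {who} {name} {value}"
        for name, value in settings.items()
    ]


commands = config_commands(PROPOSED)
for cmd in commands:
    # Print only; review the commands before running them on a cluster.
    print(cmd)
```

Printing rather than executing keeps the sketch safe to run anywhere and makes it easy to review the change before it reaches a production cluster.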

This reduced the writes to the mon store drastically:

2501306 be/4 ceph         0.00 B   264.00 K ?unavailable? ceph-mon -f --cluster ceph --id alsterdorf --setuser ceph --setgroup ceph [log] 
2501310 be/4 ceph        22.12 M   115.11 M ?unavailable? ceph-mon -f --cluster ceph --id alsterdorf --setuser ceph --setgroup ceph [rocksdb:low0] 
2501311 be/4 ceph         4.00 K    13.94 M ?unavailable? ceph-mon -f --cluster ceph --id alsterdorf --setuser ceph --setgroup ceph [rocksdb:high0] 
2501330 be/4 ceph         9.14 M    23.16 M ?unavailable? ceph-mon -f --cluster ceph --id alsterdorf --setuser ceph --setgroup ceph [fn_monstore] 
2501336 be/4 ceph         0.00 B     7.71 M ?unavailable? ceph-mon -f --cluster ceph --id alsterdorf --setuser ceph --setgroup ceph [safe_timer]

The values fluctuate a bit, of course, but it appears that writes can be reduced by a factor of 3 or 4. I don't have production values yet; I will suggest this approach to a customer for their secondary cluster. The question is whether the default values should be increased, since they don't seem to be suitable for production load (and raising them most likely won't hurt smaller clusters).
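
To make the comparison reproducible, the accumulated writes per sample can be summed and compared. The following is a rough sketch, not part of the original analysis: it assumes the column layout of the iotop lines quoted above (TID, priority, user, read value, read unit, write value, write unit), assumes 1024-based units, and shortens the command column for brevity. The function name `written_bytes` is hypothetical.

```python
UNIT_BYTES = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3}

# Accumulated iotop samples from the report (command column shortened).
DEFAULT_SETTINGS = """\
2501306 be/4 ceph 0.00 B 292.00 K ?unavailable? ceph-mon [log]
2501310 be/4 ceph 0.00 B 673.29 M ?unavailable? ceph-mon [rocksdb:low0]
2501311 be/4 ceph 0.00 B 22.51 M ?unavailable? ceph-mon [rocksdb:high0]
2501330 be/4 ceph 4.00 K 23.85 M ?unavailable? ceph-mon [fn_monstore]
2501336 be/4 ceph 0.00 B 7.80 M ?unavailable? ceph-mon [safe_timer]
"""

TUNED_SETTINGS = """\
2501306 be/4 ceph 0.00 B 264.00 K ?unavailable? ceph-mon [log]
2501310 be/4 ceph 22.12 M 115.11 M ?unavailable? ceph-mon [rocksdb:low0]
2501311 be/4 ceph 4.00 K 13.94 M ?unavailable? ceph-mon [rocksdb:high0]
2501330 be/4 ceph 9.14 M 23.16 M ?unavailable? ceph-mon [fn_monstore]
2501336 be/4 ceph 0.00 B 7.71 M ?unavailable? ceph-mon [safe_timer]
"""


def written_bytes(iotop_output):
    """Sum the 'disk write' column over all ceph-mon threads."""
    total = 0.0
    for line in iotop_output.splitlines():
        fields = line.split()
        if len(fields) < 7 or "ceph-mon" not in line:
            continue  # skip headers and unrelated processes
        value, unit = float(fields[5]), fields[6]
        total += value * UNIT_BYTES[unit]
    return total


before = written_bytes(DEFAULT_SETTINGS)
after = written_bytes(TUNED_SETTINGS)
ratio = before / after
print(f"default: {before / 1024**2:.1f} MiB, tuned: {after / 1024**2:.1f} MiB, "
      f"reduction factor: {ratio:.1f}")
```

This covers only the single pair of 5-minute samples quoted in this report; since the values fluctuate between runs, several samples would need to be compared before drawing conclusions.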

[1] https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/XGCI2LFW5RH3GUOQFJ542ISCSZH3FRX2/

Actions #1

Updated by Radoslaw Zarzynski 7 months ago

How about bringing this to the Ceph Perf Weekly call?
I'm going to ping Mark Nelson as well.

Actions #2

Updated by Eugen Block 7 months ago

Radoslaw Zarzynski wrote:

How about bringing this to the Ceph Perf Weekly call?
I'm going to ping Mark Nelson as well.

Sure, why not. According to https://ceph.io/en/community/meetups/ it's this Thursday at 8 am PDT, correct? That works for me.
