Bug #49161

open

common: ceph cluster and MON log being spammed with debug messages which cannot be turned off

Added by Janek Bevendorff about 3 years ago. Updated about 1 year ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

My MONs started running out of memory due to massive log spam. I had a /var/log/ceph-mon.*.log file of about 3.5GB and a /var/log/ceph/ceph.log cluster log of more than 60GB (!!).

All debug_* settings were either at their defaults or at low values such as 1/5 or even 0/0. I dumped the effective MON configs and found the setting mon_cluster_log_file_level, which I never set explicitly, but whose default value had somehow become debug (the documentation says it should be info). Once I set it to warning, the spam to ceph.log stopped, but the MON daemon logs are still being spammed with messages like these:

Feb 04 12:18:19 XXX ceph-mon[25257]: 2021-02-04T12:18:19.267+0100 7f295eb0a700  0 log_channel(cluster) log [DBG] : mds.1 [v2:XXX:6800/3706045336,v1:XXX:6801/3706045336] up:active
Feb 04 12:18:19 XXX ceph-mon[25257]: 2021-02-04T12:18:19.267+0100 7f295eb0a700  0 log_channel(cluster) log [DBG] : fsmap cephfs.storage:3 {0=XXX=up:active,1=XXX=up:active,2=XXX=up:active} 3 up:standby-replay 1 up:standby
Feb 04 12:18:10 XXX ceph-mon[25257]: 2021-02-04T12:18:10.631+0100 7f296030d700  0 mon.XXX@0(leader) e22 handle_command mon_command({"prefix": "status"} v 0) v1
Feb 04 12:18:10 XXX ceph-mon[25257]: 2021-02-04T12:18:10.631+0100 7f296030d700  0 log_channel(audit) log [DBG] : from='client.? XXX:0/186853831' entity='client.admin' cmd=[{"prefix": "status"}]: dispatch

So both the cluster and audit channels are still writing to the MON log at debug level. Setting clog_to_monitors to false doesn't change anything. I did not have this issue previously.
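To quantify which channels and levels dominate the output, something like the following could be run over the journal or a log file. It is only a rough sketch: the regular expression is derived from the sample lines above and assumes every cluster-log message carries the `log_channel(<name>) log [<LEVEL>]` fragment; the function name `tally_clog` is made up for illustration.

```python
import re
from collections import Counter

# Pattern taken from the sample lines in this report, e.g.
# "log_channel(cluster) log [DBG] : ..." (assumed to be stable across messages).
CLOG_RE = re.compile(r"log_channel\((?P<channel>\w+)\) log \[(?P<level>[A-Z]+)\]")


def tally_clog(lines):
    """Count cluster-log messages per (channel, level) pair.

    Lines without the log_channel(...) fragment (such as the
    handle_command line) are ignored.
    """
    counts = Counter()
    for line in lines:
        match = CLOG_RE.search(line)
        if match:
            counts[(match.group("channel"), match.group("level"))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Read log lines from stdin and print the most frequent pairs first.
    for (channel, level), n in tally_clog(sys.stdin).most_common():
        print(f"{channel:<10} {level:<5} {n}")
```

Piping the MON's journal output into the script gives a per-channel breakdown, which makes it easy to confirm whether DBG entries on the cluster and audit channels account for most of the volume.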

Here is my logging config:

global  advanced  clog_to_syslog_level             warning
global  basic     err_to_syslog                    true
global  basic     log_to_file                      false
global  basic     log_to_stderr                    false
global  basic     log_to_syslog                    true
global  advanced  mon_cluster_log_file_level       error
global  advanced  mon_cluster_log_to_file          false
global  advanced  mon_cluster_log_to_stderr        false
global  advanced  mon_cluster_log_to_syslog        false
global  advanced  mon_cluster_log_to_syslog_level  warning
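For anyone comparing their own settings against this listing, the cluster-log options can be pulled out mechanically. The sketch below assumes rows with exactly four whitespace-separated columns (section, level, option, value) as pasted above; actual dump output may carry extra columns, and rows in any other shape are simply skipped. The helper names are hypothetical.

```python
def parse_config_rows(text):
    """Map option name -> value for rows shaped like 'section level option value'."""
    options = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # blank line or a row with a different column layout
        _section, _level, option, value = parts
        options[option] = value
    return options


def cluster_log_options(options):
    """Keep only the options that govern the cluster log (clog_* and mon_cluster_log*)."""
    prefixes = ("clog_", "mon_cluster_log")
    return {name: value for name, value in options.items() if name.startswith(prefixes)}


if __name__ == "__main__":
    import sys

    # Read a pasted config listing from stdin and print the relevant subset.
    for name, value in sorted(cluster_log_options(parse_config_rows(sys.stdin.read())).items()):
        print(f"{name} = {value}")
```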

The debug_* settings are all at their defaults right now (for debug_mon that is 1/5).

I have currently disabled logging to files completely and redirect everything to syslog/journald, but the issue occurs regardless of the logging facility.
