Bug #53485


monstore: logm entries are not garbage collected

Added by Daniel Poelzleithner over 2 years ago. Updated almost 2 years ago.

Status:
Fix Under Review
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We had to run a Ceph cluster with a damaged CephFS for a while; that filesystem has since been deleted. We suspect this was the culprit for the heavy growth we experienced in the
ceph-mon database. Our small 6-server cluster already has a 45 GB monstore.

ceph-monstore-tool /var/lib/ceph/mon/ceph-server6/ dump-keys | awk '{print $1}' | uniq -c                                                                        
    147 auth
      2 config
     11 health
1546347 logm
      2 mkfs
      3 mon_sync
      7 monitor
      1 monitor_store
   2459 paxos
      ...                                                                                                    

There seems to be no limit on the number of logm entries stored, nor does there appear to be a garbage collection mechanism. At times the monstore was growing so fast that we feared
the server would run out of storage from logm spam alone.
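
As a side note, the same prefix tally can be produced without awk. The following is a minimal, standalone C++ sketch (illustrative only, not part of Ceph; the program name count_prefixes is made up) that reads the dump-keys output on stdin and counts consecutive runs of each key prefix, like uniq -c does:

  // count_prefixes.cpp -- tally key prefixes from `ceph-monstore-tool ... dump-keys`.
  // Standalone illustration only; pipe the dump-keys output into this program.
  #include <iostream>
  #include <sstream>
  #include <string>

  int main() {
      std::string line, prefix, current;
      unsigned long count = 0;
      while (std::getline(std::cin, line)) {
          std::istringstream iss(line);
          if (!(iss >> prefix))
              continue;                  // skip blank lines
          if (prefix != current) {
              if (count)                 // flush the previous run, uniq -c style
                  std::cout << count << " " << current << "\n";
              current = prefix;
              count = 0;
          }
          ++count;
      }
      if (count)
          std::cout << count << " " << current << "\n";
      return 0;
  }

Usage (hypothetical binary name): ceph-monstore-tool /var/lib/ceph/mon/ceph-server6/ dump-keys | ./count_prefixes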

Actions #1

Updated by Neha Ojha over 2 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (Monitor)
Actions #2

Updated by Loïc Dachary over 2 years ago

  • Target version deleted (v15.2.15)
Actions #3

Updated by Daniel Poelzleithner over 2 years ago

We just grew to a whopping 80 GB of monitor metadata on the server. I'm out of ideas here and don't know how to stop the growth.
Somebody added some logging but no cleanup. A resync of a mon now takes hours, and I fear the worst.

This is our config:

[global]
         auth_client_required = cephx
         auth_cluster_required = cephx
         auth_service_required = cephx
         fsid = d7c5c9c7-a227-4e33-ab43-3f4aa1eb0630
         mon_allow_pool_delete = true
         mon_host = 172.20.77.6 172.20.77.8 172.20.77.9 172.20.77.5 172.20.77.3
         osd_journal_size = 5120
         osd_pool_default_min_size = 2
         osd_pool_default_pg_autoscale_mode = on
         osd_pool_default_size = 3

[client]
         keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
         beacon_grace = 800
         # debug_default = 10/10
         keyring = /var/lib/ceph/mds/ceph-$id/keyring
         mds_beacon_grace = 800

[mon]
         # debug_default = 10/10
         keyring = /var/lib/ceph/mon/ceph-$id/keyring
         ms_bind_msgr1 = true
         ms_bind_msgr2 = true
         # mon_compact_on_start = true

[osd]
         osd_max_backfills = 16
         osd_recovery_max_active = 4

[mds.server5]
         host = server5
         mds_standby_for_name = pve

[mds.server6]
         host = server6
         mds_standby_for_name = pve

[mds.server8]
         host = server8
         mds_standby_for_name = pve

[mon.server3]
         public_addr = 172.20.77.3

[mon.server6]
         public_addr = 172.20.77.6

[mon.server8]
         public_addr = 172.20.77.8

Actions #4

Updated by Neha Ojha over 2 years ago

  • Assignee set to Prashant D
Actions #5

Updated by Daniel Poelzleithner over 2 years ago

I changed the paxos debug level to 20 and found this in the mon store log:

2021-12-16T18:35:07.814+0100 7fec66e79700 20 mon.server6@0(leader).paxosservice(logm 56666064..29067286) maybe_trim 56666064~29067236
2021-12-16T18:35:07.814+0100 7fec66e79700 10 mon.server6@0(leader).paxosservice(logm 56666064..29067286) maybe_trim trim_to 29067236 < first_committed 56666064

56666064 is the get_first_committed version of the paxos service.
29067236 is calculated in LogMonitor.cc as:

  unsigned max = g_conf()->mon_max_log_epochs;
  version_t version = get_last_committed();
  if (version > max)
    return version - max;

This means that for some reason first_committed is now a lot larger than the last committed version on disk, which is why the old versions are never deleted.

This internal state corruption causes the monstore to grow indefinitely.
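
The sketch below is a simplified, self-contained C++ model of that check (assumed shape, not the actual PaxosService/LogMonitor code). It only demonstrates the point above: once first_committed is ahead of the computed trim_to, maybe_trim bails out without trimming anything, so logm versions accumulate forever.

  // trim_sketch.cpp -- simplified model of the paxos service trim decision
  // (assumed shape for illustration, not the real Ceph implementation).
  #include <cstdint>
  #include <iostream>

  using version_t = std::uint64_t;

  // Mirrors the "keep at most mon_max_log_epochs versions" rule quoted above.
  version_t get_trim_to(version_t last_committed, unsigned max_epochs) {
      if (last_committed > max_epochs)
          return last_committed - max_epochs;
      return 0;
  }

  // Mirrors the maybe_trim guard: skip trimming when trim_to < first_committed.
  void maybe_trim(version_t first_committed, version_t last_committed,
                  unsigned max_epochs) {
      version_t trim_to = get_trim_to(last_committed, max_epochs);
      if (trim_to < first_committed) {
          std::cout << "no trim: trim_to " << trim_to
                    << " < first_committed " << first_committed << "\n";
          return;
      }
      std::cout << "would trim versions " << first_committed
                << ".." << (trim_to - 1) << "\n";
  }

  int main() {
      // Values from the debug log above: first_committed (56666064) is far ahead
      // of last_committed (29067286), so trim_to can never catch up; nothing is trimmed.
      maybe_trim(56666064, 29067286, 500);
      // Healthy case for comparison: first_committed well below last_committed.
      maybe_trim(100, 29067286, 500);
      return 0;
  }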

Actions #6

Updated by Daniel Poelzleithner over 2 years ago

A fix is in progress.

Actions #7

Updated by Peter Razumovsky almost 2 years ago

We have observed this several times at a customer site. Each of the 3 mon store.db instances was growing rapidly, had tons of logm keys, and reached 1.4 TB (!!!). We started restarting the 200 OSDs, and after some OSD restarts the mon suddenly started to trim its store from 1.4 TB down to 100 MB. Please fix this as soon as possible; we have already faced this 4 times.

Actions #8

Updated by Peter Razumovsky almost 2 years ago

Sorry, forgot to add - we faced this issue on v15.2.13 and on v14.2.22 as well.

Actions #10

Updated by Radoslaw Zarzynski almost 2 years ago

  • Pull request ID set to 44511
Actions #11

Updated by Neha Ojha almost 2 years ago

  • Status changed from New to Fix Under Review
  • Assignee deleted (Prashant D)