
Bug #19782

pgmap and mgrmap keep increasing with information about ceph-mgr

Added by Wido den Hollander over 2 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
Start date:
04/26/2017
Due date:
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

I just upgraded a test environment from Jewel to Luminous, which started ceph-mgr processes on my ceph-mon servers.

I am now seeing this:

2017-04-26 17:14:35.452994 mon.0 [INF] pgmap v9666: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 685 B/s rd, 137 B/s wr, 0 op/s
2017-04-26 17:14:36.481421 mon.0 [INF] pgmap v9667: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 2331 B/s rd, 274 B/s wr, 2 op/s
2017-04-26 17:14:37.542212 mon.0 [INF] pgmap v9668: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 2754 B/s rd, 275 B/s wr, 2 op/s
2017-04-26 17:14:38.610408 mon.0 [INF] pgmap v9669: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 5735 B/s rd, 327 B/s wr, 5 op/s
2017-04-26 17:14:38.610647 mon.0 [INF] mgrmap e6673: active: alpha standbys: charlie, bravo
2017-04-26 17:14:39.648229 mon.0 [INF] pgmap v9670: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 6200 B/s rd, 326 B/s wr, 6 op/s
2017-04-26 17:14:39.648725 mon.0 [INF] mgrmap e6674: active: alpha standbys: charlie, bravo
2017-04-26 17:14:41.680825 mon.0 [INF] pgmap v9671: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 5344 B/s rd, 1125 B/s wr, 5 op/s
2017-04-26 17:14:42.714070 mon.0 [INF] mgrmap e6675: active: alpha standbys: charlie, bravo
2017-04-26 17:14:42.714159 mon.0 [INF] pgmap v9672: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 4654 B/s rd, 987 B/s wr, 4 op/s
2017-04-26 17:14:43.758725 mon.0 [INF] pgmap v9673: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 2959 B/s rd, 845 B/s wr, 3 op/s
2017-04-26 17:14:43.758796 mon.0 [INF] mgrmap e6676: active: alpha standbys: charlie, bravo
2017-04-26 17:14:44.780777 mon.0 [INF] mgrmap e6677: active: alpha standbys: charlie, bravo
2017-04-26 17:14:44.802319 mon.0 [INF] pgmap v9674: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 2534 B/s rd, 844 B/s wr, 2 op/s
2017-04-26 17:14:46.859721 mon.0 [INF] pgmap v9675: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 370 B/s rd, 741 B/s wr, 0 op/s
2017-04-26 17:14:47.880596 mon.0 [INF] pgmap v9676: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 744 B/s wr, 0 op/s
2017-04-26 17:14:47.880621 mon.0 [INF] mgrmap e6678: active: alpha standbys: charlie, bravo
2017-04-26 17:14:48.978156 mon.0 [INF] pgmap v9677: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail
2017-04-26 17:14:48.978376 mon.0 [INF] mgrmap e6679: active: alpha standbys: charlie, bravo
2017-04-26 17:14:50.046216 mon.0 [INF] pgmap v9678: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 140 B/s wr, 0 op/s
2017-04-26 17:14:51.145208 mon.0 [INF] pgmap v9679: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 418 B/s wr, 0 op/s
2017-04-26 17:14:53.224846 mon.0 [INF] pgmap v9680: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 366 B/s wr, 0 op/s
2017-04-26 17:14:53.253998 mon.0 [INF] mgrmap e6680: active: alpha standbys: charlie, bravo
2017-04-26 17:14:54.285503 mon.0 [INF] mgrmap e6681: active: alpha standbys: charlie, bravo
2017-04-26 17:14:54.325948 mon.0 [INF] pgmap v9681: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 413 B/s wr, 0 op/s
2017-04-26 17:14:56.325080 mon.0 [INF] pgmap v9682: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 363 B/s wr, 0 op/s
2017-04-26 17:14:57.066086 mon.0 [INF] HEALTH_WARN; clock skew detected on mon.bravo, mon.charlie; Monitor clock skew detected 
2017-04-26 17:14:57.374858 mon.0 [INF] pgmap v9683: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 363 B/s wr, 0 op/s
2017-04-26 17:14:58.447169 mon.0 [INF] mgrmap e6682: active: alpha standbys: charlie, bravo
2017-04-26 17:14:58.447343 mon.0 [INF] pgmap v9684: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail; 243 B/s wr, 0 op/s
2017-04-26 17:14:59.517296 mon.0 [INF] pgmap v9685: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail
2017-04-26 17:14:59.520918 mon.0 [INF] mgrmap e6683: active: alpha standbys: charlie, bravo
2017-04-26 17:15:00.525849 mon.0 [INF] pgmap v9686: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail
2017-04-26 17:15:02.580717 mon.0 [INF] pgmap v9687: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail
2017-04-26 17:15:03.612319 mon.0 [INF] mgrmap e6684: active: alpha standbys: charlie, bravo
2017-04-26 17:15:03.612406 mon.0 [INF] pgmap v9688: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail
2017-04-26 17:15:04.698350 mon.0 [INF] pgmap v9689: 592 pgs: 592 active+clean; 2313 MB data, 8404 MB used, 91491 MB / 99895 MB avail
2017-04-26 17:15:04.699788 mon.0 [INF] mgrmap e6685: active: alpha standbys: charlie, bravo

This keeps going on and never stops.

The version of Ceph:

ceph version 12.0.2 (5a1b6b3269da99a18984c138c23935e5eb96f73e)

I set 'debug_mgr = 20', tried again, and saw the following:

2017-04-26 17:16:52.673394 7fb687156700  1 mgr send_beacon active
2017-04-26 17:16:52.673443 7fb687156700 10 mgr send_beacon sending beacon as gid 144123
2017-04-26 17:16:53.202456 7fb685953700  4 mgr.server handle_report from 0x55780a128800 name 1
2017-04-26 17:16:53.203314 7fb685953700 20 mgr.server handle_report updating existing DaemonState for 1
2017-04-26 17:16:53.203913 7fb685953700 20 mgr update loading 0 new types, 0 old types, had 778 types, got 5398 bytes of data
2017-04-26 17:16:53.237267 7fb685953700  4 mgr.server handle_report from 0x557809faa800 name charlie
2017-04-26 17:16:53.238513 7fb685953700 20 mgr.server handle_report updating existing DaemonState for charlie
2017-04-26 17:16:53.239837 7fb685953700 20 mgr update loading 0 new types, 0 old types, had 778 types, got 1558 bytes of data
2017-04-26 17:16:53.266257 7fb685953700  4 mgr.server handle_report from 0x557809f71000 name charlie
2017-04-26 17:16:53.267244 7fb685953700 20 mgr.server handle_report updating existing DaemonState for charlie
2017-04-26 17:16:53.267887 7fb685953700 20 mgr update loading 0 new types, 0 old types, had 778 types, got 2006 bytes of data
2017-04-26 17:16:53.451550 7fb68915a700  4 mgr ms_dispatch active log(1 entries from seq 363 at 2017-04-26 17:16:52.414578) v1
2017-04-26 17:16:53.451561 7fb68915a700  4 mgr ms_dispatch log(1 entries from seq 363 at 2017-04-26 17:16:52.414578) v1
2017-04-26 17:16:53.451568 7fb68915a700 10 mgr notify_all notify_all: notify_all (clog)
2017-04-26 17:16:53.451598 7fb68915a700  4 mgr ms_dispatch active mgrmap(e 6728) v1
2017-04-26 17:16:53.451604 7fb68915a700  4 mgr handle_mgr_map received map epoch 6728
2017-04-26 17:16:53.451606 7fb68915a700  4 mgr handle_mgr_map active in map: 1 active is 144123
2017-04-26 17:16:53.451610 7fb68915a700 10 mgr handle_mgr_map I was already active
2017-04-26 17:16:53.476617 7fb685953700  4 mgr.server handle_report from 0x557809f6e000 name alpha
2017-04-26 17:16:53.476694 7fb685953700 20 mgr.server handle_report updating existing DaemonState for alpha
2017-04-26 17:16:53.476723 7fb685953700 20 mgr update loading 0 new types, 0 old types, had 778 types, got 1558 bytes of data
2017-04-26 17:16:53.519398 7fb685953700  4 mgr.server handle_report from 0x557809fa7800 name 5
2017-04-26 17:16:53.519637 7fb685953700 20 mgr.server handle_report updating existing DaemonState for 5
2017-04-26 17:16:53.519686 7fb685953700 20 mgr update loading 0 new types, 0 old types, had 778 types, got 4926 bytes of data
2017-04-26 17:16:53.848927 7fb685953700  4 mgr.server handle_report from 0x55780a250800 name 9
2017-04-26 17:16:53.849057 7fb685953700 20 mgr.server handle_report updating existing DaemonState for 9
2017-04-26 17:16:53.849110 7fb685953700 20 mgr update loading 0 new types, 0 old types, had 778 types, got 4926 bytes of data
2017-04-26 17:16:54.099747 7fb68915a700  4 mgr ms_dispatch active mgrdigest v1
2017-04-26 17:16:54.099755 7fb68915a700  4 mgr ms_dispatch mgrdigest v1
2017-04-26 17:16:54.099759 7fb68915a700 10 mgr handle_mgr_digest 807
2017-04-26 17:16:54.099763 7fb68915a700 10 mgr handle_mgr_digest 1576
2017-04-26 17:16:54.099803 7fb68915a700 10 mgr notify_all notify_all: notify_all mon_status
2017-04-26 17:16:54.099810 7fb68915a700 10 mgr notify_all notify_all: notify_all health
2017-04-26 17:16:54.099813 7fb68915a700 10 mgr notify_all notify_all: notify_all pg_summary
2017-04-26 17:16:54.099816 7fb68915a700 10 mgr handle_mgr_digest done.
2017-04-26 17:16:54.237151 7fb685953700  4 mgr.server handle_report from 0x557809f84000 name bravo
2017-04-26 17:16:54.237277 7fb685953700 20 mgr.server handle_report updating existing DaemonState for bravo
2017-04-26 17:16:54.237333 7fb685953700 20 mgr update loading 0 new types, 0 old types, had 778 types, got 1558 bytes of data
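For reference, the debug level used to capture the trace above can be set in ceph.conf under the [mgr] section (a minimal sketch; the daemon name mgr.alpha below is taken from this report):

```ini
[mgr]
# Verbose ceph-mgr logging (0-20); 20 produces the handle_report /
# send_beacon trace shown above
debug_mgr = 20
```

It can also be changed at runtime on the active manager host via the admin socket, e.g. `ceph daemon mgr.alpha config set debug_mgr 20`, without restarting the daemon.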

The manager daemon is all new to me, so I can't tell for sure what is going on here.

But I assume the pgmap shouldn't be incrementing at that rate, nor should the mgrmap.

History

#1 Updated by Greg Farnum over 2 years ago

  • Project changed from Ceph to mgr
  • Category deleted (110)

#2 Updated by Sage Weil over 2 years ago

  • Status changed from New to Closed

This is normal. The pgmap epochs now don't really mean anything (they are only periodically persisted in an abbreviated form). Also the log spam is gone (it was annoying).
