Bug #51433

closed

mgr spamming with repeated set pgp_num_actual while merging

Added by Dan van der Ster almost 3 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: octopus,pacific
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): Manager (RADOS bits)
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

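While merging PGs, our osdmaps are churning through ~2000 epochs per hour.
The diffs between consecutive osdmaps are effectively empty; only the epoch, the modified timestamp, and the pool's last_change differ: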

# diff 302192.txt 302193.txt
1c1
< epoch 302192
---
> epoch 302193
4c4
< modified 2021-06-29 16:43:16.224508
---
> modified 2021-06-29 16:43:18.285394
25c25
< pool 35 'default.rgw.buckets.data' erasure size 6 min_size 5 crush_rule 3 object_hash rjenkins pg_num 4190 pgp_num 4170 pg_num_target 4096 pgp_num_target 4096 last_change 302192 lfor 0/300438/300436 flags hashpspool,nodelete,nosizechange stripe_width 4096 fast_read 1 application rgw
---
> pool 35 'default.rgw.buckets.data' erasure size 6 min_size 5 crush_rule 3 object_hash rjenkins pg_num 4190 pgp_num 4170 pg_num_target 4096 pgp_num_target 4096 last_change 302193 lfor 0/300438/300436 flags hashpspool,nodelete,nosizechange stripe_width 4096 fast_read 1 application rgw
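
To check many consecutive epochs without reading each diff by hand, the comparison can be scripted. This is a minimal sketch, assuming the files are plain-text osdmap dumps shaped like the two above; the function names and the choice of which fields to ignore (epoch, modified, last_change) are illustrative assumptions, not part of any Ceph tool.

```python
import re

# Lines that change on every epoch bump even when nothing else does.
VOLATILE_LINES = [
    re.compile(r"^epoch \d+$"),
    re.compile(r"^modified \S+ \S+$"),
]
# Per-pool field that records the epoch of the last change.
LAST_CHANGE = re.compile(r"\blast_change \d+\b")


def normalize(osdmap_text):
    """Return the dump's lines with the bookkeeping fields masked out."""
    lines = []
    for line in osdmap_text.splitlines():
        line = line.rstrip()
        if any(p.match(line) for p in VOLATILE_LINES):
            continue
        lines.append(LAST_CHANGE.sub("last_change N", line))
    return lines


def only_bookkeeping_changed(old_text, new_text):
    """True if two dumps differ only in epoch, modified, or last_change."""
    return normalize(old_text) == normalize(new_text)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python osdmap_same.py 302192.txt 302193.txt
    with open(sys.argv[1]) as f_old, open(sys.argv[2]) as f_new:
        same = only_bookkeeping_changed(f_old.read(), f_new.read())
    print("only bookkeeping changed" if same else "real change")
```

A real change, such as a different pgp_num, still makes the two dumps compare as different.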

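I found that the mgr is spamming the mons with the same "osd pool set ... pgp_num_actual 4170" command roughly every two seconds, even though pgp_num is already 4170: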

2021-06-29 16:59:15.756021 mon.cephgabe-mon-7bb8b9e3d5 (mon.2) 216166 : audit [INF] from='mgr.490803962 [xxx]:0/2660941' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 16:59:15.757416 mon.cephgabe0 (mon.0) 1134697 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 16:59:16.188506 mon.cephgabe0 (mon.0) 1134698 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd='[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]': finished
2021-06-29 16:59:17.819641 mon.cephgabe-mon-7bb8b9e3d5 (mon.2) 216167 : audit [INF] from='mgr.490803962 [xxx]:0/2660941' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 16:59:17.820362 mon.cephgabe0 (mon.0) 1134700 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 16:59:18.249184 mon.cephgabe0 (mon.0) 1134701 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd='[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]': finished
2021-06-29 16:59:19.166836 mon.cephgabe0 (mon.0) 1134703 : audit [DBG] from='client.? [xxx]:0/1791927115' entity='client.admin' cmd=[{"prefix": "status", "format": "json"}]: dispatch
2021-06-29 16:59:19.831892 mon.cephgabe-mon-7bb8b9e3d5 (mon.2) 216168 : audit [INF] from='mgr.490803962 [xxx]:0/2660941' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 16:59:19.832719 mon.cephgabe0 (mon.0) 1134704 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 16:59:20.330562 mon.cephgabe0 (mon.0) 1134705 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd='[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]': finished
2021-06-29 16:59:20.590091 mon.cephgabe0 (mon.0) 1134707 : audit [DBG] from='client.? [xxx]:0/3061431364' entity='client.admin' cmd=[{"prefix": "status", "format": "json"}]: dispatch
2021-06-29 16:59:21.868946 mon.cephgabe-mon-7bb8b9e3d5 (mon.2) 216169 : audit [INF] from='mgr.490803962 [xxx]:0/2660941' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 16:59:21.869712 mon.cephgabe0 (mon.0) 1134708 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 16:59:22.014923 mon.cephgabe0 (mon.0) 1134709 : audit [DBG] from='client.? [xxx]:0/1894286236' entity='client.admin' cmd=[{"prefix": "status", "format": "json"}]: dispatch
2021-06-29 16:59:22.383554 mon.cephgabe0 (mon.0) 1134710 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd='[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]': finished
2021-06-29 16:59:23.441387 mon.cephgabe0 (mon.0) 1134712 : audit [DBG] from='client.? [xxx]:0/1242539268' entity='client.admin' cmd=[{"prefix": "status", "format": "json"}]: dispatch
2021-06-29 16:59:23.891402 mon.cephgabe-mon-7bb8b9e3d5 (mon.2) 216170 : audit [INF] from='mgr.490803962 [xxx]:0/2660941' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 16:59:23.892950 mon.cephgabe0 (mon.0) 1134713 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
...
2021-06-29 17:17:24.082373 mon.cephgabe0 (mon.0) 1136634 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 17:17:24.578802 mon.cephgabe0 (mon.0) 1136635 : audit [DBG] from='client.? [xxx]:0/3837647218' entity='client.admin' cmd=[{"prefix": "status", "format": "json"}]: dispatch
2021-06-29 17:17:25.012672 mon.cephgabe0 (mon.0) 1136636 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd='[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]': finished
2021-06-29 17:17:26.007230 mon.cephgabe0 (mon.0) 1136638 : audit [DBG] from='client.? [xxx]:0/4094418758' entity='client.admin' cmd=[{"prefix": "status", "format": "json"}]: dispatch
2021-06-29 17:17:26.093214 mon.cephgabe-mon-7bb8b9e3d5 (mon.2) 216759 : audit [INF] from='mgr.490803962 [xxx]:0/2660941' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 17:17:26.093977 mon.cephgabe0 (mon.0) 1136639 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 17:17:27.044498 mon.cephgabe0 (mon.0) 1136640 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd='[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]': finished
2021-06-29 17:17:27.431356 mon.cephgabe0 (mon.0) 1136642 : audit [DBG] from='client.? [xxx]:0/2297050664' entity='client.admin' cmd=[{"prefix": "status", "format": "json"}]: dispatch
2021-06-29 17:17:28.125837 mon.cephgabe-mon-7bb8b9e3d5 (mon.2) 216760 : audit [INF] from='mgr.490803962 [xxx]:0/2660941' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 17:17:28.126546 mon.cephgabe0 (mon.0) 1136643 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 17:17:28.861731 mon.cephgabe0 (mon.0) 1136644 : audit [DBG] from='client.? [xxx]:0/2133339951' entity='client.admin' cmd=[{"prefix": "status", "format": "json"}]: dispatch
2021-06-29 17:17:29.125072 mon.cephgabe0 (mon.0) 1136645 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd='[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]': finished
2021-06-29 17:17:30.141884 mon.cephgabe-mon-7bb8b9e3d5 (mon.2) 216761 : audit [INF] from='mgr.490803962 [xxx]:0/2660941' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
2021-06-29 17:17:30.142422 mon.cephgabe0 (mon.0) 1136647 : audit [INF] from='mgr.490803962 ' entity='mgr.cephgabe-mon-7bb8b9e3d5' cmd=[{"prefix": "osd pool set", "pool": "default.rgw.buckets.data", "var": "pgp_num_actual", "val": "4170"}]: dispatch
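
To quantify the repetition, the audit log can be tallied per (pool, var, val). This is a rough sketch that assumes audit lines in the format pasted above; the regular expression and function name are illustrative, and counting only "finished" entries is an assumption meant to approximate one count per committed command.

```python
import json
import re
from collections import Counter

# Matches the trailing cmd=[...]: dispatch  or  cmd='[...]': finished part.
CMD_RE = re.compile(r"cmd='?(\[.*\])'?: (dispatch|finished)$")


def count_finished_pool_sets(log_lines):
    """Count finished 'osd pool set' commands, keyed by (pool, var, val)."""
    counts = Counter()
    for line in log_lines:
        m = CMD_RE.search(line.strip())
        if not m or m.group(2) != "finished":
            continue
        try:
            cmds = json.loads(m.group(1))
        except ValueError:
            continue  # not valid JSON; skip rather than guess
        for cmd in cmds:
            if cmd.get("prefix") == "osd pool set":
                key = (cmd.get("pool"), cmd.get("var"), cmd.get("val"))
                counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python count_sets.py < audit-log-file
    for key, n in count_finished_pool_sets(sys.stdin).most_common(10):
        print(n, *key)
```

A key that shows up hundreds of times with an unchanged val points at the same repeated no-op set seen here.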

It's a bug in DaemonServer::adjust_pgs: the mgr keeps re-sending a pgp_num_actual value that is already in effect.


Related issues 3 (0 open, 3 closed)

Copied to RADOS - Backport #51496: octopus: mgr spamming with repeated set pgp_num_actual while merging (Resolved, Neha Ojha)
Copied to RADOS - Backport #51497: nautilus: mgr spamming with repeated set pgp_num_actual while merging (Rejected, Konstantin Shalygin)
Copied to RADOS - Backport #51498: pacific: mgr spamming with repeated set pgp_num_actual while merging (Resolved, Cory Snyder)
#1

Updated by Dan van der Ster almost 3 years ago

  • Backport set to nautilus,octopus,pacific
  • Pull request ID set to 42105
#2

Updated by Neha Ojha almost 3 years ago

  • Status changed from New to Fix Under Review
#3

Updated by Kefu Chai almost 3 years ago

  • Status changed from Fix Under Review to Pending Backport
#4

Updated by Backport Bot almost 3 years ago

  • Copied to Backport #51496: octopus: mgr spamming with repeated set pgp_num_actual while merging added
#5

Updated by Backport Bot almost 3 years ago

  • Copied to Backport #51497: nautilus: mgr spamming with repeated set pgp_num_actual while merging added
#6

Updated by Backport Bot almost 3 years ago

  • Copied to Backport #51498: pacific: mgr spamming with repeated set pgp_num_actual while merging added
#7

Updated by Neha Ojha about 2 years ago

  • Status changed from Pending Backport to Resolved
  • Backport changed from nautilus,octopus,pacific to octopus,pacific

nautilus is EOL
