prometheus module slows command execution and interrupts recovery
On our cluster with 2000+ OSDs we observe high CPU usage by ceph-mgr, similar to https://tracker.ceph.com/issues/44495.
In addition, less than 30 minutes after ceph-mgr is started, it begins to noticeably slow down command execution (for example `ceph osd df`) and even interrupts the recovery process.
We confirmed the cause by disabling the prometheus module for a couple of hours: CPU usage remained high, but command execution was not affected as long as the module was disabled.
We have already tried increasing the Prometheus scrape interval, but that only delays the onset of the issue a little.
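
A slowdown like the one described above can be quantified by timing a command such as `ceph osd df` at regular intervals after a ceph-mgr restart, once with the prometheus module enabled and once with it disabled. The following is only a minimal sketch of such a measurement, not something taken from this report: the helper names `time_command` and `sample_latency`, the 120 second timeout, and the sampling values are all hypothetical, and it assumes the `ceph` CLI is on the PATH with admin credentials.

```python
import subprocess
import time


def time_command(cmd, timeout=120.0):
    """Run cmd once and return its wall-clock duration in seconds.

    Returns None if the command does not finish within `timeout` seconds.
    """
    start = time.perf_counter()
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=False,  # a non-zero exit status is still a timing sample
        )
    except subprocess.TimeoutExpired:
        return None
    return time.perf_counter() - start


def sample_latency(cmd, samples=3, pause=0.0):
    """Time cmd `samples` times, sleeping `pause` seconds between runs."""
    results = []
    for i in range(samples):
        results.append(time_command(cmd))
        if i < samples - 1:
            time.sleep(pause)
    return results


if __name__ == "__main__":
    # Assumption: ten samples, one minute apart; adjust to cover the
    # period after the mgr restart that is of interest.
    for elapsed in sample_latency(["ceph", "osd", "df"], samples=10, pause=60.0):
        print("timeout" if elapsed is None else f"{elapsed:.2f}s")
```

Comparing the series from both runs would show whether latency grows over time only while the module is enabled.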