Bug #44705

Prometheus metrics lint (and other) issues

Added by Karlis Mikelsons about 4 years ago. Updated almost 4 years ago.

Status: New
Priority: Normal
Assignee: -
Category: prometheus module
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Ceph's Prometheus metrics fail validation with promtool:

$ curl -s http://ceph1a:9283/metrics | promtool check metrics
ceph_bluefs_bytes_written_slow counter metrics should have "_total" suffix
ceph_bluefs_bytes_written_sst counter metrics should have "_total" suffix
ceph_bluefs_bytes_written_wal counter metrics should have "_total" suffix
ceph_bluefs_logged_bytes counter metrics should have "_total" suffix
ceph_bluefs_read_bytes counter metrics should have "_total" suffix
ceph_bluefs_read_prefetch_bytes counter metrics should have "_total" suffix
ceph_bluefs_read_random_buffer_bytes counter metrics should have "_total" suffix
ceph_bluefs_read_random_bytes counter metrics should have "_total" suffix
ceph_bluefs_read_random_disk_bytes counter metrics should have "_total" suffix
ceph_bluestore_commit_lat_count counter metrics should have "_total" suffix
ceph_bluestore_commit_lat_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_bluestore_commit_lat_sum counter metrics should have "_total" suffix
ceph_bluestore_commit_lat_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_bluestore_kv_final_lat_count counter metrics should have "_total" suffix
ceph_bluestore_kv_final_lat_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_bluestore_kv_final_lat_sum counter metrics should have "_total" suffix
ceph_bluestore_kv_final_lat_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_bluestore_kv_flush_lat_count counter metrics should have "_total" suffix
ceph_bluestore_kv_flush_lat_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_bluestore_kv_flush_lat_sum counter metrics should have "_total" suffix
ceph_bluestore_kv_flush_lat_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_bluestore_kv_sync_lat_count counter metrics should have "_total" suffix
ceph_bluestore_kv_sync_lat_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_bluestore_kv_sync_lat_sum counter metrics should have "_total" suffix
ceph_bluestore_kv_sync_lat_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_bluestore_read_lat_count counter metrics should have "_total" suffix
ceph_bluestore_read_lat_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_bluestore_read_lat_sum counter metrics should have "_total" suffix
ceph_bluestore_read_lat_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_bluestore_state_aio_wait_lat_count counter metrics should have "_total" suffix
ceph_bluestore_state_aio_wait_lat_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_bluestore_state_aio_wait_lat_sum counter metrics should have "_total" suffix
ceph_bluestore_state_aio_wait_lat_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_bluestore_submit_lat_count counter metrics should have "_total" suffix
ceph_bluestore_submit_lat_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_bluestore_submit_lat_sum counter metrics should have "_total" suffix
ceph_bluestore_submit_lat_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_bluestore_throttle_lat_count counter metrics should have "_total" suffix
ceph_bluestore_throttle_lat_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_bluestore_throttle_lat_sum counter metrics should have "_total" suffix
ceph_bluestore_throttle_lat_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_data_sync_from_fr5_fetch_bytes_count counter metrics should have "_total" suffix
ceph_data_sync_from_fr5_fetch_bytes_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_data_sync_from_fr5_fetch_bytes_sum counter metrics should have "_total" suffix
ceph_data_sync_from_fr5_fetch_bytes_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_data_sync_from_fr5_fetch_errors counter metrics should have "_total" suffix
ceph_data_sync_from_fr5_fetch_not_modified counter metrics should have "_total" suffix
ceph_data_sync_from_fr5_poll_errors counter metrics should have "_total" suffix
ceph_data_sync_from_fr5_poll_latency_count counter metrics should have "_total" suffix
ceph_data_sync_from_fr5_poll_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_data_sync_from_fr5_poll_latency_sum counter metrics should have "_total" suffix
ceph_data_sync_from_fr5_poll_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_mon_election_call counter metrics should have "_total" suffix
ceph_mon_election_lose counter metrics should have "_total" suffix
ceph_mon_election_win counter metrics should have "_total" suffix
ceph_mon_num_elections counter metrics should have "_total" suffix
ceph_mon_session_add counter metrics should have "_total" suffix
ceph_mon_session_rm counter metrics should have "_total" suffix
ceph_mon_session_trim counter metrics should have "_total" suffix
ceph_objecter_0x55bbe3862310_op_r counter metrics should have "_total" suffix
ceph_objecter_0x55bbe3862310_op_rmw counter metrics should have "_total" suffix
ceph_objecter_0x55bbe3862310_op_w counter metrics should have "_total" suffix
ceph_objecter_0x55bbe38633b0_op_r counter metrics should have "_total" suffix
ceph_objecter_0x55bbe38633b0_op_rmw counter metrics should have "_total" suffix
ceph_objecter_0x55bbe38633b0_op_w counter metrics should have "_total" suffix
ceph_objecter_0x55c7836545b0_op_r counter metrics should have "_total" suffix
ceph_objecter_0x55c7836545b0_op_rmw counter metrics should have "_total" suffix
ceph_objecter_0x55c7836545b0_op_w counter metrics should have "_total" suffix
ceph_objecter_0x55c786f91650_op_r counter metrics should have "_total" suffix
ceph_objecter_0x55c786f91650_op_rmw counter metrics should have "_total" suffix
ceph_objecter_0x55c786f91650_op_w counter metrics should have "_total" suffix
ceph_objecter_0x55e899b8a3f0_op_r counter metrics should have "_total" suffix
ceph_objecter_0x55e899b8a3f0_op_rmw counter metrics should have "_total" suffix
ceph_objecter_0x55e899b8a3f0_op_w counter metrics should have "_total" suffix
ceph_objecter_0x55e89b2ae7e0_op_r counter metrics should have "_total" suffix
ceph_objecter_0x55e89b2ae7e0_op_rmw counter metrics should have "_total" suffix
ceph_objecter_0x55e89b2ae7e0_op_w counter metrics should have "_total" suffix
ceph_objecter_op_r counter metrics should have "_total" suffix
ceph_objecter_op_rmw counter metrics should have "_total" suffix
ceph_objecter_op_w counter metrics should have "_total" suffix
ceph_osd_apply_latency_ms metric names should not contain abbreviated units
ceph_osd_commit_latency_ms metric names should not contain abbreviated units
ceph_osd_op counter metrics should have "_total" suffix
ceph_osd_op_in_bytes counter metrics should have "_total" suffix
ceph_osd_op_latency_count counter metrics should have "_total" suffix
ceph_osd_op_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_osd_op_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_osd_op_out_bytes counter metrics should have "_total" suffix
ceph_osd_op_prepare_latency_count counter metrics should have "_total" suffix
ceph_osd_op_prepare_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_osd_op_prepare_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_prepare_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_osd_op_process_latency_count counter metrics should have "_total" suffix
ceph_osd_op_process_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_osd_op_process_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_process_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_osd_op_r counter metrics should have "_total" suffix
ceph_osd_op_r_latency_count counter metrics should have "_total" suffix
ceph_osd_op_r_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_osd_op_r_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_r_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_osd_op_r_out_bytes counter metrics should have "_total" suffix
ceph_osd_op_r_prepare_latency_count counter metrics should have "_total" suffix
ceph_osd_op_r_prepare_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_osd_op_r_prepare_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_r_prepare_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_osd_op_r_process_latency_count counter metrics should have "_total" suffix
ceph_osd_op_r_process_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_osd_op_r_process_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_r_process_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_osd_op_rw counter metrics should have "_total" suffix
ceph_osd_op_rw_in_bytes counter metrics should have "_total" suffix
ceph_osd_op_rw_latency_count counter metrics should have "_total" suffix
ceph_osd_op_rw_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_osd_op_rw_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_rw_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_osd_op_rw_out_bytes counter metrics should have "_total" suffix
ceph_osd_op_rw_prepare_latency_count counter metrics should have "_total" suffix
ceph_osd_op_rw_prepare_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_osd_op_rw_prepare_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_rw_prepare_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_osd_op_rw_process_latency_count counter metrics should have "_total" suffix
ceph_osd_op_rw_process_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_osd_op_rw_process_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_rw_process_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_osd_op_w counter metrics should have "_total" suffix
ceph_osd_op_w_in_bytes counter metrics should have "_total" suffix
ceph_osd_op_w_latency_count counter metrics should have "_total" suffix
ceph_osd_op_w_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_osd_op_w_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_w_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_osd_op_w_prepare_latency_count counter metrics should have "_total" suffix
ceph_osd_op_w_prepare_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_osd_op_w_prepare_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_w_prepare_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_osd_op_w_process_latency_count counter metrics should have "_total" suffix
ceph_osd_op_w_process_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_osd_op_w_process_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_w_process_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_osd_recovery_bytes counter metrics should have "_total" suffix
ceph_osd_recovery_ops counter metrics should have "_total" suffix
ceph_paxos_accept_timeout counter metrics should have "_total" suffix
ceph_paxos_begin counter metrics should have "_total" suffix
ceph_paxos_begin_bytes_count counter metrics should have "_total" suffix
ceph_paxos_begin_bytes_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_begin_bytes_sum counter metrics should have "_total" suffix
ceph_paxos_begin_bytes_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_begin_keys_count counter metrics should have "_total" suffix
ceph_paxos_begin_keys_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_begin_keys_sum counter metrics should have "_total" suffix
ceph_paxos_begin_keys_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_begin_latency_count counter metrics should have "_total" suffix
ceph_paxos_begin_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_begin_latency_sum counter metrics should have "_total" suffix
ceph_paxos_begin_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_collect counter metrics should have "_total" suffix
ceph_paxos_collect_bytes_count counter metrics should have "_total" suffix
ceph_paxos_collect_bytes_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_collect_bytes_sum counter metrics should have "_total" suffix
ceph_paxos_collect_bytes_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_collect_keys_count counter metrics should have "_total" suffix
ceph_paxos_collect_keys_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_collect_keys_sum counter metrics should have "_total" suffix
ceph_paxos_collect_keys_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_collect_latency_count counter metrics should have "_total" suffix
ceph_paxos_collect_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_collect_latency_sum counter metrics should have "_total" suffix
ceph_paxos_collect_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_collect_timeout counter metrics should have "_total" suffix
ceph_paxos_collect_uncommitted counter metrics should have "_total" suffix
ceph_paxos_commit counter metrics should have "_total" suffix
ceph_paxos_commit_bytes_count counter metrics should have "_total" suffix
ceph_paxos_commit_bytes_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_commit_bytes_sum counter metrics should have "_total" suffix
ceph_paxos_commit_bytes_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_commit_keys_count counter metrics should have "_total" suffix
ceph_paxos_commit_keys_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_commit_keys_sum counter metrics should have "_total" suffix
ceph_paxos_commit_keys_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_commit_latency_count counter metrics should have "_total" suffix
ceph_paxos_commit_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_commit_latency_sum counter metrics should have "_total" suffix
ceph_paxos_commit_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_lease_ack_timeout counter metrics should have "_total" suffix
ceph_paxos_lease_timeout counter metrics should have "_total" suffix
ceph_paxos_new_pn counter metrics should have "_total" suffix
ceph_paxos_new_pn_latency_count counter metrics should have "_total" suffix
ceph_paxos_new_pn_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_new_pn_latency_sum counter metrics should have "_total" suffix
ceph_paxos_new_pn_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_refresh counter metrics should have "_total" suffix
ceph_paxos_refresh_latency_count counter metrics should have "_total" suffix
ceph_paxos_refresh_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_refresh_latency_sum counter metrics should have "_total" suffix
ceph_paxos_refresh_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_restart counter metrics should have "_total" suffix
ceph_paxos_share_state counter metrics should have "_total" suffix
ceph_paxos_share_state_bytes_count counter metrics should have "_total" suffix
ceph_paxos_share_state_bytes_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_share_state_bytes_sum counter metrics should have "_total" suffix
ceph_paxos_share_state_bytes_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_share_state_keys_count counter metrics should have "_total" suffix
ceph_paxos_share_state_keys_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_share_state_keys_sum counter metrics should have "_total" suffix
ceph_paxos_share_state_keys_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_start_leader counter metrics should have "_total" suffix
ceph_paxos_start_peon counter metrics should have "_total" suffix
ceph_paxos_store_state counter metrics should have "_total" suffix
ceph_paxos_store_state_bytes_count counter metrics should have "_total" suffix
ceph_paxos_store_state_bytes_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_store_state_bytes_sum counter metrics should have "_total" suffix
ceph_paxos_store_state_bytes_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_store_state_keys_count counter metrics should have "_total" suffix
ceph_paxos_store_state_keys_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_store_state_keys_sum counter metrics should have "_total" suffix
ceph_paxos_store_state_keys_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_paxos_store_state_latency_count counter metrics should have "_total" suffix
ceph_paxos_store_state_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_paxos_store_state_latency_sum counter metrics should have "_total" suffix
ceph_paxos_store_state_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_pg_total non-counter metrics should not have "_total" suffix
ceph_pool_recovering_bytes_per_sec metric names should not contain abbreviated units
ceph_pool_recovering_keys_per_sec metric names should not contain abbreviated units
ceph_pool_recovering_objects_per_sec metric names should not contain abbreviated units
ceph_prioritycache:data_committed_bytes metric names should not contain ':'
ceph_prioritycache:data_pri0_bytes metric names should not contain ':'
ceph_prioritycache:data_pri10_bytes metric names should not contain ':'
ceph_prioritycache:data_pri11_bytes metric names should not contain ':'
ceph_prioritycache:data_pri1_bytes metric names should not contain ':'
ceph_prioritycache:data_pri2_bytes metric names should not contain ':'
ceph_prioritycache:data_pri3_bytes metric names should not contain ':'
ceph_prioritycache:data_pri4_bytes metric names should not contain ':'
ceph_prioritycache:data_pri5_bytes metric names should not contain ':'
ceph_prioritycache:data_pri6_bytes metric names should not contain ':'
ceph_prioritycache:data_pri7_bytes metric names should not contain ':'
ceph_prioritycache:data_pri8_bytes metric names should not contain ':'
ceph_prioritycache:data_pri9_bytes metric names should not contain ':'
ceph_prioritycache:data_reserved_bytes metric names should not contain ':'
ceph_prioritycache:full_committed_bytes metric names should not contain ':'
ceph_prioritycache:full_pri0_bytes metric names should not contain ':'
ceph_prioritycache:full_pri10_bytes metric names should not contain ':'
ceph_prioritycache:full_pri11_bytes metric names should not contain ':'
ceph_prioritycache:full_pri1_bytes metric names should not contain ':'
ceph_prioritycache:full_pri2_bytes metric names should not contain ':'
ceph_prioritycache:full_pri3_bytes metric names should not contain ':'
ceph_prioritycache:full_pri4_bytes metric names should not contain ':'
ceph_prioritycache:full_pri5_bytes metric names should not contain ':'
ceph_prioritycache:full_pri6_bytes metric names should not contain ':'
ceph_prioritycache:full_pri7_bytes metric names should not contain ':'
ceph_prioritycache:full_pri8_bytes metric names should not contain ':'
ceph_prioritycache:full_pri9_bytes metric names should not contain ':'
ceph_prioritycache:full_reserved_bytes metric names should not contain ':'
ceph_prioritycache:inc_committed_bytes metric names should not contain ':'
ceph_prioritycache:inc_pri0_bytes metric names should not contain ':'
ceph_prioritycache:inc_pri10_bytes metric names should not contain ':'
ceph_prioritycache:inc_pri11_bytes metric names should not contain ':'
ceph_prioritycache:inc_pri1_bytes metric names should not contain ':'
ceph_prioritycache:inc_pri2_bytes metric names should not contain ':'
ceph_prioritycache:inc_pri3_bytes metric names should not contain ':'
ceph_prioritycache:inc_pri4_bytes metric names should not contain ':'
ceph_prioritycache:inc_pri5_bytes metric names should not contain ':'
ceph_prioritycache:inc_pri6_bytes metric names should not contain ':'
ceph_prioritycache:inc_pri7_bytes metric names should not contain ':'
ceph_prioritycache:inc_pri8_bytes metric names should not contain ':'
ceph_prioritycache:inc_pri9_bytes metric names should not contain ':'
ceph_prioritycache:inc_reserved_bytes metric names should not contain ':'
ceph_prioritycache:kv_committed_bytes metric names should not contain ':'
ceph_prioritycache:kv_pri0_bytes metric names should not contain ':'
ceph_prioritycache:kv_pri10_bytes metric names should not contain ':'
ceph_prioritycache:kv_pri11_bytes metric names should not contain ':'
ceph_prioritycache:kv_pri1_bytes metric names should not contain ':'
ceph_prioritycache:kv_pri2_bytes metric names should not contain ':'
ceph_prioritycache:kv_pri3_bytes metric names should not contain ':'
ceph_prioritycache:kv_pri4_bytes metric names should not contain ':'
ceph_prioritycache:kv_pri5_bytes metric names should not contain ':'
ceph_prioritycache:kv_pri6_bytes metric names should not contain ':'
ceph_prioritycache:kv_pri7_bytes metric names should not contain ':'
ceph_prioritycache:kv_pri8_bytes metric names should not contain ':'
ceph_prioritycache:kv_pri9_bytes metric names should not contain ':'
ceph_prioritycache:kv_reserved_bytes metric names should not contain ':'
ceph_prioritycache:meta_committed_bytes metric names should not contain ':'
ceph_prioritycache:meta_pri0_bytes metric names should not contain ':'
ceph_prioritycache:meta_pri10_bytes metric names should not contain ':'
ceph_prioritycache:meta_pri11_bytes metric names should not contain ':'
ceph_prioritycache:meta_pri1_bytes metric names should not contain ':'
ceph_prioritycache:meta_pri2_bytes metric names should not contain ':'
ceph_prioritycache:meta_pri3_bytes metric names should not contain ':'
ceph_prioritycache:meta_pri4_bytes metric names should not contain ':'
ceph_prioritycache:meta_pri5_bytes metric names should not contain ':'
ceph_prioritycache:meta_pri6_bytes metric names should not contain ':'
ceph_prioritycache:meta_pri7_bytes metric names should not contain ':'
ceph_prioritycache:meta_pri8_bytes metric names should not contain ':'
ceph_prioritycache:meta_pri9_bytes metric names should not contain ':'
ceph_prioritycache:meta_reserved_bytes metric names should not contain ':'
ceph_rgw_cache_hit counter metrics should have "_total" suffix
ceph_rgw_cache_miss counter metrics should have "_total" suffix
ceph_rgw_failed_req counter metrics should have "_total" suffix
ceph_rgw_gc_retire_object counter metrics should have "_total" suffix
ceph_rgw_get counter metrics should have "_total" suffix
ceph_rgw_get_b counter metrics should have "_total" suffix
ceph_rgw_get_b metric names should not contain abbreviated units
ceph_rgw_get_initial_lat_count counter metrics should have "_total" suffix
ceph_rgw_get_initial_lat_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_rgw_get_initial_lat_sum counter metrics should have "_total" suffix
ceph_rgw_get_initial_lat_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_rgw_keystone_token_cache_hit counter metrics should have "_total" suffix
ceph_rgw_keystone_token_cache_miss counter metrics should have "_total" suffix
ceph_rgw_pubsub_event_lost counter metrics should have "_total" suffix
ceph_rgw_pubsub_event_triggered counter metrics should have "_total" suffix
ceph_rgw_pubsub_missing_conf counter metrics should have "_total" suffix
ceph_rgw_pubsub_push_failed counter metrics should have "_total" suffix
ceph_rgw_pubsub_push_ok counter metrics should have "_total" suffix
ceph_rgw_pubsub_store_fail counter metrics should have "_total" suffix
ceph_rgw_pubsub_store_ok counter metrics should have "_total" suffix
ceph_rgw_put counter metrics should have "_total" suffix
ceph_rgw_put_b counter metrics should have "_total" suffix
ceph_rgw_put_b metric names should not contain abbreviated units
ceph_rgw_put_initial_lat_count counter metrics should have "_total" suffix
ceph_rgw_put_initial_lat_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_rgw_put_initial_lat_sum counter metrics should have "_total" suffix
ceph_rgw_put_initial_lat_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_rgw_req counter metrics should have "_total" suffix
ceph_rocksdb_compact counter metrics should have "_total" suffix
ceph_rocksdb_compact_queue_merge counter metrics should have "_total" suffix
ceph_rocksdb_compact_range counter metrics should have "_total" suffix
ceph_rocksdb_get counter metrics should have "_total" suffix
ceph_rocksdb_get_latency_count counter metrics should have "_total" suffix
ceph_rocksdb_get_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_rocksdb_get_latency_sum counter metrics should have "_total" suffix
ceph_rocksdb_get_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_rocksdb_rocksdb_write_delay_time_count counter metrics should have "_total" suffix
ceph_rocksdb_rocksdb_write_delay_time_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_rocksdb_rocksdb_write_delay_time_sum counter metrics should have "_total" suffix
ceph_rocksdb_rocksdb_write_delay_time_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_rocksdb_rocksdb_write_memtable_time_count counter metrics should have "_total" suffix
ceph_rocksdb_rocksdb_write_memtable_time_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_rocksdb_rocksdb_write_memtable_time_sum counter metrics should have "_total" suffix
ceph_rocksdb_rocksdb_write_memtable_time_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_rocksdb_rocksdb_write_pre_and_post_time_count counter metrics should have "_total" suffix
ceph_rocksdb_rocksdb_write_pre_and_post_time_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_rocksdb_rocksdb_write_pre_and_post_time_sum counter metrics should have "_total" suffix
ceph_rocksdb_rocksdb_write_pre_and_post_time_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_rocksdb_rocksdb_write_wal_time_count counter metrics should have "_total" suffix
ceph_rocksdb_rocksdb_write_wal_time_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_rocksdb_rocksdb_write_wal_time_sum counter metrics should have "_total" suffix
ceph_rocksdb_rocksdb_write_wal_time_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_rocksdb_submit_latency_count counter metrics should have "_total" suffix
ceph_rocksdb_submit_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_rocksdb_submit_latency_sum counter metrics should have "_total" suffix
ceph_rocksdb_submit_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_rocksdb_submit_sync_latency_count counter metrics should have "_total" suffix
ceph_rocksdb_submit_sync_latency_count non-histogram and non-summary metrics should not have "_count" suffix
ceph_rocksdb_submit_sync_latency_sum counter metrics should have "_total" suffix
ceph_rocksdb_submit_sync_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_rocksdb_submit_transaction counter metrics should have "_total" suffix
ceph_rocksdb_submit_transaction_sync counter metrics should have "_total" suffix
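
The output above repeats a handful of rules many times. As a rough aid for triage, the following sketch groups the complaints by rule. It assumes only what the output shows, namely that each line is a metric name followed by a space and the message; the function name `summarize_lint` and the `sample` text are illustrative, not part of promtool or Ceph.

```python
from collections import Counter


def summarize_lint(output: str) -> Counter:
    """Count promtool complaints per rule, ignoring the metric name."""
    counts = Counter()
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumed line shape: "<metric_name> <message>"; names contain no spaces.
        _name, _, message = line.partition(" ")
        counts[message] += 1
    return counts


# A few lines copied from the report, used here as sample input.
sample = """\
ceph_osd_op counter metrics should have "_total" suffix
ceph_osd_op_latency_sum counter metrics should have "_total" suffix
ceph_osd_op_latency_sum non-histogram and non-summary metrics should not have "_sum" suffix
ceph_pg_total non-counter metrics should not have "_total" suffix
ceph_prioritycache:kv_pri0_bytes metric names should not contain ':'
"""

for message, count in summarize_lint(sample).most_common():
    print(f"{count:4d}  {message}")
```

In practice the input would be the saved output of the `promtool check metrics` command shown at the top of the description.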

In addition, there are other issues:
1) It is generally recommended to report time in seconds rather than milliseconds:

$ curl -s http://ceph1a:9283/metrics | grep ms
# HELP ceph_osd_commit_latency_ms OSD stat commit_latency_ms
# TYPE ceph_osd_commit_latency_ms gauge
ceph_osd_commit_latency_ms{ceph_daemon="osd.1"} 2.0
ceph_osd_commit_latency_ms{ceph_daemon="osd.2"} 1.0
ceph_osd_commit_latency_ms{ceph_daemon="osd.0"} 1.0
# HELP ceph_osd_apply_latency_ms OSD stat apply_latency_ms
# TYPE ceph_osd_apply_latency_ms gauge
ceph_osd_apply_latency_ms{ceph_daemon="osd.1"} 2.0
ceph_osd_apply_latency_ms{ceph_daemon="osd.2"} 1.0
ceph_osd_apply_latency_ms{ceph_daemon="osd.0"} 1.0
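
To illustrate the conversion, here is a minimal sketch that renames an `_ms` metric and rescales its value to seconds. The helper `to_seconds_metric` and the resulting `_seconds` names are hypothetical; the exporter does not emit them.

```python
from typing import Tuple


def to_seconds_metric(name: str, value_ms: float) -> Tuple[str, float]:
    """Rename a '*_ms' metric to '*_seconds' and convert its value."""
    suffix = "_ms"
    if not name.endswith(suffix):
        raise ValueError(f"{name!r} does not end in {suffix!r}")
    # Strip the abbreviated unit and use the base unit instead.
    return name[: -len(suffix)] + "_seconds", value_ms / 1000.0


# Values taken from the output above (osd.1 reports 2.0 ms).
name, value = to_seconds_metric("ceph_osd_commit_latency_ms", 2.0)
print(f'{name}{{ceph_daemon="osd.1"}} {value}')
```

Renaming a metric breaks existing dashboards and alerts, so a change like this would presumably need a transition period in which both names are exported.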

2) Some data that belongs in metric labels is embedded in the metric name, which makes it impossible to write proper alerts or graphs without hardcoding those values (in the example below, "fr5" should be stored as a label):

$ curl -s http://ceph1a:9283/metrics | grep fr5
# HELP ceph_data_sync_from_fr5_poll_latency_sum Average latency of replication log requests Total
# TYPE ceph_data_sync_from_fr5_poll_latency_sum counter
ceph_data_sync_from_fr5_poll_latency_sum{ceph_daemon="rgw.ceph1b.rgw0"} 4183.029349828
ceph_data_sync_from_fr5_poll_latency_sum{ceph_daemon="rgw.ceph1a.rgw0"} 35701.980564352
ceph_data_sync_from_fr5_poll_latency_sum{ceph_daemon="rgw.ceph1c.rgw0"} 55552.895098189
# HELP ceph_data_sync_from_fr5_fetch_errors Number of object replication errors
# TYPE ceph_data_sync_from_fr5_fetch_errors counter
ceph_data_sync_from_fr5_fetch_errors{ceph_daemon="rgw.ceph1b.rgw0"} 0.0
ceph_data_sync_from_fr5_fetch_errors{ceph_daemon="rgw.ceph1a.rgw0"} 0.0
ceph_data_sync_from_fr5_fetch_errors{ceph_daemon="rgw.ceph1c.rgw0"} 0.0
# HELP ceph_data_sync_from_fr5_fetch_not_modified Number of objects already replicated
# TYPE ceph_data_sync_from_fr5_fetch_not_modified counter
ceph_data_sync_from_fr5_fetch_not_modified{ceph_daemon="rgw.ceph1b.rgw0"} 0.0
ceph_data_sync_from_fr5_fetch_not_modified{ceph_daemon="rgw.ceph1a.rgw0"} 0.0
ceph_data_sync_from_fr5_fetch_not_modified{ceph_daemon="rgw.ceph1c.rgw0"} 0.0
# HELP ceph_data_sync_from_fr5_poll_errors Number of replication log request errors
# TYPE ceph_data_sync_from_fr5_poll_errors counter
ceph_data_sync_from_fr5_poll_errors{ceph_daemon="rgw.ceph1b.rgw0"} 0.0
ceph_data_sync_from_fr5_poll_errors{ceph_daemon="rgw.ceph1a.rgw0"} 0.0
ceph_data_sync_from_fr5_poll_errors{ceph_daemon="rgw.ceph1c.rgw0"} 0.0
# HELP ceph_data_sync_from_fr5_poll_latency_count Average latency of replication log requests Count
# TYPE ceph_data_sync_from_fr5_poll_latency_count counter
ceph_data_sync_from_fr5_poll_latency_count{ceph_daemon="rgw.ceph1b.rgw0"} 111090.0
ceph_data_sync_from_fr5_poll_latency_count{ceph_daemon="rgw.ceph1a.rgw0"} 809370.0
ceph_data_sync_from_fr5_poll_latency_count{ceph_daemon="rgw.ceph1c.rgw0"} 1110899.0
# HELP ceph_data_sync_from_fr5_fetch_bytes_sum Number of object bytes replicated Total
# TYPE ceph_data_sync_from_fr5_fetch_bytes_sum counter
ceph_data_sync_from_fr5_fetch_bytes_sum{ceph_daemon="rgw.ceph1b.rgw0"} 0.0
ceph_data_sync_from_fr5_fetch_bytes_sum{ceph_daemon="rgw.ceph1a.rgw0"} 0.0
ceph_data_sync_from_fr5_fetch_bytes_sum{ceph_daemon="rgw.ceph1c.rgw0"} 0.0
# HELP ceph_data_sync_from_fr5_fetch_bytes_count Number of object bytes replicated Count
# TYPE ceph_data_sync_from_fr5_fetch_bytes_count counter
ceph_data_sync_from_fr5_fetch_bytes_count{ceph_daemon="rgw.ceph1b.rgw0"} 0.0
ceph_data_sync_from_fr5_fetch_bytes_count{ceph_daemon="rgw.ceph1a.rgw0"} 0.0
ceph_data_sync_from_fr5_fetch_bytes_count{ceph_daemon="rgw.ceph1c.rgw0"} 0.0
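
One way the "fr5" part could be moved into a label is sketched below. This is an assumption-laden illustration, not the exporter's code: the label name `source_zone`, the helper `split_source`, and the rule that the remainder of the name starts with `fetch_` or `poll_` (the only two prefixes seen in the output above) are all hypothetical.

```python
import re
from typing import Dict, Tuple

# Assumed name shape: ceph_data_sync_from_<source>_<fetch|poll>_<rest>
_DATA_SYNC = re.compile(
    r"^ceph_data_sync_from_(?P<source>.+?)_(?P<rest>(?:fetch|poll)_.+)$"
)


def split_source(name: str, labels: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Move the embedded source name out of the metric name into a label."""
    match = _DATA_SYNC.match(name)
    if match is None:
        # Not a data-sync metric: return it unchanged.
        return name, dict(labels)
    new_labels = dict(labels)
    new_labels["source_zone"] = match.group("source")  # hypothetical label name
    return "ceph_data_sync_from_" + match.group("rest"), new_labels


name, labels = split_source(
    "ceph_data_sync_from_fr5_poll_errors",
    {"ceph_daemon": "rgw.ceph1a.rgw0"},
)
print(name, labels)
```

With a stable metric name, a single alert or graph expression could cover every source without listing each one by hand.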

History

#1 Updated by Neha Ojha about 4 years ago

Boris, can you take a look at this when you have a chance?

#2 Updated by Jan Fajerski almost 4 years ago

Some of these issues should definitely be fixed (':' should actually already be replaced).

Many of the other issues, like exporting ms instead of s, exporting total counts and sums, or having metrics that could be unified through labels, are unlikely to be fixed. Most metrics the exporter provides are automatically extracted from Ceph's perfcounters. Some of these can probably be improved by being smarter about metric naming based on derived types, but given that the perfcounters tend to change, I don't think anybody is going to put much manual effort into getting closer to the Prometheus guidelines.
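
As a rough illustration of what type-based naming could look like, the sketch below replaces ':' in names and appends "_total" to counters. The function `lint_friendly_name` is hypothetical and is not taken from the prometheus module; it deliberately leaves "_sum" and "_count" names alone, since satisfying promtool for those would require exporting them as a summary or histogram rather than renaming them.

```python
def lint_friendly_name(name: str, metric_type: str) -> str:
    """Derive a name closer to the Prometheus naming rules from the metric type."""
    # ':' is reserved for recording rules, so replace it in exported names.
    name = name.replace(":", "_")
    # Counters should end in '_total'; skip the sum/count pairs (see note above).
    if metric_type == "counter" and not name.endswith(("_total", "_sum", "_count")):
        name += "_total"
    return name


for raw, kind in [
    ("ceph_osd_op", "counter"),
    ("ceph_prioritycache:kv_pri0_bytes", "gauge"),
    ("ceph_osd_op_latency_sum", "counter"),
]:
    print(raw, "->", lint_friendly_name(raw, kind))
```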
