Bug #13843

ceph osd pool stats broken in hammer

Added by Dan van der Ster about 7 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport: hammer
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Since we upgraded from firefly to hammer, the output of ceph osd pool stats is wrong when the cluster is degraded (negative misplaced counts and nonsensical percentages):

# ceph status --cluster beesly
    cluster b4f463a0-c671-43a8-bd36-e40ab8d233d2
     health HEALTH_WARN
            3 pgs backfill
            23 pgs backfilling
            26 pgs degraded
            26 pgs stuck degraded
            26 pgs stuck unclean
            26 pgs stuck undersized
            26 pgs undersized
            recovery 385262/264401079 objects degraded (0.146%)
            recovery 859811/264401079 objects misplaced (0.325%)
     monmap e23: 5 mons at {p01001532021656=128.142.35.220:6789/0,p01001532971954=128.142.36.229:6789/0,p05517715d82373=188.184.36.206:6789/0,p05517715y01595=188.184.36.166:6789/0,p05517715y58557=188.184.36.164:6789/0}
            election epoch 5302, quorum 0,1,2,3,4 p01001532021656,p01001532971954,p05517715y58557,p05517715y01595,p05517715d82373
     mdsmap e224: 1/1/1 up {0=0=up:active}
     osdmap e296250: 1200 osds: 1199 up, 1199 in; 26 remapped pgs
      pgmap v60417303: 20000 pgs, 22 pools, 339 TB data, 85969 kobjects
            1018 TB used, 2543 TB / 3562 TB avail
            385262/264401079 objects degraded (0.146%)
            859811/264401079 objects misplaced (0.325%)
               19974 active+clean
                  23 active+undersized+degraded+remapped+backfilling
                   3 active+undersized+degraded+remapped+wait_backfill
recovery io 395 MB/s, 98 objects/s
  client io 101 MB/s rd, 342 MB/s wr, 7794 op/s

# ceph osd pool stats --cluster=beesly
pool data id 0
  nothing is going on

pool metadata id 1
  nothing is going on

pool rbd id 2
  nothing is going on

pool volumes id 4
  -517/54 objects misplaced (-957.407%)
  recovery io 997 MB/s, 249 objects/s
  client io 64512 kB/s rd, 188 MB/s wr, 4820 op/s

pool images id 5
  -94/0 objects misplaced (-inf%)
  recovery io 118 MB/s, 14 objects/s
  client io 2617 kB/s rd, 0 op/s

pool .rgw.root id 17
  nothing is going on

pool .rgw.control id 18
  nothing is going on

pool .users.uid id 21
  nothing is going on

pool .users.email id 22
  nothing is going on

pool .users id 23
  nothing is going on

pool .rgw.buckets.index id 24
  client io 1312 B/s rd, 1 op/s

pool .usage id 56
  nothing is going on

pool .log id 57
  nothing is going on

pool .intent-log id 58
  nothing is going on

pool .rgw.gc id 59
  nothing is going on

pool .rgw id 60
  nothing is going on

pool .rgw.buckets id 63
  client io 988 kB/s wr, 1 op/s

pool test id 64
  nothing is going on

pool test.os id 65
  nothing is going on

pool .users.swift id 73
  nothing is going on

pool cinder-critical id 75
  client io 3256 kB/s rd, 10051 kB/s wr, 142 op/s

pool test.critical id 77
  nothing is going on
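
The two percentages reported for the volumes and images pools are what a plain floating-point division gives when the misplaced count is negative and the total is small or zero. The Python sketch below only reproduces that arithmetic. The function names are hypothetical, this is not the Ceph code, and the clamping guard is an assumption for illustration, not the fix that was merged.

```python
import math


def ratio_percent(count: int, total: int) -> float:
    """Percentage as unguarded IEEE 754 float division would give it.

    Python raises ZeroDivisionError for x / 0.0, so the zero-denominator
    case is emulated by hand here.
    """
    if total == 0:
        return math.nan if count == 0 else math.copysign(math.inf, count)
    return 100.0 * count / total


def guarded_percent(count: int, total: int) -> float:
    """Hypothetical guard: report 0% for negative counts or empty totals."""
    if count <= 0 or total <= 0:
        return 0.0
    return 100.0 * count / total


# Values taken from the "volumes" and "images" pools above.
print(f"{ratio_percent(-517, 54):.3f}%")    # -957.407%
print(f"{ratio_percent(-94, 0):.3f}%")      # -inf%
print(f"{guarded_percent(-517, 54):.3f}%")  # 0.000%
```

A guard like this would hide the symptom only; the negative counts themselves still point at a bookkeeping problem upstream of the division.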

Related issues

Copied to Ceph - Backport #14288: hammer: ceph osd pool stats broken in hammer Resolved

History

#1 Updated by Dan van der Ster about 7 years ago

Correction: this only happens while backfilling is in progress. When the cluster is degraded but no backfilling is running, the output is correct.

Also, this breaks the rest-api json output (see the "Error..." prefix):

# curl http://cephmon:5000/api/v0.1/osd/pool/stats.json
Error decoding JSON from [{"pool_name":"data","pool_id":0,"recovery":{"degraded_objects":924,"degraded_total":0,"degraded_ratio":inf},"recovery_rate":{},"client_io_rate":{}},{"pool_name":"metadata","pool_id":1,"recovery":{"degraded_objects":322,"degraded_total":0,"degraded_ratio":inf},"recovery_rate":{},"client_io_rate":{}},{"pool_name":"rbd","pool_id":2,"recovery":{},"recovery_rate":{},"client_io_rate":{}},{"pool_name":"volumes","pool_id":4,"recovery":{},"recovery_rate":{},"client_io_rate":{"read_bytes_sec":77399277,"write_bytes_sec":146462513,"op_per_sec":6479}},{"pool_name":"images","pool_id":5,"recovery":{},"recovery_rate":{},"client_io_rate":{"read_bytes_sec":11224690,"op_per_sec":1}},{"pool_name":".rgw.root","pool_id":17,"recovery":{},"recovery_rate":{},"client_io_rate":{}},{"pool_name":".rgw.control","pool_id":18,"recovery":{},"recovery_rate":{},"client_io_rate":{}},{"pool_name":".users.uid","pool_id":21,"recovery":{},"recovery_rate":{},"client_io_rate":{"read_bytes_sec":245,"write_bytes_sec":0,"op_per_sec":0}},{"pool_name":".users.email","pool_id":22,"recovery":{},"recovery_rate":{},"client_io_rate":{}},{"pool_name":".users","pool_id":23,"recovery":{},"recovery_rate":{},"client_io_rate":{}},{"pool_name":".rgw.buckets.index","pool_id":24,"recovery":{},"recovery_rate":{},"client_io_rate":{"read_bytes_sec":8221,"write_bytes_sec":0,"op_per_sec":9}},{"pool_name":".usage","pool_id":56,"recovery":{},"recovery_rate":{},"client_io_rate":{}},{"pool_name":".log","pool_id":57,"recovery":{},"recovery_rate":{},"client_io_rate":{}},{"pool_name":".intent-log","pool_id":58,"recovery":{},"recovery_rate":{},"client_io_rate":{}},{"pool_name":".rgw.gc","pool_id":59,"recovery":{},"recovery_rate":{},"client_io_rate":{"read_bytes_sec":104,"write_bytes_sec":0,"op_per_sec":0}},{"pool_name":".rgw","pool_id":60,"recovery":{},"recovery_rate":{},"client_io_rate":{}},{"pool_name":".rgw.buckets","pool_id":63,"recovery":{},"recovery_rate":{},"client_io_rate":{}},{"pool_name":"test","pool_id":64,"recovery":{"degraded_objects":13,"degraded_total":0,"degraded_ratio":inf},"recovery_rate":{},"client_io_rate":{}},{"pool_name":"test.os","pool_id":65,"recovery":{},"recovery_rate":{},"client_io_rate":{}},{"pool_name":".users.swift","pool_id":73,"recovery":{},"recovery_rate":{},"client_io_rate":{}},{"pool_name":"cinder-critical","pool_id":75,"recovery":{},"recovery_rate":{},"client_io_rate":{"read_bytes_sec":1082089,"write_bytes_sec":4523171,"op_per_sec":113}},{"pool_name":"test.critical","pool_id":77,"recovery":{},"recovery_rate":{},"client_io_rate":{}}]

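The decode error is consistent with the bare inf token in degraded_ratio, which is not a valid JSON number. A minimal Python sketch of the failure follows, using one "recovery" object copied from the response above. The helper names are hypothetical, and mapping the non-finite ratio to null is only one possible workaround, assumed here for illustration; it is not taken from the Ceph fix.

```python
import json
import math

# One "recovery" object copied verbatim from the REST response above.
raw = '{"degraded_objects":924,"degraded_total":0,"degraded_ratio":inf}'


def is_valid_json(text: str) -> bool:
    """Return True if the standard json module can parse the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def finite_or_none(value: float):
    """Hypothetical helper: map inf/-inf/nan to None (JSON null)."""
    return value if math.isfinite(value) else None


def dump_recovery(degraded_objects: int, degraded_total: int) -> str:
    """Serialize a recovery record without emitting non-finite numbers."""
    ratio = degraded_objects / degraded_total if degraded_total else math.inf
    record = {
        "degraded_objects": degraded_objects,
        "degraded_total": degraded_total,
        "degraded_ratio": finite_or_none(ratio),
    }
    # allow_nan=False makes json.dumps raise instead of writing Infinity/NaN.
    return json.dumps(record, allow_nan=False)


print(is_valid_json(raw))                    # False
print(is_valid_json(dump_recovery(924, 0)))  # True
```

Passing allow_nan=False is a cheap safety net: if a non-finite value ever slips through, serialization fails loudly on the producer side instead of handing consumers an unparseable document.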
#2 Updated by Samuel Just about 7 years ago

  • Priority changed from High to Urgent

#3 Updated by Loïc Dachary almost 7 years ago

"degraded_ratio":inf

#4 Updated by Loïc Dachary almost 7 years ago

  • Backport set to infernalis,hammer,firefly

#5 Updated by Loïc Dachary almost 7 years ago

  • Status changed from New to Fix Under Review
  • Assignee set to Loïc Dachary

#7 Updated by Loïc Dachary almost 7 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport changed from infernalis,hammer,firefly to hammer
  • Regression changed from Yes to No

#8 Updated by Loïc Dachary almost 7 years ago

  • Copied to Backport #14288: hammer: ceph osd pool stats broken in hammer added

#9 Updated by Loïc Dachary almost 7 years ago

  • Status changed from Pending Backport to Resolved
