Bug #22662

closed

ceph osd df json output validation reported invalid numbers (-nan) (jewel)

Added by Enrico Labedzki over 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 100%
Source:
Tags:
Backport: jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

We have a monitoring script that parses the output of 'ceph osd df -f json'. From time to time one or more OSDs are down, and the JSON output is then invalid because it contains -nan values.
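
As a rough illustration of the problem and of a client-side workaround, here is a minimal Python sketch. The sample string is a hypothetical, heavily shortened stand-in for real 'ceph osd df -f json' output, and `parse_lenient` is a name made up for this example. The regular expression is only a heuristic: it does not understand string literals, so treat it as a stopgap rather than a real fix.

```python
import json
import re

# Heuristic: a bare nan/inf token in value position (after ':', '[' or ',')
# that is followed by ',', '}' or ']'. It does not understand string literals.
_NON_FINITE = re.compile(r'([:\[,]\s*)-?(?:nan|inf)(?=\s*[,}\]])', re.IGNORECASE)


def parse_lenient(text):
    """Parse JSON-like text, mapping bare nan/inf tokens to null (None)."""
    return json.loads(_NON_FINITE.sub(r'\1null', text))


# Hypothetical, shortened sample; real output carries many more fields.
sample = '{"nodes":[{"id":0,"utilization":-nan},{"id":1,"utilization":12.5}]}'

# A strict parser rejects the raw text because -nan is not a JSON number.
try:
    json.loads(sample)
except ValueError as exc:
    print("strict parse failed:", exc)

# The lenient parser keeps every OSD and marks the missing value as None.
for node in parse_lenient(sample)["nodes"]:
    print(node["id"], node["utilization"])
```

With this approach the monitoring script still sees every OSD and can decide for itself how to report a missing value.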


Files


Related issues 1 (0 open, 1 closed)

Copied to RADOS - Backport #22866: jewel: ceph osd df json output validation reported invalid numbers (-nan) (jewel) - Resolved - Prashant D
Actions #1

Updated by Greg Farnum over 6 years ago

  • Project changed from Ceph to mgr
  • Category deleted (ceph cli)
Actions #2

Updated by Sage Weil over 6 years ago

  • Project changed from mgr to RADOS
  • Subject changed from ceph osd df json output validation reported invalid numbers to ceph osd df json output validation reported invalid numbers (-nan)
  • Status changed from New to 12
  • Priority changed from Normal to Urgent

1. It's not valid JSON. The Formatter shouldn't allow it.
2. We should have a valid value (or 0) to use.

Actions #3

Updated by Sage Weil over 6 years ago

  • Subject changed from ceph osd df json output validation reported invalid numbers (-nan) to ceph osd df json output validation reported invalid numbers (-nan) (jewel)
Actions #4

Updated by Nathan Cutler about 6 years ago

  • Backport set to jewel luminous
Actions #5

Updated by Chang Liu about 6 years ago

  • Assignee set to Chang Liu
Actions #6

Updated by Chang Liu about 6 years ago

This bug has been fixed by https://github.com/ceph/ceph/pull/13531. We should backport it to Jewel.

Actions #7

Updated by Chang Liu about 6 years ago

Sage Weil wrote:

1. It's not valid JSON. The Formatter shouldn't allow it.
2. We should have a valid value (or 0) to use.

When JsonFormatter encounters a NaN or Inf, what should we do? Throw an exception directly?

Actions #8

Updated by Enrico Labedzki about 6 years ago

Why not simply set those values to zero, as Sage mentioned before? That should be OK, I think.

Those zero values can then be handled separately.

It will look like this (see attachment), so I can see all OSDs in the down state.

Before we fixed this ourselves with a JSON validation hack, the graphs looked like this (do you see the gaps where there is nothing?), which isn't very helpful.

Actions #9

Updated by Chang Liu about 6 years ago

Thanks. I'm afraid that using zero for NaN/Inf is not a perfect solution. In some cases 0 is a valid value (for example wr_io_rate), so we would hide the true issue.

Actions #10

Updated by Enrico Labedzki about 6 years ago

Yes, you are right. What about -1 as an indicator value? Could that be a solution?

Or would that also clash with any valid values?

Actions #11

Updated by Chang Liu about 6 years ago

I do not think using a normal integer to represent an invalid number is a good solution. In Python, json.dumps has a parameter called allow_nan: when allow_nan is False, it raises a ValueError for a NaN/Inf number, and when allow_nan is True (the default), it dumps NaN directly.
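
The json.dumps behaviour mentioned above can be checked with a short snippet. This is only an illustration using Python's standard library; the key name "utilization" is chosen for the example and says nothing about how Ceph's own formatter works. Note that allow_nan defaults to True.

```python
import json

value = float("nan")

# Default (allow_nan=True): emits the bare token NaN, which is not valid JSON.
lenient = json.dumps({"utilization": value})
print(lenient)  # {"utilization": NaN}

# allow_nan=False: refuses to serialize non-finite floats.
try:
    json.dumps({"utilization": value}, allow_nan=False)
except ValueError as exc:
    print("rejected:", exc)
```

So Python offers both options discussed here: pass the value through unchanged, or fail loudly.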

Actions #12

Updated by Enrico Labedzki about 6 years ago

OK, you are right, it is probably not a good choice to use integer values as an error indicator.

I read up and experimented a little with the Python, Perl and Ruby JSON parsers (we mainly use Ruby as our preferred language in my company).

As the JSON specification states, valid values are string, number, object, array, true, false and null. So why not use null as the value? It should work in Python, Perl and Ruby (giving undef in Perl, None in Python and nil in Ruby) and can be handled by the programmer. Best of all, the JSON object stays intact, and there is no need to raise an exception or anything else.

What do you think?
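
A minimal Python sketch of the null proposal, for illustration only: `finite_or_none` and the sample dictionary are hypothetical names and data invented for this example, and this is not a description of what the Ceph formatter or the referenced pull request actually does.

```python
import json
import math


def finite_or_none(obj):
    """Recursively replace NaN/Inf floats with None (serialized as null)."""
    if isinstance(obj, float) and not math.isfinite(obj):
        return None
    if isinstance(obj, dict):
        return {key: finite_or_none(val) for key, val in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [finite_or_none(val) for val in obj]
    return obj


# Hypothetical per-OSD statistics with non-finite values.
stats = {"id": 0, "utilization": float("nan"), "var": float("-inf")}

# allow_nan=False guards against any non-finite value slipping through.
encoded = json.dumps(finite_or_none(stats), allow_nan=False)
print(encoded)  # {"id": 0, "utilization": null, "var": null}

# The result is valid JSON, and null comes back as None on the reading side.
decoded = json.loads(encoded)
```

Unlike 0 or -1, null cannot be mistaken for a real measurement, so the consumer has to handle the missing value explicitly.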

Actions #13

Updated by Nathan Cutler about 6 years ago

+1 for null, which is an English word and hence far more comprehensible than "NaN", which is what I would call "Programmer Slang".

"undef", "undefined", or "out of range" are other candidates (from a purely linguistic perspective)

Or (re-reading the bug description) possibly "error", "OSD down", "n/a" (which stands for "not applicable" or "not available")

Oh, wait - never mind, this is just a request to backport https://github.com/ceph/ceph/pull/13531 to jewel. Marking appropriately.

Actions #14

Updated by Nathan Cutler about 6 years ago

  • Backport changed from jewel luminous to jewel
Actions #15

Updated by Nathan Cutler about 6 years ago

  • Status changed from 12 to Pending Backport
Actions #16

Updated by Nathan Cutler about 6 years ago

  • Copied to Backport #22866: jewel: ceph osd df json output validation reported invalid numbers (-nan) (jewel) added
Actions #17

Updated by Nathan Cutler about 6 years ago

  • Status changed from Pending Backport to Resolved