Project

General

Profile

Bug #11569

mon: "pg ls" is broken

Added by Greg Farnum over 4 years ago. Updated almost 4 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
-
Start date:
05/07/2015
Due date:
% Done:

60%

Source:
other
Tags:
Backport:
hammer
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

I got handed this:

Backtrace
=========
-- begin dump of recent events ---
     0> 2015-05-06 07:28:28.359165 7fad29cf5700 -1 *** Caught signal (Aborted) **
 in thread 7fad29cf5700

 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
 1: /usr/bin/ceph-mon() [0x9017e2]
 2: (()+0xf130) [0x7fad3005b130]
 3: (gsignal()+0x37) [0x7fad2ea755d7]
 4: (abort()+0x148) [0x7fad2ea76cc8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7fad2f3799b5]
 6: (()+0x5e926) [0x7fad2f377926]
 7: (()+0x5e953) [0x7fad2f377953]
 8: (()+0x5eb73) [0x7fad2f377b73]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x27a) [0x7b361a]
 10: (PGMap::get_filtered_pg_stats(std::string&, long, long, bool, std::set<pg_t, std::less<pg_t>, std::allocator<pg_t> >&)+0x1d3) [0x885ea3]
 11: (PGMonitor::preprocess_command(MMonCommand*)+0x1ccd) [0x66398d]
 12: (PGMonitor::preprocess_query(PaxosServiceMessage*)+0x27f) [0x66584f]
 13: (PaxosService::dispatch(PaxosServiceMessage*)+0x833) [0x5cacd3]
 14: (Monitor::handle_command(MMonCommand*)+0x1549) [0x591b19]
 15: (Monitor::dispatch(MonSession*, Message*, bool)+0xf9) [0x594c89]
 16: (Monitor::_ms_dispatch(Message*)+0x1a6) [0x595936]
 17: (Monitor::ms_dispatch(Message*)+0x23) [0x5b5403]
 18: (DispatchQueue::entry()+0x64a) [0x8a1d9a]
 19: (DispatchQueue::DispatchThread::entry()+0xd) [0x79bd9d]
 20: (()+0x7df5) [0x7fad30053df5]
 21: (clone()+0x6d) [0x7fad2eb361ad]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

The only assert in the get_filtered_pg_stats function is a check that the "type" returned from osd_types::pg_string_state() is a valid one. It looks like pg_string_state() is actually just broken, as it checks for equality of state strings (e.g. "active", "clean") but not for inclusion ("active+clean") — which is how they're most commonly presented. I think this was functionality added for Calamari? But I haven't dug all the way into it so I could be missing something.


Related issues

Copied to Ceph - Backport #11910: mon: "pg ls" is broken Resolved 05/07/2015

Associated revisions

Revision 89f89ca3 (diff)
Added by Kefu Chai over 4 years ago

mon: s/recovery/recoverying/ in "pg ls*" commands' spec

  • also translate "repair" if specified as "states"
  • update test_mon_pg in cephtool-test-mon.sh

Fixes: #11569
Signed-off-by: Kefu Chai <>

Revision e310461a (diff)
Added by Kefu Chai about 4 years ago

mon: s/recovery/recoverying/ in "pg ls*" commands' spec

  • also translate "repair" if specified as "states"
  • update test_mon_pg in cephtool-test-mon.sh

Fixes: #11569
Signed-off-by: Kefu Chai <>
(cherry picked from commit 89f89ca3477eddcae11a05fbd58a8f3658eb1fc1)

Revision 963c5241 (diff)
Added by Kefu Chai about 4 years ago

doc: add changes to "pg ls*" commands to pending release notes

"pg ls*" commands now accept recovering instead of recovery as
one of its states.

Fixes: #11569
Signed-off-by: Kefu Chai <>

History

#1 Updated by Kefu Chai over 4 years ago

  • Status changed from New to Verified

yes, by inspecting the log file,

$ grep "pg ls" logs/*.log
logs/ceph-mon.mon1.log:2015-05-06 07:28:30.607083 7fd9237a5700  0 mon.mon1@0(leader) e1 handle_command mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1
logs/ceph-mon.mon1.log:2015-05-06 07:28:30.607184 7fd9237a5700  0 log_channel(audit) log [DBG] : from='client.4762 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon1.log:    -8> 2015-05-06 07:28:30.606924 7fd9237a5700  1 -- 10.12.27.2:6789/0 <== client.4762 10.12.27.2:0/1024942 4 ==== mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1 ==== 102+0+0 (153034552 0 0) 0x6593800 con 0x660ce60
logs/ceph-mon.mon1.log:    -7> 2015-05-06 07:28:30.607083 7fd9237a5700  0 mon.mon1@0(leader) e1 handle_command mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1
logs/ceph-mon.mon1.log:    -6> 2015-05-06 07:28:30.607184 7fd9237a5700  0 log_channel(audit) log [DBG] : from='client.4762 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon1.log:    -3> 2015-05-06 07:28:30.607214 7fd9237a5700 10 log_client  will send 2015-05-06 07:28:30.607195 mon.0 10.12.27.2:6789/0 2165 : audit [DBG] from='client.4762 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon2.log:2015-05-06 07:28:28.138983 7fad29cf5700  0 mon.mon2@1(peon) e1 handle_command mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1
logs/ceph-mon.mon2.log:2015-05-06 07:28:28.139077 7fad29cf5700  0 log_channel(audit) log [DBG] : from='client.? 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon2.log:    -8> 2015-05-06 07:28:28.138779 7fad29cf5700  1 -- 10.12.27.3:6789/0 <== client.4762 10.12.27.2:0/1024942 8 ==== mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1 ==== 102+0+0 (153034552 0 0) 0x4522800 con 0x425dc20
logs/ceph-mon.mon2.log:    -7> 2015-05-06 07:28:28.138983 7fad29cf5700  0 mon.mon2@1(peon) e1 handle_command mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1
logs/ceph-mon.mon2.log:    -6> 2015-05-06 07:28:28.139077 7fad29cf5700  0 log_channel(audit) log [DBG] : from='client.? 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon2.log:    -3> 2015-05-06 07:28:28.139100 7fad29cf5700 10 log_client  will send 2015-05-06 07:28:28.139085 mon.1 10.12.27.3:6789/0 245 : audit [DBG] from='client.? 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon3.log:2015-05-06 06:09:14.232803 7fcbea862700  0 mon.mon3@2(peon) e1 handle_command mon_command({"prefix": "pg ls-by-pool", "poolstr": "remapped"} v 0) v1
logs/ceph-mon.mon3.log:2015-05-06 06:09:14.232927 7fcbea862700  0 log_channel(audit) log [DBG] : from='client.? 10.12.27.2:0/1017585' entity='client.admin' cmd=[{"prefix": "pg ls-by-pool", "poolstr": "remapped"}]: dispatch
logs/ceph-mon.mon3.log:2015-05-06 06:09:24.683393 7fcbea862700  0 mon.mon3@2(peon) e1 handle_command mon_command({"states": ["remapped"], "prefix": "pg ls-by-pool", "poolstr": "testpool"} v 0) v1
logs/ceph-mon.mon3.log:2015-05-06 06:09:24.683518 7fcbea862700  0 log_channel(audit) log [DBG] : from='client.? 10.12.27.2:0/1017696' entity='client.admin' cmd=[{"states": ["remapped"], "prefix": "pg ls-by-pool", "poolstr": "testpool"}]: dispatch
logs/ceph-mon.mon3.log:2015-05-06 07:28:28.359575 7fcbea862700  0 mon.mon3@2(peon) e1 handle_command mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1
logs/ceph-mon.mon3.log:2015-05-06 07:28:28.359682 7fcbea862700  0 log_channel(audit) log [DBG] : from='client.4762 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon3.log:    -8> 2015-05-06 07:28:28.359428 7fcbea862700  1 -- 10.12.27.5:6789/0 <== client.4762 10.12.27.2:0/1024942 4 ==== mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1 ==== 102+0+0 (153034552 0 0) 0x5a6dc00 con 0x7720f20
logs/ceph-mon.mon3.log:    -7> 2015-05-06 07:28:28.359575 7fcbea862700  0 mon.mon3@2(peon) e1 handle_command mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1
logs/ceph-mon.mon3.log:    -6> 2015-05-06 07:28:28.359682 7fcbea862700  0 log_channel(audit) log [DBG] : from='client.4762 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon3.log:    -3> 2015-05-06 07:28:28.359718 7fcbea862700 10 log_client  will send 2015-05-06 07:28:28.359695 mon.2 10.12.27.5:6789/0 269 : audit [DBG] from='client.4762 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch

and we normalize these "pg ls-by-*" commands to "pg ls". and try to translate the "state" to the internal representation using pg_string_state(), which brings down the mon if the string is not recognized. in this case, "recovery" is listed in the MonCommands, but not understood by pg_string_state().

the root cause of this issue is the inconsistency in between the command spec and the implementation of the command.

#2 Updated by Kefu Chai over 4 years ago

  • Backport set to hammer

#3 Updated by Kefu Chai over 4 years ago

  • Status changed from Verified to Need Review
  • % Done changed from 0 to 60

#4 Updated by Greg Farnum over 4 years ago

Ah, so we're actually only querying on one kind of state at a time, but it was still broken. Coolio.

#5 Updated by Sage Weil about 4 years ago

  • Status changed from Need Review to Pending Backport

#7 Updated by Loic Dachary almost 4 years ago

  • Status changed from Pending Backport to Resolved

Also available in: Atom PDF