Project

General

Profile

Actions

Bug #11569

closed

mon: "pg ls" is broken

Added by Greg Farnum almost 9 years ago. Updated over 8 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

60%

Source:
other
Tags:
Backport:
hammer
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I got handed this:

Backtrace
=========
-- begin dump of recent events ---
     0> 2015-05-06 07:28:28.359165 7fad29cf5700 -1 *** Caught signal (Aborted) **
 in thread 7fad29cf5700

 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff)
 1: /usr/bin/ceph-mon() [0x9017e2]
 2: (()+0xf130) [0x7fad3005b130]
 3: (gsignal()+0x37) [0x7fad2ea755d7]
 4: (abort()+0x148) [0x7fad2ea76cc8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7fad2f3799b5]
 6: (()+0x5e926) [0x7fad2f377926]
 7: (()+0x5e953) [0x7fad2f377953]
 8: (()+0x5eb73) [0x7fad2f377b73]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x27a) [0x7b361a]
 10: (PGMap::get_filtered_pg_stats(std::string&, long, long, bool, std::set<pg_t, std::less<pg_t>, std::allocator<pg_t> >&)+0x1d3) [0x885ea3]
 11: (PGMonitor::preprocess_command(MMonCommand*)+0x1ccd) [0x66398d]
 12: (PGMonitor::preprocess_query(PaxosServiceMessage*)+0x27f) [0x66584f]
 13: (PaxosService::dispatch(PaxosServiceMessage*)+0x833) [0x5cacd3]
 14: (Monitor::handle_command(MMonCommand*)+0x1549) [0x591b19]
 15: (Monitor::dispatch(MonSession*, Message*, bool)+0xf9) [0x594c89]
 16: (Monitor::_ms_dispatch(Message*)+0x1a6) [0x595936]
 17: (Monitor::ms_dispatch(Message*)+0x23) [0x5b5403]
 18: (DispatchQueue::entry()+0x64a) [0x8a1d9a]
 19: (DispatchQueue::DispatchThread::entry()+0xd) [0x79bd9d]
 20: (()+0x7df5) [0x7fad30053df5]
 21: (clone()+0x6d) [0x7fad2eb361ad]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

The only assert in the get_filtered_pg_stats function is a check that the "type" returned from osd_types::pg_string_state() is a valid one. It looks like pg_string_state() is actually just broken, as it checks for equality of state strings (e.g. "active", "clean") but not for inclusion ("active+clean") — which is how they're most commonly presented. I think this was functionality added for Calamari? But I haven't dug all the way into it so I could be missing something.


Related issues 1 (0 open1 closed)

Copied to Ceph - Backport #11910: mon: "pg ls" is brokenResolvedAbhishek Lekshmanan05/07/2015Actions
Actions #1

Updated by Kefu Chai almost 9 years ago

  • Status changed from New to 12

yes, by inspecting the log file,

$ grep "pg ls" logs/*.log
logs/ceph-mon.mon1.log:2015-05-06 07:28:30.607083 7fd9237a5700  0 mon.mon1@0(leader) e1 handle_command mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1
logs/ceph-mon.mon1.log:2015-05-06 07:28:30.607184 7fd9237a5700  0 log_channel(audit) log [DBG] : from='client.4762 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon1.log:    -8> 2015-05-06 07:28:30.606924 7fd9237a5700  1 -- 10.12.27.2:6789/0 <== client.4762 10.12.27.2:0/1024942 4 ==== mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1 ==== 102+0+0 (153034552 0 0) 0x6593800 con 0x660ce60
logs/ceph-mon.mon1.log:    -7> 2015-05-06 07:28:30.607083 7fd9237a5700  0 mon.mon1@0(leader) e1 handle_command mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1
logs/ceph-mon.mon1.log:    -6> 2015-05-06 07:28:30.607184 7fd9237a5700  0 log_channel(audit) log [DBG] : from='client.4762 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon1.log:    -3> 2015-05-06 07:28:30.607214 7fd9237a5700 10 log_client  will send 2015-05-06 07:28:30.607195 mon.0 10.12.27.2:6789/0 2165 : audit [DBG] from='client.4762 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon2.log:2015-05-06 07:28:28.138983 7fad29cf5700  0 mon.mon2@1(peon) e1 handle_command mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1
logs/ceph-mon.mon2.log:2015-05-06 07:28:28.139077 7fad29cf5700  0 log_channel(audit) log [DBG] : from='client.? 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon2.log:    -8> 2015-05-06 07:28:28.138779 7fad29cf5700  1 -- 10.12.27.3:6789/0 <== client.4762 10.12.27.2:0/1024942 8 ==== mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1 ==== 102+0+0 (153034552 0 0) 0x4522800 con 0x425dc20
logs/ceph-mon.mon2.log:    -7> 2015-05-06 07:28:28.138983 7fad29cf5700  0 mon.mon2@1(peon) e1 handle_command mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1
logs/ceph-mon.mon2.log:    -6> 2015-05-06 07:28:28.139077 7fad29cf5700  0 log_channel(audit) log [DBG] : from='client.? 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon2.log:    -3> 2015-05-06 07:28:28.139100 7fad29cf5700 10 log_client  will send 2015-05-06 07:28:28.139085 mon.1 10.12.27.3:6789/0 245 : audit [DBG] from='client.? 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon3.log:2015-05-06 06:09:14.232803 7fcbea862700  0 mon.mon3@2(peon) e1 handle_command mon_command({"prefix": "pg ls-by-pool", "poolstr": "remapped"} v 0) v1
logs/ceph-mon.mon3.log:2015-05-06 06:09:14.232927 7fcbea862700  0 log_channel(audit) log [DBG] : from='client.? 10.12.27.2:0/1017585' entity='client.admin' cmd=[{"prefix": "pg ls-by-pool", "poolstr": "remapped"}]: dispatch
logs/ceph-mon.mon3.log:2015-05-06 06:09:24.683393 7fcbea862700  0 mon.mon3@2(peon) e1 handle_command mon_command({"states": ["remapped"], "prefix": "pg ls-by-pool", "poolstr": "testpool"} v 0) v1
logs/ceph-mon.mon3.log:2015-05-06 06:09:24.683518 7fcbea862700  0 log_channel(audit) log [DBG] : from='client.? 10.12.27.2:0/1017696' entity='client.admin' cmd=[{"states": ["remapped"], "prefix": "pg ls-by-pool", "poolstr": "testpool"}]: dispatch
logs/ceph-mon.mon3.log:2015-05-06 07:28:28.359575 7fcbea862700  0 mon.mon3@2(peon) e1 handle_command mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1
logs/ceph-mon.mon3.log:2015-05-06 07:28:28.359682 7fcbea862700  0 log_channel(audit) log [DBG] : from='client.4762 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon3.log:    -8> 2015-05-06 07:28:28.359428 7fcbea862700  1 -- 10.12.27.5:6789/0 <== client.4762 10.12.27.2:0/1024942 4 ==== mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1 ==== 102+0+0 (153034552 0 0) 0x5a6dc00 con 0x7720f20
logs/ceph-mon.mon3.log:    -7> 2015-05-06 07:28:28.359575 7fcbea862700  0 mon.mon3@2(peon) e1 handle_command mon_command({"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2} v 0) v1
logs/ceph-mon.mon3.log:    -6> 2015-05-06 07:28:28.359682 7fcbea862700  0 log_channel(audit) log [DBG] : from='client.4762 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch
logs/ceph-mon.mon3.log:    -3> 2015-05-06 07:28:28.359718 7fcbea862700 10 log_client  will send 2015-05-06 07:28:28.359695 mon.2 10.12.27.5:6789/0 269 : audit [DBG] from='client.4762 10.12.27.2:0/1024942' entity='client.admin' cmd=[{"states": ["recovery"], "prefix": "pg ls-by-osd", "osd": 2}]: dispatch

and we normalize these "pg ls-by-*" commands to "pg ls". and try to translate the "state" to the internal representation using pg_string_state(), which brings down the mon if the string is not recognized. in this case, "recovery" is listed in the MonCommands, but not understood by pg_string_state().

the root cause of this issue is the inconsistency in between the command spec and the implementation of the command.

Actions #2

Updated by Kefu Chai almost 9 years ago

  • Backport set to hammer
Actions #3

Updated by Kefu Chai almost 9 years ago

  • Status changed from 12 to Fix Under Review
  • % Done changed from 0 to 60
Actions #4

Updated by Greg Farnum almost 9 years ago

Ah, so we're actually only querying on one kind of state at a time, but it was still broken. Coolio.

Actions #5

Updated by Sage Weil almost 9 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #7

Updated by Loïc Dachary over 8 years ago

  • Status changed from Pending Backport to Resolved
Actions

Also available in: Atom PDF