Bug #25107

common: (mon) command sanitization accepts floats when Int type is defined resulting in exception fault in ceph-mon

Added by Patrick Donnelly over 5 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Monitor
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: mimic,luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: fs
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Found this fault:

2018-07-24T21:06:03.317 INFO:teuthology.orchestra.run.smithi094.stderr:Connecting to RADOS with config /etc/ceph/ceph.conf...
2018-07-24T21:06:03.329 INFO:teuthology.orchestra.run.smithi094.stderr:Connection to RADOS complete
2018-07-24T21:06:03.329 INFO:teuthology.orchestra.run.smithi094.stderr:Connecting to cephfs...
2018-07-24T21:06:03.330 INFO:teuthology.orchestra.run.smithi094.stderr:CephFS initializing...
2018-07-24T21:06:03.332 INFO:teuthology.orchestra.run.smithi094.stderr:CephFS mounting...
2018-07-24T21:06:03.342 INFO:teuthology.orchestra.run.smithi094.stderr:Connection to cephfs complete
2018-07-24T21:06:03.343 INFO:teuthology.orchestra.run.smithi094.stderr:Recovering from partial auth updates (if any)...
2018-07-24T21:06:03.343 INFO:teuthology.orchestra.run.smithi094.stderr:Nothing to recover. No auth meta files.
2018-07-24T21:06:03.343 INFO:teuthology.orchestra.run.smithi094.stderr:create_volume: /volumes/grpid/volid
2018-07-24T21:06:03.362 INFO:teuthology.orchestra.run.smithi094.stderr:create_volume: grpid/volid, create pool fsvolume_volid as data_isolated =True.
2018-07-24T21:06:03.367 INFO:tasks.ceph.mon.a.smithi094.stderr:2018-07-24 21:06:03.369 7fd0c0eec700 -1 bad boost::get: key pg_num is not type long
2018-07-24T21:06:03.381 INFO:tasks.ceph.mon.a.smithi094.stderr:2018-07-24 21:06:03.373 7fd0c0eec700 -1  ceph version 14.0.0-1527-ga940f3f (a940f3faed0869db872f162620fcd7a6e71ff43b) nautilus (dev)
2018-07-24T21:06:03.381 INFO:tasks.ceph.mon.a.smithi094.stderr: 1: (bool cmd_getval<long>(CephContext*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, boost::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > >, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, boost::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > > > > > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, long&)+0xf3) [0x5652718bdb23]
2018-07-24T21:06:03.381 INFO:tasks.ceph.mon.a.smithi094.stderr: 2: (OSDMonitor::prepare_command_impl(boost::intrusive_ptr<MonOpRequest>, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, boost::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > >, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, boost::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long, double, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<long, std::allocator<long> >, std::vector<double, std::allocator<double> > > > > > const&)+0x1b841) [0x565271a2d6b1]
2018-07-24T21:06:03.382 INFO:tasks.ceph.mon.a.smithi094.stderr: 3: (OSDMonitor::prepare_command(boost::intrusive_ptr<MonOpRequest>)+0x252) [0x565271a35fa2]
2018-07-24T21:06:03.383 INFO:tasks.ceph.mon.a.smithi094.stderr: 4: (OSDMonitor::prepare_update(boost::intrusive_ptr<MonOpRequest>)+0x170) [0x565271a36300]
2018-07-24T21:06:03.383 INFO:tasks.ceph.mon.a.smithi094.stderr: 5: (PaxosService::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x9a6) [0x5652719c6616]
2018-07-24T21:06:03.383 INFO:tasks.ceph.mon.a.smithi094.stderr: 6: (Monitor::handle_command(boost::intrusive_ptr<MonOpRequest>)+0x1e62) [0x56527188feb2]
2018-07-24T21:06:03.383 INFO:tasks.ceph.mon.a.smithi094.stderr: 7: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x39d) [0x56527189492d]
2018-07-24T21:06:03.384 INFO:tasks.ceph.mon.a.smithi094.stderr: 8: (Monitor::_ms_dispatch(Message*)+0x8b2) [0x5652718961b2]
2018-07-24T21:06:03.384 INFO:tasks.ceph.mon.a.smithi094.stderr: 9: (Monitor::ms_dispatch(Message*)+0x23) [0x5652718be3e3]
2018-07-24T21:06:03.384 INFO:tasks.ceph.mon.a.smithi094.stderr: 10: (DispatchQueue::entry()+0xb92) [0x7fd0ccc87392]
2018-07-24T21:06:03.384 INFO:tasks.ceph.mon.a.smithi094.stderr: 11: (DispatchQueue::DispatchThread::entry()+0xd) [0x7fd0ccd29e6d]
2018-07-24T21:06:03.385 INFO:tasks.ceph.mon.a.smithi094.stderr: 12: (()+0x76db) [0x7fd0cbc826db]
2018-07-24T21:06:03.385 INFO:tasks.ceph.mon.a.smithi094.stderr: 13: (clone()+0x3f) [0x7fd0cac4688f]
2018-07-24T21:06:03.385 INFO:tasks.ceph.mon.a.smithi094.stderr:
2018-07-24T21:06:03.890 INFO:teuthology.orchestra.run.smithi094.stderr:create_volume: grpid/volid, using rados namespace fsvolumens_volid to isolate data.
2018-07-24T21:06:03.896 INFO:teuthology.orchestra.run.smithi094.stderr:disconnect
2018-07-24T21:06:03.896 INFO:teuthology.orchestra.run.smithi094.stderr:Disconnecting cephfs...
2018-07-24T21:06:03.901 INFO:teuthology.orchestra.run.smithi094.stderr:Disconnecting cephfs complete
2018-07-24T21:06:03.902 INFO:teuthology.orchestra.run.smithi094.stderr:Disconnecting rados...
2018-07-24T21:06:03.902 INFO:teuthology.orchestra.run.smithi094.stderr:Disconnecting rados complete
2018-07-24T21:06:03.906 INFO:teuthology.orchestra.run.smithi094.stderr:disconnect

From: /ceph/teuthology-archive/pdonnell-2018-07-24_20:41:52-fs-wip-rishabh-testing-volclient-py3compat-distro-basic-smithi/2810409/teuthology.log

Because the division operator changed behavior between py2 and py3, the "pg_num" argument to the "osd pool create" command was transmitted as a float. The command sanitizer apparently accepted this even though the type conflicts with the Int type defined for that argument in src/mon/MonCommands.h. When ceph-mon then requested the value of the argument as an int64, the lookup failed with the "bad boost::get: key pg_num is not type long" error shown above.
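To make the py2/py3 difference concrete, here is a minimal sketch of how a float can end up in the command JSON. The helper build_pool_create_cmd and the numbers are hypothetical and chosen only for illustration; this is not the actual CephFSVolumeClient code. Only the "/" versus "//" behavior and the json.dumps output are the point.

```python
import json


def build_pool_create_cmd(pool_name, pg_num):
    """Hypothetical helper: serialize an "osd pool create" command as JSON."""
    return json.dumps({
        "prefix": "osd pool create",
        "pool": pool_name,
        "pg_num": pg_num,
    })


# Illustrative inputs only (assumptions, not values taken from the log).
target_pgs = 300
replicas = 3

# In Python 3, "/" is true division and always yields a float,
# even when the operands divide evenly. In Python 2 the same
# expression on two ints yielded an int.
pg_num_true_div = target_pgs / replicas

# "//" is floor division and yields an int on both Python 2 and 3.
pg_num_floor_div = target_pgs // replicas

cmd_float = build_pool_create_cmd("fsvolume_volid", pg_num_true_div)
cmd_int = build_pool_create_cmd("fsvolume_volid", pg_num_floor_div)

print(cmd_float)  # pg_num is serialized as 100.0
print(cmd_int)    # pg_num is serialized as 100
```

Under these assumptions, switching to "//" (or wrapping the result in int()) on the client side keeps "pg_num" an integer in the JSON sent to the mon.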


Related issues

Copied to Ceph - Backport #26918: mimic: common: (mon) command sanitization accepts floats when Int type is defined resulting in exception fault in ceph-mon Rejected
Copied to Ceph - Backport #26919: luminous: common: (mon) command sanitization accepts floats when Int type is defined resulting in exception fault in ceph-mon Resolved

History

#1 Updated by Patrick Donnelly over 5 years ago

  • Subject changed from common: (mon) command sanitization accepts floats when Int type is defined resulting in segfault of ceph-mon to common: (mon) command sanitization accepts floats when Int type is defined resulting in exception fault in ceph-mon

#2 Updated by Sage Weil over 5 years ago

  • Status changed from New to Fix Under Review
  • Backport changed from mimic to mimic,luminous

#3 Updated by Kefu Chai over 5 years ago

i don't think it's an "exception fault"; the backtrace was printed by "handle_bad_get()", not by "assert()".

the problem is twofold:

- we should fix CephFSVolumeClient._create_volume_pool(), so it sends the right json to mon.
- we should not accept parameters of the wrong type in a JSON command and then silently fall back to the default value.
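The second point could look roughly like the following sketch of a strict integer check. The function name validate_int_arg and the error message are hypothetical; this is not Ceph's actual command sanitizer, only an illustration of rejecting a float instead of accepting it and using a default.

```python
def validate_int_arg(name, value):
    """Hypothetical strict check: accept only real integers for an Int argument.

    bool is a subclass of int in Python, so it is rejected explicitly.
    Floats are rejected even when they hold a whole number (e.g. 100.0).
    """
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(
            "argument %r must be an integer, got %s"
            % (name, type(value).__name__)
        )
    return value


# An integer passes through unchanged.
checked = validate_int_arg("pg_num", 128)

# A float is rejected up front rather than being accepted and ignored later.
try:
    validate_int_arg("pg_num", 128.0)
    rejected = False
except TypeError:
    rejected = True
```

Failing early like this returns an error to the client at validation time, rather than letting the mismatch surface when the monitor reads the argument.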

#4 Updated by Patrick Donnelly over 5 years ago

Kefu Chai wrote:

- we should fix CephFSVolumeClient._create_volume_pool(), so it sends the right json to mon.

https://github.com/ceph/ceph/pull/21948#pullrequestreview-140818495

#5 Updated by Sage Weil over 5 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Priority changed from Urgent to Normal

let's let this bake for a while

#6 Updated by Patrick Donnelly over 5 years ago

  • Copied to Backport #26918: mimic: common: (mon) command sanitization accepts floats when Int type is defined resulting in exception fault in ceph-mon added

#7 Updated by Patrick Donnelly over 5 years ago

  • Copied to Backport #26919: luminous: common: (mon) command sanitization accepts floats when Int type is defined resulting in exception fault in ceph-mon added

#8 Updated by Nathan Cutler about 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
