Project

General

Profile

Bug #9794

vstart.sh crashes MON with --paxos-propose-interval=0.01 and one MDS

Added by Loic Dachary almost 5 years ago. Updated almost 5 years ago.

Status:
Resolved
Priority:
Normal
Category:
Monitor
Target version:
-
Start date:
10/15/2014
Due date:
% Done:

100%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

It must be something specific to my environment that triggers it because this is exactly the same command that is run as part of make check and it succeeds in the gitbuilder. If I remove the "paxos propose interval = 0.01" the problem does not show.

$ ./ceph --version
ceph version 0.86-377-g4606af7 (4606af7901392f44c162bee88bde432d430ffd4a)
$ MDS=1 MON=1 OSD=3 ./vstart.sh -o 'paxos propose interval = 0.01' -n -l mon osd mds
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
pool 'cephfs_data' created
./ceph -c /home/loic/software/ceph/ceph/src/ceph.conf -k /home/loic/software/ceph/ceph/src/keyring osd pool create cephfs_metadata 8
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
pool 'cephfs_metadata' created
./ceph -c /home/loic/software/ceph/ceph/src/ceph.conf -k /home/loic/software/ceph/ceph/src/keyring fs new cephfs cephfs_metadata cephfs_data
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2014-10-15 18:03:03.513076 7fe8e0e30700  0 monclient: hunting for new mon
2014-10-15 18:03:03.513209 7fe8e04de700  0 -- 127.0.0.1:0/1030496 >> 127.0.0.1:6789/0 pipe(0x7fe8cc005830 sd=7 :0 s=1 pgs=0 cs=0 l=1 c=0x7fe8cc009ad0).fault
2014-10-15 18:03:06.012973 7fe8e05df700  0 -- 127.0.0.1:0/1030496 >> 127.0.0.1:6789/0 pipe(0x7fe8c0000c00 sd=8 :0 s=1 pgs=0 cs=0 l=1 c=0x7fe8c0004ea0).fault

the mon log end with
   -26> 2014-10-15 18:03:03.079550 7fa64d59a700 10 mon.a@0(leader).pg v15 update_logger
   -25> 2014-10-15 18:03:03.079605 7fa64d59a700 10 mon.a@0(leader).log v32 update_from_paxos
   -24> 2014-10-15 18:03:03.079610 7fa64d59a700 10 mon.a@0(leader).log v32 update_from_paxos version 32 summary v 32
   -23> 2014-10-15 18:03:03.079662 7fa64d59a700 10 mon.a@0(leader).auth v6 update_from_paxos
   -22> 2014-10-15 18:03:03.079668 7fa64d59a700 10 mon.a@0(leader).pg v15 map_pg_creates to 17 pgs -- no change
   -21> 2014-10-15 18:03:03.079674 7fa64d59a700 10 mon.a@0(leader).pg v15 send_pg_creates to 17 pgs
   -20> 2014-10-15 18:03:03.079711 7fa64bca7700  1 -- 127.0.0.1:6789/0 <== client.4111 127.0.0.1:0/1030496 8 ==== mon_command({"prefix": "fs new", "data": "cephfs_data", "fs_name": "cephfs", "metadata": "cephfs_metadata"} v 0) v1 ==== 137+0+0 (2653447513 0 0) 0x4b1fc00 con 0x4e798c0
   -19> 2014-10-15 18:03:03.307378 7fa64d59a700 10 mon.a@0(leader).pg v15 create_pending v 16
   -18> 2014-10-15 18:03:03.307393 7fa64d59a700 10 mon.a@0(leader).pg v15 check_osd_map already seen 13 >= 13
   -17> 2014-10-15 18:03:03.307394 7fa64d59a700 10 mon.a@0(leader).pg v15 update_logger
   -16> 2014-10-15 18:03:03.307411 7fa64d59a700  0 log_channel(cluster) log [INF] : pgmap v15: 24 pgs: 2 active+undersized+degraded, 17 creating, 5 peering; 0 bytes data, 164 GB used, 18242 MB / 182 GB avail
   -15> 2014-10-15 18:03:03.307415 7fa64d59a700 10 log_client _send_to_monlog to self
   -14> 2014-10-15 18:03:03.307416 7fa64d59a700 10 log_client  log_queue is 3 last_log 58 sent 57 num 3 unsent 1 sending 1
   -13> 2014-10-15 18:03:03.307419 7fa64d59a700 10 log_client  will send 2014-10-15 18:03:03.307413 mon.0 127.0.0.1:6789/0 58 : cluster [INF] pgmap v15: 24 pgs: 2 active+undersized+degraded, 17 creating, 5 peering; 0 bytes data, 164 GB used, 18242 MB / 182 GB avail
   -12> 2014-10-15 18:03:03.307431 7fa64d59a700  1 -- 127.0.0.1:6789/0 --> 127.0.0.1:6789/0 -- log(1 entries) v1 -- ?+0 0x4b2bf80 con 0x4b3c9a0
   -11> 2014-10-15 18:03:03.307625 7fa64bca7700  0 mon.a@0(leader) e1 handle_command mon_command({"prefix": "fs new", "data": "cephfs_data", "fs_name": "cephfs", "metadata": "cephfs_metadata"} v 0) v1
   -10> 2014-10-15 18:03:03.307658 7fa64bca7700 10 mon.a@0(leader) e1 _allowed_command capable
    -9> 2014-10-15 18:03:03.307664 7fa64bca7700  0 log_channel(audit) log [INF] : from='client.? 127.0.0.1:0/1030496' entity='client.admin' cmd=[{"prefix": "fs new", "data": "cephfs_data", "fs_name": "cephfs", "metadata": "cephfs_metadata"}]: dispatch
    -8> 2014-10-15 18:03:03.307667 7fa64bca7700 10 log_client _send_to_monlog to self
    -7> 2014-10-15 18:03:03.307667 7fa64bca7700 10 log_client  log_queue is 4 last_log 59 sent 58 num 4 unsent 1 sending 1
    -6> 2014-10-15 18:03:03.307669 7fa64bca7700 10 log_client  will send 2014-10-15 18:03:03.307666 mon.0 127.0.0.1:6789/0 59 : audit [INF] from='client.? 127.0.0.1:0/1030496' entity='client.admin' cmd=[{"prefix": "fs new", "data": "cephfs_data", "fs_name": "cephfs", "metadata": "cephfs_metadata"}]: dispatch
    -5> 2014-10-15 18:03:03.307675 7fa64bca7700  1 -- 127.0.0.1:6789/0 --> 127.0.0.1:6789/0 -- log(1 entries) v1 -- ?+0 0x4b2d180 con 0x4b3c9a0
    -4> 2014-10-15 18:03:03.307683 7fa64bca7700  5 mon.a@0(leader).paxos(paxos writing c 1..68) is_readable = 1 - now=2014-10-15 18:03:03.307684 lease_expire=0.000000 has v0 lc 68
    -3> 2014-10-15 18:03:03.307697 7fa64bca7700 10 mon.a@0(leader).mds e1 preprocess_query mon_command({"prefix": "fs new", "data": "cephfs_data", "fs_name": "cephfs", "metadata": "cephfs_metadata"} v 0) v1 from client.4111 127.0.0.1:0/1030496
    -2> 2014-10-15 18:03:03.307798 7fa64bca7700  7 mon.a@0(leader).mds e1 prepare_update mon_command({"prefix": "fs new", "data": "cephfs_data", "fs_name": "cephfs", "metadata": "cephfs_metadata"} v 0) v1
    -1> 2014-10-15 18:03:03.307917 7fa64bca7700 10 mon.a@0(leader).osd e13 Set pool 1 crash_replay_interval=45
     0> 2014-10-15 18:03:03.314437 7fa64bca7700 -1 mon/PaxosService.h: In function 'void PaxosService::request_proposal(PaxosService*)' thread 7fa64bca7700 time 2014-10-15 18:03:03.307932
mon/PaxosService.h: 309: FAILED assert(other->is_writeable())

 ceph version 0.86-377-g4606af7 (4606af7901392f44c162bee88bde432d430ffd4a)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x95) [0x107cb59]
 2: (PaxosService::request_proposal(PaxosService*)+0x65) [0xea0ddf]
 3: (MDSMonitor::management_command(std::string const&, std::map<std::string, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::less<std::string>, std::allocator<std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >&, std::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >&, int&)+0xc5b) [0xe98e9f]
 4: (MDSMonitor::prepare_command(MMonCommand*)+0x2a8) [0xe97d0a]
 5: (MDSMonitor::prepare_update(PaxosServiceMessage*)+0x18b) [0xe93033]
 6: (PaxosService::dispatch(PaxosServiceMessage*)+0xaeb) [0xe13837]
 7: (Monitor::handle_command(MMonCommand*)+0x128b) [0xd7e461]
 8: (Monitor::dispatch(MonSession*, Message*, bool)+0x28b) [0xd861bb]
 9: (Monitor::_ms_dispatch(Message*)+0xdc2) [0xd85dc6]
 10: (Monitor::ms_dispatch(Message*)+0x38) [0xdba4d8]
 11: (Messenger::ms_deliver_dispatch(Message*)+0x9b) [0x11c6323]
 12: (DispatchQueue::entry()+0x38a) [0x11c590c]
 13: (DispatchQueue::DispatchThread::entry()+0x1c) [0x105f29c]
 14: (Thread::entry_wrapper()+0x84) [0x106aece]
 15: (Thread::_entry_func(void*)+0x18) [0x106ae40]
 16: (()+0x8182) [0x7fa65274e182]
 17: (clone()+0x6d) [0x7fa650d3c30d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


Related issues

Related to Ceph - Bug #9321: pgmap updates from OSDMap can be delayed indefinitely Resolved 09/02/2014

Associated revisions

Revision 2ae1cba5 (diff)
Added by Joao Eduardo Luis almost 5 years ago

mon: MDSMonitor: wait for osdmon to be writable when requesting proposal

Otherwise we may end up requesting the osdmon to propose while it is
mid-proposal. We can't simply return EAGAIN to the user either because
then we would have to expect the user to be able to successfully race
with the whole cluster in finding a window in which 'mds fs new' command
would succeed -- which is not a realistic expectation. Having the
command to osdmon()->wait_for_writable() guarantees that the command
will be added to a queue and that we will, eventually, tend to it.

Fixes: #9794

Signed-off-by: Joao Eduardo Luis <>

Revision 5f4e62f3 (diff)
Added by Joao Eduardo Luis over 4 years ago

mon: MDSMonitor: wait for osdmon to be writable when requesting proposal

Otherwise we may end up requesting the osdmon to propose while it is
mid-proposal. We can't simply return EAGAIN to the user either because
then we would have to expect the user to be able to successfully race
with the whole cluster in finding a window in which 'mds fs new' command
would succeed -- which is not a realistic expectation. Having the
command to osdmon()->wait_for_writable() guarantees that the command
will be added to a queue and that we will, eventually, tend to it.

Fixes: #9794

Signed-off-by: Joao Eduardo Luis <>
(cherry picked from commit 2ae1cba595d9c56a0a4c534b34fe25250e7eb2d5)

History

#1 Updated by Loic Dachary almost 5 years ago

reproduced on a fresh ubuntu 14.04 with v0.86-408-gad2514d

#2 Updated by Loic Dachary almost 5 years ago

  • Status changed from New to Verified
  • Assignee set to Loic Dachary

It works on v0.85, bissecting

#3 Updated by Loic Dachary almost 5 years ago

d15ecafea474e17f055ecd2717e9eff3259b6148 is the first bad commit
commit d15ecafea474e17f055ecd2717e9eff3259b6148
Author: John Spray <john.spray@redhat.com>
Date:   Fri Sep 26 17:24:12 2014 +0100

    vstart: create fewer pgs for fs pools

    Just so that dev clusters come up more
    quickly.

    Signed-off-by: John Spray <john.spray@redhat.com>

:040000 040000 f42a324a87c36513a1fd39b8dce750b9e3687720 aa64cdc1ed3eb629dfd3b60439fcf7790b64257f M    src
bisect run success

#4 Updated by Loic Dachary almost 5 years ago

Reverting to 128 PG on master makes the problem disapear. 92 PG also works. 64 PG fails.

#5 Updated by Joao Eduardo Luis almost 5 years ago

Loic, try this patch with the same conditions in which you triggered it: c84a13ae87eed5550bafda394d983a8e843cc08c

that patch has been sitting on wip-9321.giant and is intended to fix the behavior that is causing the daemon to assert (MDSMonitor needs to wait on osdmon to be writeable before requesting it to propose).

Hopefully you're able to reproduce it reliably enough to make sure the patch fixes it.

#6 Updated by Loic Dachary almost 5 years ago

  • Status changed from Verified to Testing
  • Assignee changed from Loic Dachary to Joao Eduardo Luis
  • % Done changed from 0 to 90

I confirm that

  • the problem can be reproduced 100% of the time on my laptop,
  • that cherry-pick c84a13ae87eed5550bafda394d983a8e843cc08c on d15ecafea474e17f055ecd2717e9eff3259b6148 (the commit where it started to show) fixes it
  • the fix makes sense

I'm not a big fan of bool const and prefer const bool btw ;-)

#7 Updated by Samuel Just almost 5 years ago

  • Status changed from Testing to Resolved

#8 Updated by Samuel Just almost 5 years ago

  • Status changed from Resolved to Testing

#9 Updated by Joao Eduardo Luis almost 5 years ago

  • Status changed from Testing to Resolved

checked firefly; no need to backport.

#10 Updated by Joao Eduardo Luis almost 5 years ago

  • % Done changed from 90 to 100

Also available in: Atom PDF