https://tracker.ceph.com/
2016-07-12T06:51:37Z
Ceph
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74237
2016-07-12T06:51:37Z
Xiaoxi Chen
xiaoxchen@ebay.com
<ul></ul><p>Tried but didn't reproduce it.<br />Can you reproduce it reliably?</p>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74238
2016-07-12T07:41:56Z
Oliver Dzombc
info@ip-interactive.de
<ul></ul><p>Hi,</p>
<p>Yep, it happens every time, 100% "success".</p>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74241
2016-07-12T08:52:38Z
Oliver Dzombc
info@ip-interactive.de
<ul></ul><p>Here is the current crushmap:</p>
<pre><code># begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable straw_calc_version 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9
device 10 osd.10
device 11 osd.11
device 12 osd.12
device 13 osd.13
device 14 osd.14
device 15 osd.15

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host cephosd2-ssd-cache {
        id -1           # do not change unnecessarily
        # weight 0.872
        alg straw
        hash 0  # rjenkins1
        item osd.8 weight 0.218
        item osd.9 weight 0.218
        item osd.10 weight 0.218
        item osd.11 weight 0.218
}
host cephosd2-cold-storage {
        id -2           # do not change unnecessarily
        # weight 14.548
        alg straw
        hash 0  # rjenkins1
        item osd.12 weight 3.637
        item osd.13 weight 3.637
        item osd.14 weight 3.637
        item osd.15 weight 3.637
}
host cephosd1-ssd-cache {
        id -3           # do not change unnecessarily
        # weight 0.872
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 0.218
        item osd.1 weight 0.218
        item osd.2 weight 0.218
        item osd.3 weight 0.218
}
host cephosd1-cold-storage {
        id -4           # do not change unnecessarily
        # weight 14.548
        alg straw
        hash 0  # rjenkins1
        item osd.4 weight 3.637
        item osd.5 weight 3.637
        item osd.6 weight 3.637
        item osd.7 weight 3.637
}
root ssd-cache {
        id -5           # do not change unnecessarily
        # weight 1.704
        alg straw
        hash 0  # rjenkins1
        item cephosd1-ssd-cache weight 0.852
        item cephosd2-ssd-cache weight 0.852
}
root cold-storage {
        id -6           # do not change unnecessarily
        # weight 29.094
        alg straw
        hash 0  # rjenkins1
        item cephosd1-cold-storage weight 14.547
        item cephosd2-cold-storage weight 14.547
}

# rules
rule ssd-cache-rule {
        ruleset 1
        type replicated
        min_size 2
        max_size 10
        step take ssd-cache
        step chooseleaf firstn 0 type host
        step emit
}
rule cold-storage-rule {
        ruleset 2
        type replicated
        min_size 2
        max_size 10
        step take cold-storage
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map</code></pre>
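<p>For context, a plain-text map like the one above is normally dumped from the cluster and decompiled before editing; a minimal sketch of that step (file names here are arbitrary):</p>
<pre><code># dump the compiled crushmap from the monitors and decompile it for editing
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt</code></pre>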
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74700
2016-07-13T07:23:10Z
Oliver Dzombc
info@ip-interactive.de
<ul></ul><p>If I run:</p>
<pre><code># ceph osd pool create vmware1 64 cold-storage-rule
pool 'vmware1' created</code></pre>
<p>I would expect the pool to have ruleset 2.</p>
<pre><code># ceph osd pool ls detail
pool 10 'vmware1' replicated size 3 min_size 2 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 483 flags hashpspool stripe_width 0</code></pre>
<p>but it has crush_ruleset 1.</p>
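<p>For reference, the operation that then hits the crash described in this ticket is changing the pool's ruleset afterwards; a sketch using the pool from the example above:</p>
<pre><code># move the pool onto the intended ruleset (id 2)
ceph osd pool set vmware1 crush_ruleset 2</code></pre>
<p>With no ruleset 0 in the crushmap, this is the command after which the monitor segfaults.</p>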
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74790
2016-07-14T13:14:14Z
Oliver Dzombc
info@ip-interactive.de
<ul></ul><p>Hi,</p>
<p>So, is there anything I can do to get more info about it?</p>
<p>It's a big problem that we cannot add any pools. crush_ruleset 1 is the SSD cache tier, so holding pool data there is not really what we want.</p>
<p>Thank you !</p>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74815
2016-07-14T22:40:22Z
Oliver Dzombc
info@ip-interactive.de
<ul></ul><p>Hi Xiaoxi Chen,</p>
<p>so that you have something to reproduce:</p>
<p>Edit your crushmap, remove ruleset 0.</p>
<p>So if your crushmap does not have a ruleset 0, you have the bug.</p>
<p>My crushmap had rulesets 1 and 2. There was no 0.</p>
<p>That causes the bug, reproducibly. After I fixed it, everything works again as expected.</p>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74851
2016-07-15T11:44:56Z
Artemy Kapitula
artemy.kapitula@rcntec.com
<ul></ul><p>Exactly the same problem on 10.2.1.</p>
<p>It's DEADLY critical</p>
<pre><code>ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269)
 1: (()+0x5054ba) [0x5626fe81a4ba]
 2: (()+0xf100) [0x7f5e7446f100]
 3: (OSDMonitor::prepare_command_pool_set(std::map<std::string, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::less<std::string>, std::allocator<std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >&, std::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >&)+0x122f) [0x5626fe6268df]
 4: (OSDMonitor::prepare_command_impl(std::shared_ptr<MonOpRequest>, std::map<std::string, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, std::less<std::string>, std::allocator<std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> > > >&)+0xf02c) [0x5626fe6365ec]
 5: (OSDMonitor::prepare_command(std::shared_ptr<MonOpRequest>)+0x64f) [0x5626fe63b3cf]
 6: (OSDMonitor::prepare_update(std::shared_ptr<MonOpRequest>)+0x307) [0x5626fe63cf27]
 7: (PaxosService::dispatch(std::shared_ptr<MonOpRequest>)+0xe0b) [0x5626fe5eb51b]
 8: (Monitor::handle_command(std::shared_ptr<MonOpRequest>)+0x1d1f) [0x5626fe5a753f]
 9: (Monitor::dispatch_op(std::shared_ptr<MonOpRequest>)+0x33b) [0x5626fe5b30bb]
 10: (Monitor::_ms_dispatch(Message*)+0x6c9) [0x5626fe5b4459]
 11: (Monitor::handle_forward(std::shared_ptr<MonOpRequest>)+0x89c) [0x5626fe5b28ec]
 12: (Monitor::dispatch_op(std::shared_ptr<MonOpRequest>)+0xc70) [0x5626fe5b39f0]
 13: (Monitor::_ms_dispatch(Message*)+0x6c9) [0x5626fe5b4459]
 14: (Monitor::ms_dispatch(Message*)+0x23) [0x5626fe5d4f73]
 15: (DispatchQueue::entry()+0x78a) [0x5626fea2d9fa]
 16: (DispatchQueue::DispatchThread::entry()+0xd) [0x5626fe92310d]
 17: (()+0x7dc5) [0x7f5e74467dc5]
 18: (clone()+0x6d) [0x7f5e72d3028d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.</code></pre>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74852
2016-07-15T11:59:53Z
Oliver Dzombc
info@ip-interactive.de
<ul></ul><p>Hi Artemy,</p>
<p>did you already check my workaround?</p>
<p>Simply add a default ruleset with id 0.</p>
<p>Something like:</p>
<pre><code>rule default {
    ruleset 0
    type replicated
    min_size 2
    max_size 10
    step chooseleaf firstn 0 type host
    step emit
}</code></pre>
<p>Should already fix the effect of the issue.</p>
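<p>For completeness, a sketch of the usual way to inject such a rule into a running cluster, assuming the map was decompiled to crushmap.txt as in the earlier sketch (file names are arbitrary):</p>
<pre><code># after adding the "rule default { ruleset 0 ... }" block to crushmap.txt:
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new</code></pre>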
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74863
2016-07-16T05:43:25Z
Xiaoxi Chen
xiaoxchen@ebay.com
<ul></ul><p>Hi Oliver Dzombc,<br /> would you mind pasting the PR link here?</p>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74874
2016-07-18T07:24:43Z
Artemy Kapitula
artemy.kapitula@rcntec.com
<ul></ul><blockquote>
<p>did you already check my workaround?<br />Simply add a default ruleset with id 0.</p>
</blockquote>
<p>Hi Oliver!</p>
<p>Yes, I tried it today on a test/dev cluster.<br />No effect.<br />2 of 3 mons crashed.</p>
<p>But we've got 10.2.1 now, not 10.2.2.</p>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74877
2016-07-18T07:53:35Z
Oliver Dzombc
info@ip-interactive.de
<ul></ul><p>Hi,</p>
<p>if you created <em>exactly</em> this rule:</p>
<pre><code>rule default {
    ruleset 0
    type replicated
    min_size 2
    max_size 10
    step chooseleaf firstn 0 type host
    step emit
}</code></pre>
<p>then I have no idea.</p>
<p>If not, please create exactly that rule and try it out.</p>
<p>Good Luck !</p>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74878
2016-07-18T08:42:33Z
Xiaoxi Chen
xiaoxchen@ebay.com
<ul><li><strong>Assignee</strong> set to <i>Xiaoxi Chen</i></li></ul>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=74910
2016-07-18T16:36:31Z
Xiaoxi Chen
xiaoxchen@ebay.com
<ul></ul><p>Likely fixed by this commit <a class="external" href="https://github.com/ceph/ceph/pull/8480">https://github.com/ceph/ceph/pull/8480</a></p>
<p>The problem is that the 10.2.2 code assumes ruleset N is located in crush->rules[N], but this is not always true. In your case, because you don't have a ruleset 0, ruleset 1 ends up in rules[0] and ruleset 2 in rules[1] when the map is imported. Then, when you set a pool's ruleset to 2, osdmap.crush->get_rule_mask_min_size(n) accesses rules[2] and gets a segmentation fault.</p>
<p>Use "crush rule rm" to delete ruleset will not hit this bug, because the command just set crush->rules[N] to NULL instead of re-placing them.</p>
<p>@Artemy Kapitula, @Oliver Dzombc: it would be great if you could test against master (or cherry-pick this commit), and maybe we will need to backport this.</p>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=75281
2016-07-25T06:16:15Z
Artemy Kapitula
artemy.kapitula@rcntec.com
<ul></ul><p>Hi Xiaoxi Chen!</p>
<p>I did a test with special conditions: three rulesets with ids=0,2,3:</p>
<pre><code>rule replicated_ruleset {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type osd
    step emit
}

rule bbb {
    ruleset 2
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type osd
    step emit
}

rule aaa {
    ruleset 3
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type osd
    step emit
}</code></pre>
<p>set crush_ruleset works fine with rulesets 0 and 2, but segfaults with ruleset 3.<br />The only workaround I found is to keep every ruleset id up to max(id) present.<br />But after a rule removal it may all crash on the first set crush_ruleset :-)<br />I'll try to build ceph with the suggested patches, but that will take some time.</p>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=75354
2016-07-26T06:03:11Z
Artemy Kapitula
artemy.kapitula@rcntec.com
<ul></ul><p>Xiaoxi Chen wrote:</p>
<blockquote>
<p>Likely fixed by this commit <a class="external" href="https://github.com/ceph/ceph/pull/8480">https://github.com/ceph/ceph/pull/8480</a></p>
</blockquote>
<p>Confirmed, set crush_ruleset now works well.</p>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=75451
2016-07-27T15:36:13Z
Nathan Cutler
ncutler@suse.cz
<ul><li><strong>Target version</strong> deleted (<del><i>519</i></del>)</li></ul>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=77121
2016-08-23T16:16:44Z
Kefu Chai
tchaikov@gmail.com
<ul></ul><p>might need to backport <a class="external" href="https://github.com/ceph/ceph/pull/8480">https://github.com/ceph/ceph/pull/8480</a> to jewel</p>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=77122
2016-08-23T16:17:38Z
Kefu Chai
tchaikov@gmail.com
<ul><li><strong>Tracker</strong> changed from <i>Bug</i> to <i>Backport</i></li></ul>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=77311
2016-08-25T11:45:12Z
Loïc Dachary
loic@dachary.org
<ul><li><strong>Tracker</strong> changed from <i>Backport</i> to <i>Bug</i></li><li><strong>Status</strong> changed from <i>New</i> to <i>Pending Backport</i></li><li><strong>% Done</strong> set to <i>0</i></li><li><strong>Backport</strong> set to <i>jewel</i></li></ul>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=77314
2016-08-25T11:46:13Z
Loïc Dachary
loic@dachary.org
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-4 priority-default closed" href="/issues/17135">Backport #17135</a>: jewel: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2</i> added</li></ul>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=79806
2016-10-14T17:03:53Z
Nathan Cutler
ncutler@suse.cz
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li></ul>
Ceph - Bug #16653: ceph mon Segmentation fault after set crush_ruleset ceph 10.2.2
https://tracker.ceph.com/issues/16653?journal_id=91543
2017-05-24T15:40:13Z
Sage Weil
sage@newdream.net
<ul><li><strong>Duplicated by</strong> <i><a class="issue tracker-1 status-10 priority-3 priority-lowest closed" href="/issues/17412">Bug #17412</a>: Applying ruleset halts monitor</i> added</li></ul>