Ceph - Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
https://tracker.ceph.com/issues/3785?journal_id=15659 2013-01-11T09:39:06Z Ian Colle icolle@redhat.com
<ul><li><strong>Assignee</strong> set to <i>Sage Weil</i></li><li><strong>Priority</strong> changed from <i>Normal</i> to <i>High</i></li></ul>
https://tracker.ceph.com/issues/3785?journal_id=15667 2013-01-11T10:11:39Z Greg Farnum gfarnum@redhat.com
<p>The issue here is that CRUSH maps that behave well on multi-host deployments behave quite poorly on one- or two-host deployments. The mkcephfs build path actually does handle this fairly politely, though, and I think (perhaps erroneously) that ceph-deploy is optimized for larger clusters.<br />Which deployment mechanism are you using?</p>
https://tracker.ceph.com/issues/3785?journal_id=15683 2013-01-11T10:55:14Z Anonymous
<p>I agree with Ian; I have seen <strong>very bad things</strong> happen when CRUSH chooses two OSDs on one host rather than distributing replicas to different hosts.</p>
<p>It is nice to know that mkcephfs has a mechanism to balance the load so this won't happen. But this is a scalable product. Customers are supposed to use 'ceph osd add' to add more OSDs to the cluster.</p>
<p>Does 'ceph osd add' take CRUSH host balancing into consideration when doing an add? Do we have instructions for handling that manually?</p>
<p>I think there should be a default rule that says data replicas cannot be written to the same host as the original, no matter how the OSD was added.</p>
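<p>To make the difference concrete, here is a minimal Python sketch of the two placement policies being discussed. It is a toy model and not Ceph's CRUSH implementation: the cluster layout, the hash-based scoring, and every function name below are assumptions made up for illustration.</p>

```python
import hashlib
from collections import defaultdict

# Hypothetical toy cluster (not from the bug report): OSD id -> host name.
OSD_HOSTS = {0: "host-a", 1: "host-a", 2: "host-b",
             3: "host-b", 4: "host-c", 5: "host-c"}


def _score(key, item):
    """Deterministic pseudo-random score; a stand-in for CRUSH's real hash."""
    digest = hashlib.sha256(f"{key}:{item}".encode()).hexdigest()
    return int(digest, 16)


def place_by_osd(pg, replicas, osd_hosts):
    """Pick the top-scoring OSDs, ignoring which host each one lives on."""
    ranked = sorted(osd_hosts, key=lambda osd: _score(pg, osd), reverse=True)
    return ranked[:replicas]


def place_by_host(pg, replicas, osd_hosts):
    """Pick distinct hosts first, then one OSD inside each chosen host."""
    by_host = defaultdict(list)
    for osd, host in osd_hosts.items():
        by_host[host].append(osd)
    hosts = sorted(by_host, key=lambda h: _score(pg, h), reverse=True)
    return [max(by_host[h], key=lambda osd: _score(pg, osd))
            for h in hosts[:replicas]]


def hosts_used(osds, osd_hosts):
    """Return the set of hosts that a placement touches."""
    return {osd_hosts[osd] for osd in osds}


# Count how many of 100 placement groups put both replicas on one host.
same_host = sum(
    1 for pg in range(100)
    if len(hosts_used(place_by_osd(pg, 2, OSD_HOSTS), OSD_HOSTS)) < 2
)
print("OSD-level policy, PGs with both replicas on one host:", same_host)
```

<p>In this model the host-level policy never puts two replicas on the same host, but on a map with a single host it can only return one OSD, which is the kind of trade-off Greg describes for one- or two-host deployments.</p>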
<p>Just my 2 cents... :-)</p>
https://tracker.ceph.com/issues/3785?journal_id=15701 2013-01-11T11:32:39Z Anonymous
<p>This comment should have been in bug 3789.</p>
<p>Upping the memory on these VMs from 512M to 2G.</p>
<p>Since it appears it was a resource problem, I will close this bug.</p>
<p>Do we have any mechanism that I am missing that notifies the end user when crashes like this occur, so they can go in and fix their cluster before a critical number of resources have failed?</p>
https://tracker.ceph.com/issues/3785?journal_id=15703 2013-01-11T11:34:09Z Anonymous
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Won't Fix</i></li></ul><p>This comment should have been in bug 3789.</p>
<p>Caused by a lack of resources on the system.<br />I have increased the memory from 512M to 2G and will retest.</p>
https://tracker.ceph.com/issues/3785?journal_id=15709 2013-01-11T12:34:59Z Dan Mick dmick@redhat.com
<p>I think maybe Deb's comments and closure were meant for another bug (perhaps 3789?).</p>
https://tracker.ceph.com/issues/3785?journal_id=15712 2013-01-11T13:59:47Z Anonymous
<ul><li><strong>Status</strong> changed from <i>Won't Fix</i> to <i>New</i></li></ul><p>Dang! Wrong bug. Opening this one back up.<br />Sorry, all!</p>
https://tracker.ceph.com/issues/3785?journal_id=15722 2013-01-11T17:24:01Z Sage Weil sage@newdream.net
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Fix Under Review</i></li><li><strong>Assignee</strong> changed from <i>Sage Weil</i> to <i>Greg Farnum</i></li></ul><p>wip-3785</p>
https://tracker.ceph.com/issues/3785?journal_id=15724 2013-01-11T17:39:54Z Greg Farnum gfarnum@redhat.com
<p>Looks good to me. Which branches do we want to cherry-pick it to?</p>
https://tracker.ceph.com/issues/3785?journal_id=15725 2013-01-11T17:45:56Z Sage Weil sage@newdream.net
<p>Good question. Let's start with bobtail.</p>
https://tracker.ceph.com/issues/3785?journal_id=15727 2013-01-11T17:51:57Z Greg Farnum gfarnum@redhat.com
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Resolved</i></li></ul><p>Merged to master in <a class="changeset" title="osdmap: spread replicas across hosts with default crush map This is more often the case than not..." href="https://tracker.ceph.com/projects/ceph/repository/revisions/7ea5d84fa3d0ed3db61eea7eb9fa8dbee53244b6">7ea5d84fa3d0ed3db61eea7eb9fa8dbee53244b6</a> and cherry-picked to bobtail in commit:503917f0049d297218b1247dc0793980c39195b3.</p>
https://tracker.ceph.com/issues/3785?journal_id=15729 2013-01-12T23:01:21Z Sage Weil sage@newdream.net
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>Fix Under Review</i></li></ul><p>Der, broke vstart. Can you review wip-3785?</p>
https://tracker.ceph.com/issues/3785?journal_id=15732 2013-01-13T22:01:45Z Greg Farnum gfarnum@redhat.com
<p><strong>sigh</strong></p>
<p>This also looks good to me, and I like it better (should have suggested this the first time around). But now I've gotten scared again; have you run this outside of vstart? :)</p>
https://tracker.ceph.com/issues/3785?journal_id=15735 2013-01-13T22:11:57Z Sage Weil sage@newdream.net
<p>Nope... which leads me to realize that that setting needs to go in teuthology's ceph.conf. Doing that now, and then I'll run it through the suite.</p>
https://tracker.ceph.com/issues/3785?journal_id=16002 2013-01-17T15:03:39Z Sage Weil sage@newdream.net
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Resolved</i></li></ul><p>commit:f358cb1d2b0a3a78bf59c4fd085906fcb5541bbe</p>
https://tracker.ceph.com/issues/3785?journal_id=16003 2013-01-17T15:17:07Z Greg Farnum gfarnum@redhat.com
<p>I presume we're planning to backport this to bobtail after it passes some nights of testing? Maybe we should leave the bug in "testing" until then (or until we get our "Needs Backport" status!).</p>