Ceph https://tracker.ceph.com/ 2018-12-08T13:44:09Z
RADOS - Bug #37507: osd_memory_target: failed assert when options mismatch https://tracker.ceph.com/issues/37507?journal_id=125823 2018-12-08T13:44:09Z Konstantin Shalygin k0ste@k0ste.ru
<pre>
Dec 08 19:32:51 ceph-osd0 ceph-osd[171040]: starting osd.0 at - osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
Dec 08 19:32:58 ceph-osd0 ceph-osd[171040]: 2018-12-08 19:32:58.409947 7f1391a7ed80 -1 osd.0 53990 log_to_monitors {default=true}
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.10/rpm/el7/BUILD/ceph-12.2.10/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::MempoolThread::_balance_cache(const std::list<PriorityCache::PriCache*>&)' thread 7f1382673700 time 2018-12-08 19:33:00.001144
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.10/rpm/el7/BUILD/ceph-12.2.10/src/os/bluestore/BlueStore.cc: 3488: FAILED assert(mem_avail >= 0)
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x561dd4c2d7b0]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 2: (()+0x8fc754) [0x561dd4a89754]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 3: (BlueStore::MempoolThread::entry()+0x332) [0x561dd4a8dc72]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 4: (()+0x7e25) [0x7f138ef2ce25]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 5: (clone()+0x6d) [0x7f138e01dbad]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 2018-12-08 19:33:00.003444 7f1382673700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.10/rpm/el7/BUILD/ceph-12.2.10/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::MempoolThread::_balance_cache(const std::list<PriorityCache::PriCache*>&)' thread 7f1382673700 time 2018-12-08 19:33:00.001144
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.10/rpm/el7/BUILD/ceph-12.2.10/src/os/bluestore/BlueStore.cc: 3488: FAILED assert(mem_avail >= 0)
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x561dd4c2d7b0]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 2: (()+0x8fc754) [0x561dd4a89754]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 3: (BlueStore::MempoolThread::entry()+0x332) [0x561dd4a8dc72]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 4: (()+0x7e25) [0x7f138ef2ce25]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 5: (clone()+0x6d) [0x7f138e01dbad]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: -2836> 2018-12-08 19:32:58.409947 7f1391a7ed80 -1 osd.0 53990 log_to_monitors {default=true}
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 0> 2018-12-08 19:33:00.003444 7f1382673700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.10/rpm/el7/BUILD/ceph-12.2.10/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::MempoolThread::_balance_cache(const std::list<PriorityCache::PriCache*>&)' thread 7f1382673700 time 2018-12-08 19:33:00.001144
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.10/rpm/el7/BUILD/ceph-12.2.10/src/os/bluestore/BlueStore.cc: 3488: FAILED assert(mem_avail >= 0)
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x561dd4c2d7b0]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 2: (()+0x8fc754) [0x561dd4a89754]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 3: (BlueStore::MempoolThread::entry()+0x332) [0x561dd4a8dc72]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 4: (()+0x7e25) [0x7f138ef2ce25]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 5: (clone()+0x6d) [0x7f138e01dbad]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: *** Caught signal (Aborted) **
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: in thread 7f1382673700 thread_name:bstore_mempool
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 1: (()+0xa618e1) [0x561dd4bee8e1]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 2: (()+0xf6d0) [0x7f138ef346d0]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 3: (gsignal()+0x37) [0x7f138df55277]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 4: (abort()+0x148) [0x7f138df56968]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x561dd4c2d924]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 6: (()+0x8fc754) [0x561dd4a89754]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 7: (BlueStore::MempoolThread::entry()+0x332) [0x561dd4a8dc72]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 8: (()+0x7e25) [0x7f138ef2ce25]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 9: (clone()+0x6d) [0x7f138e01dbad]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 2018-12-08 19:33:00.018742 7f1382673700 -1 *** Caught signal (Aborted) **
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: in thread 7f1382673700 thread_name:bstore_mempool
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 1: (()+0xa618e1) [0x561dd4bee8e1]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 2: (()+0xf6d0) [0x7f138ef346d0]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 3: (gsignal()+0x37) [0x7f138df55277]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 4: (abort()+0x148) [0x7f138df56968]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x561dd4c2d924]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 6: (()+0x8fc754) [0x561dd4a89754]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 7: (BlueStore::MempoolThread::entry()+0x332) [0x561dd4a8dc72]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 8: (()+0x7e25) [0x7f138ef2ce25]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 9: (clone()+0x6d) [0x7f138e01dbad]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 0> 2018-12-08 19:33:00.018742 7f1382673700 -1 *** Caught signal (Aborted) **
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: in thread 7f1382673700 thread_name:bstore_mempool
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: ceph version 12.2.10 (177915764b752804194937482a39e95e0ca3de94) luminous (stable)
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 1: (()+0xa618e1) [0x561dd4bee8e1]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 2: (()+0xf6d0) [0x7f138ef346d0]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 3: (gsignal()+0x37) [0x7f138df55277]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 4: (abort()+0x148) [0x7f138df56968]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x561dd4c2d924]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 6: (()+0x8fc754) [0x561dd4a89754]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 7: (BlueStore::MempoolThread::entry()+0x332) [0x561dd4a8dc72]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 8: (()+0x7e25) [0x7f138ef2ce25]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: 9: (clone()+0x6d) [0x7f138e01dbad]
Dec 08 19:33:00 ceph-osd0 ceph-osd[171040]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
</pre>
<p>Same here. It would be nice if changes like this came with at least approximate expectations for the <em>new</em> default settings. My OSD hosts used 50-53% of RAM with the default BlueStore cache size (1GB). Now I had to set osd_memory_target to 2GB just to get the OSDs to start, so that I can observe what the memory consumption will be.</p>
RADOS - Bug #37507: osd_memory_target: failed assert when options mismatch https://tracker.ceph.com/issues/37507?journal_id=125873 2018-12-10T22:37:13Z Greg Farnum gfarnum@redhat.com
<ul><li><strong>Assignee</strong> set to <i>Mark Nelson</i></li></ul><p>Thoughts, Mark?</p>
RADOS - Bug #37507: osd_memory_target: failed assert when options mismatch https://tracker.ceph.com/issues/37507?journal_id=125874 2018-12-10T23:18:02Z Mark Nelson mark.a.nelson@gmail.com
<p>Hi Folks,</p>
<p>I'm guessing this is related to <a class="external" href="https://github.com/ceph/ceph/pull/25421">https://github.com/ceph/ceph/pull/25421</a>. Basically, it's a stupid uint64_t bug on my part: the target size wraps around to a huge value when osd_memory_target is set too low. My goal was not to restrict the minimum size, since ideally the autotuner will just set the bluestore cache size to osd_memory_cache_min and leave it at that.</p>
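<p>A minimal sketch of the kind of wraparound described above, with unsigned 64-bit arithmetic emulated in Python. This is not the BlueStore code: <code>uint64_sub</code> and the sample values are hypothetical, and they only illustrate how a slightly negative difference turns into a huge positive number.</p>

```python
UINT64_MAX = (1 << 64) - 1


def uint64_sub(a: int, b: int) -> int:
    """Emulate uint64_t subtraction: the result wraps modulo 2**64."""
    return (a - b) & UINT64_MAX


# Hypothetical sizes in bytes, chosen only for illustration.
available = 134217727  # one byte short of the minimum below
minimum = 134217728    # 128 MiB

# Mathematically this is -1, but an unsigned type cannot hold that,
# so the value wraps to the largest 64-bit number.
headroom = uint64_sub(available, minimum)
print(headroom == UINT64_MAX)
```

<p>A signed type, or clamping before the subtraction, would avoid the wrap; which approach the actual fix takes is not shown here.</p>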
<p>Can you verify that you don't see the crash if you set osd_memory_target to something a little larger? I.e., make sure this expression is true:</p>
<p>(((1.0 - osd_memory_expected_fragmentation) * osd_memory_target) - osd_memory_base) &gt; osd_memory_cache_min</p>
<p>I.e., for the default settings (osd_memory_cache_min = 134217728, osd_memory_base = 805306368, osd_memory_expected_fragmentation = 0.15):</p>
<p>osd_memory_target = (134217728 + 805306368) / 0.85 ≈ 1105322465.9, rounded up to 1105322466</p>
<p>If that fixes it then it's most likely the same issue as 25421.</p>
<p>Edit: I verified that setting osd_memory_target to 1105322466 avoids the assert on one of our test boxes. Alternatively, you can avoid the assert by tweaking osd_memory_base and/or osd_memory_expected_fragmentation. We'll issue a fix to master tomorrow.</p>
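<p>The expression and the default-value arithmetic above can be sketched as follows. This is a rough check, not Ceph code: the helper names <code>expression_holds</code> and <code>min_memory_target</code> are made up for illustration, and the defaults are the ones implied by the numbers quoted in this comment. Exact fractions are used so that floating-point rounding cannot blur the boundary.</p>

```python
import math
from fractions import Fraction

# Defaults implied by the figures in the comment (bytes).
OSD_MEMORY_CACHE_MIN = 134217728   # 128 MiB
OSD_MEMORY_BASE = 805306368        # 768 MiB
OSD_MEMORY_EXPECTED_FRAGMENTATION = Fraction(15, 100)


def expression_holds(target,
                     base=OSD_MEMORY_BASE,
                     cache_min=OSD_MEMORY_CACHE_MIN,
                     fragmentation=OSD_MEMORY_EXPECTED_FRAGMENTATION):
    """True if ((1 - fragmentation) * target) - base > cache_min."""
    return (1 - fragmentation) * target - base > cache_min


def min_memory_target(base=OSD_MEMORY_BASE,
                      cache_min=OSD_MEMORY_CACHE_MIN,
                      fragmentation=OSD_MEMORY_EXPECTED_FRAGMENTATION):
    """Smallest integer target (bytes) for which the expression holds."""
    threshold = Fraction(base + cache_min) / (1 - fragmentation)
    # Strict inequality: take the next integer above the threshold.
    return math.floor(threshold) + 1


print(min_memory_target())
print(expression_holds(1105322466), expression_holds(1105322465))
```

<p>Passing a smaller <code>base</code> or <code>fragmentation</code> to <code>min_memory_target</code> shows how tweaking those options lowers the smallest usable target.</p>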
<p>Mark</p>
RADOS - Bug #37507: osd_memory_target: failed assert when options mismatch https://tracker.ceph.com/issues/37507?journal_id=125901 2018-12-11T15:08:19Z Dan van der Ster
<p>Hi Mark,</p>
<p>You got it: 1105322466 boots, and 1105322465 crashes with the above trace.</p>
<p>Cheers, Dan</p>
RADOS - Bug #37507: osd_memory_target: failed assert when options mismatch https://tracker.ceph.com/issues/37507?journal_id=126240 2018-12-14T23:36:30Z Neha Ojha nojha@redhat.com
<ul><li><strong>Backport</strong> set to <i>luminous,mimic</i></li></ul>
RADOS - Bug #37507: osd_memory_target: failed assert when options mismatch https://tracker.ceph.com/issues/37507?journal_id=126241 2018-12-14T23:49:58Z Greg Farnum gfarnum@redhat.com
<ul><li><strong>Subject</strong> changed from <i>osd_memory_target: enforce or at least document a min usable value</i> to <i>osd_memory_target: failed assert when options mismatch</i></li></ul>
RADOS - Bug #37507: osd_memory_target: failed assert when options mismatch https://tracker.ceph.com/issues/37507?journal_id=126272 2018-12-17T14:44:55Z Yuri Weinstein yweinste@redhat.com
<p>merged <a class="external" href="https://github.com/ceph/ceph/pull/25421">https://github.com/ceph/ceph/pull/25421</a></p>
RADOS - Bug #37507: osd_memory_target: failed assert when options mismatch https://tracker.ceph.com/issues/37507?journal_id=126292 2018-12-18T00:39:18Z xie xingguo 258156334@qq.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Pending Backport</i></li></ul>
RADOS - Bug #37507: osd_memory_target: failed assert when options mismatch https://tracker.ceph.com/issues/37507?journal_id=126330 2018-12-18T11:10:46Z Nathan Cutler ncutler@suse.cz
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-4 priority-default closed" href="/issues/37697">Backport #37697</a>: luminous: osd_memory_target: failed assert when options mismatch</i> added</li></ul>
RADOS - Bug #37507: osd_memory_target: failed assert when options mismatch https://tracker.ceph.com/issues/37507?journal_id=126332 2018-12-18T11:10:53Z Nathan Cutler ncutler@suse.cz
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-4 priority-default closed" href="/issues/37698">Backport #37698</a>: mimic: osd_memory_target: failed assert when options mismatch</i> added</li></ul>
RADOS - Bug #37507: osd_memory_target: failed assert when options mismatch https://tracker.ceph.com/issues/37507?journal_id=128190 2019-01-28T14:55:19Z Nathan Cutler ncutler@suse.cz
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>15</i></li></ul>
RADOS - Bug #37507: osd_memory_target: failed assert when options mismatch https://tracker.ceph.com/issues/37507?journal_id=128346 2019-01-30T13:00:26Z Nathan Cutler ncutler@suse.cz
<ul><li><strong>Status</strong> changed from <i>15</i> to <i>Resolved</i></li></ul>