Project

General

Profile

Bug #18054

os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")

Added by Sage Weil 10 months ago. Updated 3 months ago.

Status:
Resolved
Priority:
Immediate
Assignee:
Category:
-
Target version:
-
Start date:
11/28/2016
Due date:
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
Needs Doc:
No
Component(RADOS):
BlueStore

Description

plenty of space, but bitmap allocator fails...

   -14> 2016-11-23 22:40:13.295934 7f6347a23700 10 bluestore(/var/lib/ceph/osd/ceph-0) _balance_bluefs_freespace bluefs 1766 M free (0.47912) bluestore 86624 M free (0.939933), bluefs_ratio 0.0199906
   -13> 2016-11-23 22:40:13.295947 7f6347a23700 10 bluestore(/var/lib/ceph/osd/ceph-0) _balance_bluefs_freespace bluefs_ratio 0.0199906 < min_ratio 0.02, should gift 1767 M
   -12> 2016-11-23 22:40:13.295950 7f6347a23700 10 bluestore(/var/lib/ceph/osd/ceph-0) _balance_bluefs_freespace gifting 1853751296 (1767 M)
   -11> 2016-11-23 22:40:13.295952 7f6347a23700 10 bitmapalloc:reserve instance 140065024436224 num_used 88573 total 1474560
   -10> 2016-11-23 22:40:13.295954 7f6347a23700 10 bitmapalloc:allocate instance 140065024436224 want_size 0x6e7e0000 alloc_unit 0x10000 hint 0x0
     0> 2016-11-23 22:40:13.297552 7f6347a23700 -1 /mnt/jenkins/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/11.0.2-1896-gf08f2d5/rpm/el7/BUILD/ceph-11.0.2-1896-gf08f2d5/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::_balance_bl
uefs_freespace(std::vector<bluestore_pextent_t>*)' thread 7f6347a23700 time 2016-11-23 22:40:13.295957
/mnt/jenkins/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/11.0.2-1896-gf08f2d5/rpm/el7/BUILD/ceph-11.0.2-1896-gf08f2d5/src/os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")

 ceph version 11.0.2-1896-gf08f2d5 (f08f2d580391855858361f9e6b24f3778b165d4e)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f63636b1c15]
 2: (BlueStore::_balance_bluefs_freespace(std::vector<bluestore_pextent_t, std::allocator<bluestore_pextent_t> >*)+0x1295) [0x7f6363417fe5]
 3: (BlueStore::_kv_sync_thread()+0x14f7) [0x7f63634297d7]
 4: (BlueStore::KVSyncThread::entry()+0xd) [0x7f636344c98d]

/a/sage-2016-11-23_20:05:59-rados-wip-sage-testing---basic-smithi/573166

History

#1 Updated by Sage Weil 10 months ago

/a/sage-2016-11-30_17:15:54-rados-wip-sage-testing---basic-smithi/589941

#2 Updated by Sage Weil 10 months ago

/a/sage-2016-12-03_19:33:05-rados-master---basic-smithi/599285

#3 Updated by Sage Weil 10 months ago

/a/sage-2016-12-03_19:34:03-rados-wip-sage-testing---basic-smithi/599494

#4 Updated by Sage Weil 10 months ago

  • Status changed from Verified to Need More Info

#5 Updated by Sage Weil 10 months ago

  • Assignee set to Sage Weil

#6 Updated by Ramesh Chander 10 months ago

In one of the core, We see here that gift is around 2GB and same is reserved in the allocator:

In the loop below:

3571 while (gift > 0) {
3572 uint64_t eoffset;
3573 uint32_t elength;
3574 r = alloc->allocate(gift, min_alloc_size, hint, &eoffset, &elength);
3575 if (r < 0) {
3576 derr << func << " allocate failed on 0x" << std::hex << gift
3577 << " min_alloc_size 0x" << min_alloc_size << std::dec << dendl;
3578 alloc->dump();
3579 assert(0 == "allocate failed, wtf");
3580 return r;
3581 }
3582
3583 bluestore_pextent_t e(eoffset, elength);
3584 dout(1) << func << " gifting " << e << " to bluefs" << dendl;
3585 extents->push_back(e);
3586 gift -= e.length;
3587 hint = e.end();
3588 }

It is using allocate api and that API returns success only if could allocate all contiguous space.

Since current max alloc size in bitmap allocator is 4MB because of zone size is 1024 (1024 * 4k), the allocation cannot be done and it fails.

The fix would be to split our allocation to bluefs_min_alloc_size or to at least zone size here.

something similar to :

https://github.com/ceph/ceph/pull/12355

#7 Updated by Orlando Moreno 10 months ago

I have also seen this error during my testing. During a 4KB random write test using libRBD FIO, OSDs start to fail. I see the following:

5> 2016-12-07 11:32:36.119346 7fcc493ff700  5 - op tracker -- seq: 1791611, time: 2016-12-07 11:32:36.119346, event: queued_for_pg, op: osd_repop(client.4537.0:283524 1.cf1)
4> 2016-12-07 11:32:36.119837 7fcc493ff700 5 - op tracker -- seq: 1791612, time: 2016-12-07 11:32:36.119837, event: queued_for_pg, op: osd_repop(client.4567.0:169290 1.11f1)
3> 2016-12-07 11:32:36.120067 7fcc47bff700 5 - op tracker -- seq: 1791613, time: 2016-12-07 11:32:36.120066, event: queued_for_pg, op: osd_repop(client.4533.0:195708 1.ffb)
2> 2016-12-07 11:32:36.120092 7fcc493ff700 5 - op tracker -- seq: 1791614, time: 2016-12-07 11:32:36.120092, event: queued_for_pg, op: osd_repop(client.4526.0:227905 1.f1f)
1> 2016-12-07 11:32:36.120149 7fcc48bfe700 5 - op tracker -- seq: 1791615, time: 2016-12-07 11:32:36.120149, event: queued_for_pg, op: osd_repop(client.4567.0:156042 1.845)
0> 2016-12-07 11:32:36.120322 7fcc313f1700 -1 ** Caught signal (Aborted) *
in thread 7fcc313f1700 thread_name:bstore_kv_sync
ceph version 11.0.2-2270-g71ef0cb (71ef0cbd8e573e98b37b18e2fd817d7334b90ff3)
1: (()+0x92098a) [0x7fcc4f5e198a]
2: (()+0xf100) [0x7fcc4c96d100]
3: (gsignal()+0x37) [0x7fcc4b78b5f7]
4: (abort()+0x148) [0x7fcc4b78cce8]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x267) [0x7fcc4f783577]
6: (BlueFS::_allocate(unsigned char, unsigned long, std::vector&lt;bluefs_extent_t, mempool::pool_allocator<(mempool::pool_index_t)8, bluefs_extent_t> >)+0x8a8) [0x7fcc4f595a28]
7: (BlueFS::_allocate(unsigned char, unsigned long, std::vector&lt;bluefs_extent_t, mempool::pool_allocator<(mempool::pool_index_t)8, bluefs_extent_t> >
)+0x569) [0x7fcc4f5956e9]
8: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0xbb9) [0x7fcc4f59b169]
9: (BlueFS::_flush(BlueFS::FileWriter*, bool)+0x198) [0x7fcc4f59c5a8]
10: (BlueFS::_fsync(BlueFS::FileWriter*, std::unique_lock&lt;std::mutex&gt;&)+0x82) [0x7fcc4f59f772]
11: (BlueRocksWritableFile::Sync()+0x62) [0x7fcc4f5b64d2]
12: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x149) [0x7fcc4f6c14e9]
13: (rocksdb::WritableFileWriter::Sync(bool)+0xb0) [0x7fcc4f6c2160]
14: (rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool)+0x1297) [0x7fcc4f60b217]
15: (rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x27) [0x7fcc4f60be67]
16: (RocksDBStore::submit_transaction_sync(std::shared_ptr&lt;KeyValueDB::TransactionImpl&gt;)+0xdd) [0x7fcc4f535aad]
17: (BlueStore::_kv_sync_thread()+0x1a1c) [0x7fcc4f4f97fc]
18: (BlueStore::KVSyncThread::entry()+0xd) [0x7fcc4f51c31d]
19: (()+0x7dc5) [0x7fcc4c965dc5]
20: (clone()+0x6d) [0x7fcc4b84cced]

I don't know if this error is related to the one reported, but I do see the reported error on OSDs trying to recover from OSDs that fail from this issue.

#8 Updated by David Zafman 9 months ago

teuthology:/a/dzafman-2016-12-13_08:27:18-rados-master-distro-basic-smithi/630584

2016-12-13 16:35:52.705128 7f283e7d7700  0 bitalloc:5:: 0x0
2016-12-13 16:35:52.705129 7f283e7d7700  0 bitalloc:6:: 0x0
2016-12-13 16:35:52.705129 7f283e7d7700  0 bitalloc:7:: 0x0
2016-12-13 16:35:52.705130 7f283e7d7700  0 bitalloc:8:: 0x0
2016-12-13 16:35:52.705132 7f283e7d7700  0 bitalloc:9:: 0x0
2016-12-13 16:35:52.705133 7f283e7d7700  0 bitalloc:10:: 0x0
2016-12-13 16:35:52.705124 7f283f7d9700 20 osd.0 22 handle_pg_stats_ack  timeout now 30
2016-12-13 16:35:52.705134 7f283e7d7700  0 bitalloc:11:: 0x0
2016-12-13 16:35:52.705135 7f283e7d7700  0 bitalloc:12:: 0x0
2016-12-13 16:35:52.705135 7f283e7d7700  0 bitalloc:13:: 0x0
2016-12-13 16:35:52.705136 7f283e7d7700  0 bitalloc:14:: 0x0
2016-12-13 16:35:52.705137 7f283e7d7700  0 bitalloc:15:: 0x0
2016-12-13 16:35:52.705138 7f283e7d7700  0 bitalloc:dump_state zone 1438 dump end
2016-12-13 16:35:52.705138 7f283e7d7700  0 bitalloc:dump_state zone 1439 dump start
2016-12-13 16:35:52.705139 7f283e7d7700  0 bitalloc:0:: 0x0
2016-12-13 16:35:52.705136 7f283f7d9700 25 osd.0 22  ack on 4.0 22:220
2016-12-13 16:35:52.705140 7f283e7d7700  0 bitalloc:1:: 0x0
2016-12-13 16:35:52.705141 7f283e7d7700  0 bitalloc:2:: 0x0
2016-12-13 16:35:52.705141 7f283e7d7700  0 bitalloc:3:: 0x0
2016-12-13 16:35:52.705142 7f283e7d7700  0 bitalloc:4:: 0x0
2016-12-13 16:35:52.705143 7f283e7d7700  0 bitalloc:5:: 0x0
2016-12-13 16:35:52.705144 7f283e7d7700  0 bitalloc:6:: 0x0
2016-12-13 16:35:52.705141 7f283f7d9700 25 osd.0 22  ack on 4.3 22:240
2016-12-13 16:35:52.705144 7f283e7d7700  0 bitalloc:7:: 0x0
2016-12-13 16:35:52.705145 7f283e7d7700  0 bitalloc:8:: 0x0
2016-12-13 16:35:52.705146 7f283e7d7700  0 bitalloc:9:: 0x0
2016-12-13 16:35:52.705146 7f283e7d7700  0 bitalloc:10:: 0x0
2016-12-13 16:35:52.705148 7f283e7d7700  0 bitalloc:11:: 0x0
2016-12-13 16:35:52.705148 7f283e7d7700  0 bitalloc:12:: 0x0
2016-12-13 16:35:52.705149 7f283e7d7700  0 bitalloc:13:: 0x0
2016-12-13 16:35:52.705146 7f283f7d9700 20 osd.0 22 handle_pg_stats_ack  still pending: 19
2016-12-13 16:35:52.705150 7f283e7d7700  0 bitalloc:14:: 0x0
2016-12-13 16:35:52.705151 7f283e7d7700  0 bitalloc:15:: 0x0
2016-12-13 16:35:52.705152 7f283e7d7700  0 bitalloc:dump_state zone 1439 dump end
2016-12-13 16:35:52.705153 7f283e7d7700  0 bitmapalloc:dump instance 139811443088256 committing: 0 extents
2016-12-13 16:35:52.705155 7f283e7d7700  0 bitmapalloc:dump instance 139811443088256 uncommitted: 0 extents
2016-12-13 16:35:52.706746 7f283e7d7700 -1 /build/ceph-11.0.2-2428-g0be3f5f/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::_balance_bluefs_freespace(std::vector<bluestore_pextent_t>*)' thread 7f283e7d7700 time 2016-12-13 16:35:52.705160
/build/ceph-11.0.2-2428-g0be3f5f/src/os/bluestore/BlueStore.cc: 3579: FAILED assert(0 == "allocate failed, wtf")

 ceph version 11.0.2-2428-g0be3f5f (0be3f5f72e169fad08dcb0240de45287b567bd49)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f28593d3efb]
 2: (BlueStore::_balance_bluefs_freespace(std::vector<bluestore_pextent_t, std::allocator<bluestore_pextent_t> >*)+0xea6) [0x7f285912f9a6]
 3: (BlueStore::_kv_sync_thread()+0x1529) [0x7f2859141339]
 4: (BlueStore::KVSyncThread::entry()+0xd) [0x7f28591636ed]
 5: (()+0x8184) [0x7f28579ca184]
 6: (clone()+0x6d) [0x7f285611e37d]

#9 Updated by David Zafman 9 months ago

  • Status changed from Need More Info to Verified

#10 Updated by Sage Weil 9 months ago

  • Status changed from Verified to Resolved

#11 Updated by Greg Farnum 3 months ago

  • Project changed from Ceph to RADOS
  • Category deleted (107)

#12 Updated by Greg Farnum 3 months ago

  • Component(RADOS) BlueStore added

Also available in: Atom PDF