Bug #23387

Building Ceph on armhf fails due to out-of-memory

Added by Daniel Glaser about 6 years ago. Updated almost 5 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

I'm currently struggling to build Ceph through make-debs.sh on armhf (specifically the ODROID HC2). Everything works slowly, but like a charm, until it gets to building ceph_dencoder.cc. This consumes a really huge amount of RAM. I have 2 GB of RAM and some 20 GB more of swap, just to make sure it does not suffer from low RAM. The problem, I believe, is that it stops when virtual memory exceeds 3 GB, which is what 32-bit ARM can address for applications (kernel split at 3 + 1 GB).

I tried to save some memory by disabling "-g" for this compilation unit and the build finished, but I assume that is not an acceptable way to work around this problem.

Could anybody suggest a solution to this? It will also impede other 32-bit builds.

I currently run the armbian version of Debian 9 (stretch) with gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1).

The checkout from git is based on following commit:
commit 31928fea79076a73b873d9491947bdf28f418327
Merge: eb3c67fe7d ffaf1428bd
Author: Josh Durgin <>
Date: Mon Mar 12 14:12:10 2018 -0700

ceph-compile-errors.txt View (5.72 KB) Louwrentius Louwrentius, 10/29/2018 08:04 PM

History

#1 Updated by Daniel Glaser about 6 years ago

Forgot to mention the exact place it breaks:

virtual memory exhausted: Cannot allocate memory
src/CMakeFiles/ceph-dencoder.dir/build.make:66: recipe for target 'src/CMakeFiles/ceph-dencoder.dir/test/encoding/ceph_dencoder.cc.o' failed
make[4]: *** [src/CMakeFiles/ceph-dencoder.dir/test/encoding/ceph_dencoder.cc.o] Error 1
make[4]: Leaving directory '/tmp/release/Debian/WORKDIR/ceph-12.2.4-38-g31928fea79/obj-arm-linux-gnueabihf'
CMakeFiles/Makefile2:1217: recipe for target 'src/CMakeFiles/ceph-dencoder.dir/all' failed
make[3]: *** [src/CMakeFiles/ceph-dencoder.dir/all] Error 2
make[3]: Leaving directory '/tmp/release/Debian/WORKDIR/ceph-12.2.4-38-g31928fea79/obj-arm-linux-gnueabihf'
Makefile:141: recipe for target 'all' failed
make[2]: *** [all] Error 2
make[2]: Leaving directory '/tmp/release/Debian/WORKDIR/ceph-12.2.4-38-g31928fea79/obj-arm-linux-gnueabihf'
    cd /tmp/release/Debian/WORKDIR/ceph-12.2.4-38-g31928fea79
dh_auto_build: make -j1 returned exit code 2
debian/rules:40: recipe for target 'override_dh_auto_build' failed
make[1]: *** [override_dh_auto_build] Error 2
make[1]: Leaving directory '/tmp/release/Debian/WORKDIR/ceph-12.2.4-38-g31928fea79'
debian/rules:33: recipe for target 'build' failed
make: *** [build] Error 2

The log of the vmem column of /proc/PID/stat shows the following:

datetime Ymd_HMS|cc1plus-PID:VMEM (MiB)
20180316_103028 |  20934 :     33M
20180316_103033 |  20934 :    147M
20180316_103038 |  20934 :    225M
20180316_103043 |  20934 :    341M
20180316_103048 |  20980 :    108M
20180316_103053 |  20980 :    213M
20180316_103058 |  20980 :    332M
20180316_103104 |  20980 :    442M
...
20180316_110258 |  20980 :   3006M
20180316_110303 |  20980 :   3014M
20180316_110308 |  20980 :   3022M
20180316_110313 |  20980 :   3028M
20180316_110318 |  20980 :   3036M
20180316_110323 |  20980 :   3042M
20180316_110328 |  20980 :   3048M
20180316_110333 |  20980 :   3054M
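A log like the one above could be produced with something along these lines. This is only a minimal sketch, not the script that was actually used: the function names `vsize_mib_from_stat` and `log_cc1plus_vsize` are made up for illustration, and it assumes a Linux /proc where field 23 of /proc/PID/stat is the virtual memory size in bytes.

```shell
#!/bin/sh
# Hypothetical helper: convert one /proc/PID/stat line into the
# process's virtual memory size in MiB.
vsize_mib_from_stat() {
    # Drop "pid (comm) " so a command name containing spaces cannot shift fields.
    rest=${1##*\) }
    # Word-split the remaining fields into positional parameters.
    set -- $rest
    # Field 23 of the full line (vsize, in bytes) is now the 21st parameter.
    echo $(( ${21} / 1048576 ))
}

# Hypothetical helper: every INTERVAL seconds (default 5), print
# "timestamp | pid : sizeM" for each running cc1plus process.
log_cc1plus_vsize() {
    interval=${1:-5}
    while :; do
        for pid in $(pgrep -x cc1plus); do
            # The process may exit between pgrep and cat; skip it in that case.
            if stat_line=$(cat "/proc/$pid/stat" 2>/dev/null); then
                printf '%s | %6s : %6sM\n' \
                    "$(date +%Y%m%d_%H%M%S)" "$pid" \
                    "$(vsize_mib_from_stat "$stat_line")"
            fi
        done
        sleep "$interval"
    done
}
```

Running `log_cc1plus_vsize 5 > vmem.log` in a second terminal during the build would then record how close the compiler gets to the 3 GB limit.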

#2 Updated by Daniel Glaser over 5 years ago

I found a way around this problem (it is not directly a solution): by using Clang/LLVM instead of the GCC toolchain, I managed to compile Ceph.

Here is my (partly finished) bash script for building Ceph on ARM:

Building LLVM/Clang

$ cd $HOME
$ mkdir git && cd git
$ git clone https://github.com/llvm-mirror/llvm.git
$ cd llvm
$ git checkout stable
$ cd tools
$ git clone https://github.com/llvm-mirror/clang.git
$ git clone https://github.com/llvm-mirror/lld.git
$ cd /tmp
$ mkdir llvm-build && cd llvm-build
$ cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Release "-DLLVM_TARGETS_TO_BUILD=ARM;X86;AArch64;Mips" $HOME/git/llvm
$ make -j4
$ make install
$ update-alternatives --install /usr/bin/cc cc /usr/local/bin/clang 100
$ update-alternatives --install /usr/bin/c++ c++ /usr/local/bin/clang++ 100
$ update-alternatives --install /usr/bin/cpp cpp /usr/local/bin/clang-cpp 100

Building Ceph:

$ cd $HOME/git
$ git clone https://github.com/the78mole/ceph.git
$ cd ceph
$ git checkout luminous # Only if you do not want to build master
$ git reset --hard
$ git clean -dxf
$ git submodule update --init --recursive
$ ./install-deps.sh # Ignore errors on ca-certificates-java
$ sed -i 's/\(WITH_CEPHFS_JAVA\)=ON/\1=OFF/' debian/rules
$ rm debian/libcephfs-java.jlib debian/libcephfs-jni.install
$ sed <delete last line with jni from debian/cephfs-test>
$ git add .
$ git commit -m "Disabled Java for Debian package build" 
$ ./make-debs.sh

I'm not sure whether my trace is complete and fully functional, and some pieces are still pseudo-code, but it should give a hint on how to compile it. I would be happy to receive feedback on it.
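The pseudo-code sed step (deleting the last line that mentions jni from a packaging file) could be filled in roughly as follows. This is a sketch under assumptions: `delete_last_match` is a hypothetical helper name, and the file to pass in is whichever file the pseudo-code step refers to.

```shell
#!/bin/sh
# Hypothetical helper: delete the last line of FILE that matches PATTERN.
# Does nothing if no line matches.
delete_last_match() {
    pattern=$1
    file=$2
    # Line number of the last matching line (empty if there is none).
    n=$(grep -n -e "$pattern" "$file" | tail -n 1 | cut -d: -f1)
    [ -n "$n" ] || return 0
    # Write everything except that line to a temporary file, then copy it
    # back so the original file keeps its permissions.
    sed "${n}d" "$file" > "$file.tmp" &&
        cat "$file.tmp" > "$file" &&
        rm -f "$file.tmp"
}
```

It would be called as `delete_last_match jni <file>` from the top of the Ceph checkout, before the `git add .` step.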

#3 Updated by Louwrentius Louwrentius over 5 years ago

Hello!

I've used the instructions created by Daniel Glaser, and with some small code adjustments in a few files I was able to compile Ceph 12.2.8 on a Raspberry Pi 3+. Compiling Ceph on this device, including the creation of packages, takes at least 8-12 hours, and I'm using an external USB SSD for storage and swap...

The most important issues I faced were related to differences in behaviour with LLVM/Clang.

I was able to move my monitors from some VMs running in my test environment to 3 Raspberry Pi 3+ devices, and they work fine.

I'd like to share these adjustments here; maybe they can be incorporated into the code base if they make sense.
Anyone who wants to compile Ceph for themselves can make these adjustments on their own.

If for some reason the build fails with an error after you run ./make-debs.sh, you can restart it with

dpkg-buildpackage -j4 -us -us -nc (not sure if that double -us is required)

If you do plan to build this code on the Raspberry Pi, you need an external USB SSD and must create either a swap file or a swap partition of at least 2 GB on it, because 1 GB of memory is not enough to compile Ceph. Also make sure you cool your Raspberry Pi properly...
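For the swap file, a sketch of the usual steps is shown below. The helper `make_swap_cmds` is hypothetical and only prints the commands, so they can be reviewed before being run as root; the path and size are placeholders to adapt to your SSD mount point.

```shell
#!/bin/sh
# Hypothetical helper: print the commands that would create and enable
# a swap file FILE of SIZE_MIB mebibytes. Nothing is executed here.
make_swap_cmds() {
    file=$1
    size_mib=$2
    # Allocate the file with real blocks (dd rather than a sparse file).
    printf 'dd if=/dev/zero of=%s bs=1M count=%s\n' "$file" "$size_mib"
    # Swap files should be readable by root only.
    printf 'chmod 600 %s\n' "$file"
    # Write the swap signature, then enable the file as swap.
    printf 'mkswap %s\n' "$file"
    printf 'swapon %s\n' "$file"
}
```

For example, `make_swap_cmds /mnt/ssd/swapfile 2048 | sudo sh` would set up a 2 GB swap file, assuming the SSD is mounted at /mnt/ssd.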

=== FileStore.cc
296c296
<   r = ::open((*path)->path(), flags|O_CLOEXEC, 0644);
---
>   r = ::open((*path)->path(), flags, 0644);

764c764
< FileStoreBackend *FileStoreBackend::create(unsigned long f_type, FileStore *fs)
---
> FileStoreBackend *FileStoreBackend::create(long f_type, FileStore *fs)

784c784
< void FileStore::create_backend(unsigned long f_type)
---
> void FileStore::create_backend(long f_type)

=== FileStore.h
60c60
< #define BTRFS_SUPER_MAGIC 0x9123683EUL
---
> #define BTRFS_SUPER_MAGIC 0x9123683EL

168c168
<   void create_backend(unsigned long f_type);
---
>   void create_backend(long f_type);

790c790
<   unsigned long m_fs_type;
---
>   long m_fs_type;

870c870
<   static FileStoreBackend *create(unsigned long f_type, FileStore *fs);
---
>   static FileStoreBackend *create(long f_type, FileStore *fs);

=== rgw_rados.cc
3823c3825
<   auto handles = std::vector<librados::Rados>{static_cast<unsigned int>(cct->_conf->rgw_num_rados_handles)};
---
>   auto handles = std::vector<librados::Rados>{cct->_conf->rgw_num_rados_handles};

=== rgw_sync_log_trim.cc
896c896
<       wait(utime_t{static_cast<time_t>(config.trim_interval_sec), 0});
---
>       wait(utime_t{config.trim_interval_sec, 0});

I have attached a list of all the errors I encountered and fixed with the changes above, for anyone who is interested.

#4 Updated by Louwrentius Louwrentius over 5 years ago

The above changes are not entirely correct. This section needs to be omitted:

296c296
<   r = ::open((*path)->path(), flags|O_CLOEXEC, 0644);
---
>   r = ::open((*path)->path(), flags, 0644);

I have made the changes in the code and created a pull request on GitHub.
https://github.com/louwrentius/ceph/pull/1

#5 Updated by Kefu Chai almost 5 years ago

  • Project changed from devops to RADOS
  • Category deleted (chef)
  • Status changed from New to Resolved
  • Target version deleted (v12.2.5)
  • Pull request ID set to 25729

I am resolving this issue, as quite a few (probably all) of the issues noted by Louwrentius have been addressed by Daniel's fixes.
