Ceph : Issues
https://tracker.ceph.com/
2023-06-06T10:51:16Z, Ceph, Redmine
Ceph - Bug #61598 (New): gcc-14: FTBFS "error: call to non-'constexpr' function 'virtual unsigned...
https://tracker.ceph.com/issues/61598
2023-06-06T10:51:16Z, Tim Serong, tserong@suse.com
<p>gcc 14 has introduced a change which results in ceph build failures:</p>
<pre>
[ 270s] /home/abuild/rpmbuild/BUILD/ceph-18.0.0-4135-g87cd54281c8/src/osd/osd_types.h: In lambda function:
[ 270s] /home/abuild/rpmbuild/BUILD/ceph-18.0.0-4135-g87cd54281c8/src/common/dout.h:184:73: error: call to non-'constexpr' function 'virtual unsigned int DoutPrefixProvider::get_subsys() const'
[ 270s] 184 | dout_impl(pdpp->get_cct(), ceph::dout::need_dynamic(pdpp->get_subsys()), v) \
[ 270s] | ~~~~~~~~~~~~~~~~^~
[ 270s] /home/abuild/rpmbuild/BUILD/ceph-18.0.0-4135-g87cd54281c8/src/common/dout.h:155:58: note: in definition of macro 'dout_impl'
[ 270s] 155 | return (cctX->_conf->subsys.template should_gather<sub, v>()); \
[ 270s] | ^~~
[ 270s] /home/abuild/rpmbuild/BUILD/ceph-18.0.0-4135-g87cd54281c8/src/osd/osd_types.h:3618:3: note: in expansion of macro 'ldpp_dout'
[ 270s] 3618 | ldpp_dout(dpp, 10) << "build_prior all_probe " << all_probe << dendl;
[ 270s] | ^~~~~~~~~
[ 270s] /home/abuild/rpmbuild/BUILD/ceph-18.0.0-4135-g87cd54281c8/src/common/dout.h:51:20: note: 'virtual unsigned int DoutPrefixProvider::get_subsys() const' declared here
[ 270s] 51 | virtual unsigned get_subsys() const = 0;
[ 270s] | ^~~~~~~~~~
</pre>
<p>The gcc change is described at <a class="external" href="https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617196.html">https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617196.html</a>.</p>
<p>The ceph FTBFS was mentioned in a followup post at <a class="external" href="https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618384.html">https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618384.html</a>, and apparently this failure is now expected, as <code> DoutPrefixProvider::get_subsys()</code> isn't declared <code>constexpr</code> but really should be.</p>
<p>I tried to fix this experimentally by simply declaring <code>constexpr get_subsys()</code>, e.g.:</p>
<pre>
diff --git a/src/common/dout.h b/src/common/dout.h
index a1375fbb910..6e91750708a 100644
--- a/src/common/dout.h
+++ b/src/common/dout.h
@@ -61,7 +61,7 @@ class NoDoutPrefix : public DoutPrefixProvider {
std::ostream& gen_prefix(std::ostream& out) const override { return out; }
CephContext *get_cct() const override { return cct; }
- unsigned get_subsys() const override { return subsys; }
+ constexpr unsigned get_subsys() const override { return subsys; }
};
// a prefix provider with static (const char*) prefix
@@ -88,7 +88,7 @@ class DoutPrefixPipe : public DoutPrefixProvider {
return out;
}
CephContext *get_cct() const override { return dpp.get_cct(); }
- unsigned get_subsys() const override { return dpp.get_subsys(); }
+ constexpr unsigned get_subsys() const override { return dpp.get_subsys(); }
virtual void add_prefix(std::ostream& out) const = 0;
};
</pre>
<p>...but that has some problems:</p>
<p>1) Instead of an outright build failure, I get <code>warning: virtual functions cannot be 'constexpr' before C++20 [-Winvalid-constexpr]</code>. I imagine this is undesirable.<br />2) Even if 1 <em>is</em> desirable, there are plenty of other subclasses of <code>DoutPrefixProvider</code> which would all <em>also</em> need to have their <code>get_subsys()</code> methods declared <code>constexpr</code> for the build to complete.</p>
<p>TBH the whole <code>dout</code> thing is black magic to me, so I could really use some assistance with how best to fix this.</p>
Ceph - Bug #58501 (Resolved): ceph.spec.in: need to replace SUSE usrmerged macro with version check
https://tracker.ceph.com/issues/58501
2023-01-19T07:23:05Z, Tim Serong, tserong@suse.com
<p><a class="external" href="https://github.com/ceph/ceph/commit/e4c4a4ce97fff8a5b4efa747d9cffeabcceedd25">https://github.com/ceph/ceph/commit/e4c4a4ce97fff8a5b4efa747d9cffeabcceedd25</a> introduced the use of the <code>usrmerged</code> macro on SUSE distros to guard against installing the /sbin/mount.ceph symlink. This macro has since been deprecated and should be replaced with a version check instead (<code>%if 0%{?suse_version} < 1550</code>). See <a class="external" href="https://en.opensuse.org/openSUSE:Usr_merge">https://en.opensuse.org/openSUSE:Usr_merge</a> for more details.</p>
Ceph - Bug #57967 (Resolved): ceph-crash service should run as unprivileged user, not root (CVE-2...
https://tracker.ceph.com/issues/57967
2022-11-03T05:11:53Z, Tim Serong, tserong@suse.com
<p>As reported at <a class="external" href="https://www.openwall.com/lists/oss-security/2022/10/25/1">https://www.openwall.com/lists/oss-security/2022/10/25/1</a>, ceph-crash runs as root, which makes it vulnerable to a potential ceph user to root privilege escalation. This is fixable by making the ceph-crash process drop privileges and run as the ceph user, just as the other ceph daemons do.</p>
Ceph - Bug #57893 (Pending Backport): make-dist creates ceph.spec with incorrect Release tag for ...
https://tracker.ceph.com/issues/57893
2022-10-19T08:04:36Z, Tim Serong, tserong@suse.com
<p><code>ceph.spec.in</code> says:</p>
<pre>
Name: ceph
Version: @PROJECT_VERSION@
Release: @RPM_RELEASE@%{?dist}
%if 0%{?fedora} || 0%{?rhel}
Epoch: 2
%endif
</pre>
<p>When the <code>make-dist</code> script generates the final <code>ceph.spec</code> file for RPM builds, it will set PROJECT_VERSION to the version from the latest tag (e.g.: 17.0.0), and set RPM_RELEASE to the number of additional commits plus the last commit hash (e.g.: 14883.gc49b81c7d61). This doesn't work properly when building in SUSE's Open Build Service, because OBS overwrites the Release tag with checkin and build counters (see <a class="external" href="https://en.opensuse.org/openSUSE:Package_versioning_guidelines">https://en.opensuse.org/openSUSE:Package_versioning_guidelines</a>).</p>
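<p>To make the two layouts concrete, here is a small Python sketch. It is not the <code>make-dist</code> script itself (which is a shell script); it assumes the input is a <code>git describe --long</code> style string such as <code>v17.0.0-14883-gc49b81c7d61</code>, and the function names are hypothetical. It shows the Version/Release split described above, and the single-field form discussed in the following paragraph.</p>

```python
import re
from typing import Tuple

# Assumed input shape: "v<version>-<commits since tag>-g<short hash>"
_DESCRIBE_RE = re.compile(r"v?(\d+(?:\.\d+)*)-(\d+)-(g[0-9a-f]+)")


def split_describe(describe: str) -> Tuple[str, str]:
    """Return (version, release) in the split form: version from the tag,
    release as '<commits>.<hash>'."""
    match = _DESCRIBE_RE.fullmatch(describe)
    if match is None:
        raise ValueError("unexpected describe string: %r" % describe)
    version, commits, sha = match.groups()
    return version, "%s.%s" % (commits, sha)


def combined_version(describe: str) -> str:
    """Fold everything into one Version string, leaving the Release tag
    free for the build service to overwrite."""
    version, release = split_describe(describe)
    commits, sha = release.split(".", 1)
    return "%s.%s+%s" % (version, commits, sha)
```

<p>With the example values from this report, the split form gives <code>17.0.0</code> and <code>14883.gc49b81c7d61</code>, while the combined form gives <code>17.0.0.14883+gc49b81c7d61</code>.</p>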
<p>We've long carried a downstream patch for <code>make-dist</code> to fix this, by putting everything in PROJECT_VERSION, so you end up with something like <code>Version: 17.0.0.14883+gc49b81c7d61</code> (see <a class="external" href="https://github.com/SUSE/ceph/commit/9ee636cdca3">https://github.com/SUSE/ceph/commit/9ee636cdca3</a>), so I figure I should really submit that upstream.</p>
Ceph - Bug #57860 (Pending Backport): disable system_pmdk on s390x for SUSE distros
https://tracker.ceph.com/issues/57860
2022-10-13T04:28:31Z, Tim Serong, tserong@suse.com
<p>Same as <a class="external" href="https://tracker.ceph.com/issues/56491">https://tracker.ceph.com/issues/56491</a> which addressed RHEL and Fedora not shipping libpmem on s390x, but for SUSE.</p>
Ceph - Bug #57497 (Pending Backport): openSUSE Leap 15.x needs to explicitly specify gcc-11
https://tracker.ceph.com/issues/57497
2022-09-12T01:06:36Z, Tim Serong, tserong@suse.com
<p>This is a recurrence of <a class="external" href="https://tracker.ceph.com/issues/55237">https://tracker.ceph.com/issues/55237</a>. I wrote <a class="external" href="https://github.com/ceph/ceph/commit/80949babab4">https://github.com/ceph/ceph/commit/80949babab4</a> to use gcc-c++ >= 11 on SUSE distros, which works fine on Tumbleweed (our latest and greatest), but doesn't work on openSUSE Leap 15, which has gcc 11, but not packaged in a way in which that nice neat >= requirement works. So I need to reinstate part of <a class="external" href="https://github.com/ceph/ceph/pull/45845/commits/8ab5d7eea07">https://github.com/ceph/ceph/pull/45845/commits/8ab5d7eea07</a>.</p>
Ceph - Bug #57390 (Pending Backport): denc-mod-osd.so: undefined symbol _ZN4ceph25ErasureCodePlug...
https://tracker.ceph.com/issues/57390
2022-09-02T08:42:22Z, Tim Serong, tserong@suse.com
<p>When running <code>ceph-dencoder</code> on openSUSE Tumbleweed (built with GCC 12 and LTO, in case that's relevant), I get the following failure:</p>
<pre>
# ceph-dencoder
failed to dlopen("/usr/lib64/ceph/denc/denc-mod-osd.so"): /usr/lib64/ceph/denc/denc-mod-osd.so: undefined symbol: _ZN4ceph25ErasureCodePluginRegistry9singletonE
-h for help
</pre>
<p>This is fixable by adding "erasure_code" to denc-mod-osd's target_link_libraries.</p>
Ceph - Bug #56658 (Resolved): build: cephfs-shell fails to build/install with python setuptools >...
https://tracker.ceph.com/issues/56658
2022-07-21T07:56:14Z, Tim Serong, tserong@suse.com
<p>python setuptools v61 changed package discovery so that if it finds what it thinks are multiple top-level packages in a directory, it will fail to build. This was introduced by <a class="external" href="https://github.com/pypa/setuptools/pull/3177">https://github.com/pypa/setuptools/pull/3177</a>, and causes the ceph RPM build to fail with:</p>
<pre>
...
[ 9562s] error: Multiple top-level packages discovered in a flat-layout: ['top', 'CMakeFiles'].
[ 9562s]
[ 9562s] To avoid accidental inclusion of unwanted files or directories,
[ 9562s] setuptools will not proceed with this build.
[ 9562s]
[ 9562s] If you are trying to create a single distribution with multiple packages
[ 9562s] on purpose, you should not rely on automatic discovery.
[ 9562s] Instead, consider the following options:
[ 9562s]
[ 9562s] 1. set up custom discovery (`find` directive with `include` or `exclude`)
[ 9562s] 2. use a `src-layout`
[ 9562s] 3. explicitly set `py_modules` or `packages` with a list of names
[ 9562s]
[ 9562s] To find more information, look for "package discovery" on setuptools docs.
...
[ 9833s] RPM build errors:
[ 9833s] File not found: /home/abuild/rpmbuild/BUILDROOT/ceph-16.2.9.158+gd93952c7eea-2.3.x86_64/usr/lib/python3.10/site-packages/cephfs_shell-*.egg-info
[ 9833s] File not found: /home/abuild/rpmbuild/BUILDROOT/ceph-16.2.9.158+gd93952c7eea-2.3.x86_64/usr/bin/cephfs-shell
</pre>
<p>This has been fixed in Fedora downstream by moving src/tools/cephfs/cephfs-shell to a separate subdirectory (see <a class="external" href="https://src.fedoraproject.org/rpms/ceph/blob/rawhide/f/0021-cephfs-shell.patch">https://src.fedoraproject.org/rpms/ceph/blob/rawhide/f/0021-cephfs-shell.patch</a>). I've confirmed this approach also works for openSUSE.</p>
Ceph - Bug #56466 (Resolved): pacific: boost 1.73.0 is incompatible with python 3.10
https://tracker.ceph.com/issues/56466
2022-07-05T05:54:10Z, Tim Serong, tserong@suse.com
<p>Ceph pacific includes boost 1.73.0, which uses the <code>_Py_fopen()</code> function, which is no longer available in python 3.10. This means it's not possible to build ceph pacific RPMs against python 3.10. Builds will fail with:</p>
<pre>[ 182s] libs/python/src/exec.cpp: In function 'boost::python::api::object boost::python::exec_file(const char*, api::object, api::object)':
[ 182s] libs/python/src/exec.cpp:109:14: error: '_Py_fopen' was not declared in this scope; did you mean '_Py_wfopen'?
[ 182s] 109 | FILE *fs = _Py_fopen(f, "r");
[ 182s] | ^~~~~~~~~
[ 182s] | _Py_wfopen
</pre>
<p>This is not a problem with quincy or newer, as those use boost 1.75.0, which includes a patch that switches to using fopen() for python versions >= 3.1.</p>
Ceph - Bug #55237 (Resolved): rpm: openSUSE build fails - needs explicit gcc version, also can't ...
https://tracker.ceph.com/issues/55237
2022-04-08T06:23:06Z, Tim Serong, tserong@suse.com
<p>Two issues here which are strictly speaking unrelated, but I thought it'd be less annoying to just fix the openSUSE build with one bug.</p>
<p>Issue 1: openSUSE Leap 15.3 and 15.4 use gcc 7 by default, which is not new enough to build ceph. Both distros do provide gcc 11, but we have to explicitly request that version if we want to use it.</p>
<p>Issue 2: Parquet, which in turn requires Arrow, can't currently be built for openSUSE. The problem here is that we don't have those dependencies packaged as RPMs, and when trying to build Arrow out of the submodule in the ceph source tree, one of its dependencies (xsimd) tries to download source from the internet, which doesn't work in the openSUSE Build Service (build workers have no internet access).</p>
Ceph - Bug #55087 (Resolved): rpm: openSUSE needs libthrift-devel, not thrift-devel
https://tracker.ceph.com/issues/55087
2022-03-28T09:42:57Z, Tim Serong, tserong@suse.com
<p>In <a class="external" href="https://github.com/ceph/ceph/pull/38783">https://github.com/ceph/ceph/pull/38783</a>, <a class="external" href="https://github.com/ceph/ceph/pull/38783/commits/80e82686eba">https://github.com/ceph/ceph/pull/38783/commits/80e82686eba</a> added "thrift-devel >= 0.13.0" as a BuildRequires. On SUSE distros, this package is named libthrift-devel, so we need an <code>%if 0%{?suse_version}</code> block around that one.</p>
Ceph - Bug #37503 (Resolved): Audit log: mgr module passwords set on CLI written as plaintext in ...
https://tracker.ceph.com/issues/37503
2018-12-03T10:57:04Z, Tim Serong, tserong@suse.com
<p>A number of mgr modules need passwords set for one reason or another, either to authenticate with external systems (deepsea, influx, diskprediction), or to define credentials for users of those modules (dashboard, restful).</p>
<p>In all cases, these passwords are set from the command line, either via module-specific commands (<code>ceph dashboard ac-user-create</code>, <code>deepsea config-set salt_api_password</code>, etc.) or via <code>ceph config set</code> with some particular key (e.g.: mgr/influx/password).</p>
<p>All module-specific commands go through <code>DaemonServer::_handle_command()</code>, which then logs the command via <code>audit_clog->debug()</code> (or <code>audit_clog->info()</code> in case of access denied). This all ends up written to <code>/var/log/ceph/ceph-mgr.$ID.log</code>, which is world-readable, e.g.:</p>
<pre>
2018-12-03 10:45:28.864 7f67e7f8f700 0 log_channel(audit) log [DBG] : from='client.343880 172.16.1.254:39896/3560370796' entity='client.admin' cmd=[{"prefix": "deepsea config-set", "key": "salt_api_password", "value": "foo", "target": ["mgr", ""]}]: dispatch
</pre>
<p>Additionally, anything that results in a "config set" lands in the mon log, e.g.:</p>
<pre>
2018-12-03 10:45:28.881552 [INF] from='mgr.295252 172.16.1.21:56636/175641' entity='mgr.data1' cmd='[{"prefix":"config set","who":"mgr","name":"mgr/deepsea/salt_api_password","value":"foo"}]': finished
</pre>
<p>This also appears in the Audit log in the Dashboard.</p>
<p>Some things that land in the mon log probably don't matter; for any module that hashes passwords before saving them, only the hashed password should land in the mon log. But there's still the problem of the CLI commands in the mgr log, and in any case, modules that need to authenticate with external services will need to store plaintext passwords.</p>
<p>ISTM we need to either never log these things, or somehow keep the command logging, but filter the passwords out, so it renders the value as "*****" instead of the actual password.</p>
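<p>The filtering option can be sketched roughly as follows. This is Python purely for illustration (the real logging path is C++), and it assumes the command arrives as a JSON list of objects shaped like the two log excerpts above. The function name and the list of "looks like a password" hints are hypothetical, and a substring match like this would certainly need tuning.</p>

```python
import json

# Hypothetical hints for spotting sensitive option names; not a Ceph list.
SENSITIVE_HINTS = ("password", "passwd", "secret")


def _looks_sensitive(text):
    lowered = text.lower()
    return any(hint in lowered for hint in SENSITIVE_HINTS)


def redact_command(cmd_json, mask="*****"):
    """Parse a logged command string, mask likely passwords, re-serialize."""
    commands = json.loads(cmd_json)
    for cmd in commands:
        if not isinstance(cmd, dict):
            continue
        # Case 1: the option name is carried in "key" or "name" and the
        # secret itself sits in "value" (as in the excerpts above).
        option = "%s %s" % (cmd.get("key", ""), cmd.get("name", ""))
        if "value" in cmd and _looks_sensitive(option):
            cmd["value"] = mask
        # Case 2: a field whose own name looks sensitive.
        for field in cmd:
            if _looks_sensitive(field):
                cmd[field] = mask
    return json.dumps(commands)
```

<p>Fed the <code>deepsea config-set</code> command from the mgr log above, this keeps the prefix and key but replaces <code>"foo"</code> with <code>"*****"</code>.</p>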
<p>I'm not sure how best to approach this, given the way command logging is structured. At the point commands are logged, the commands themselves are just strings. Admittedly, they're strings of JSON, but they're effectively opaque at that point - we'd have to parse the JSON, then look for things that might be passwords, blank them out, and turn the whole lot back into a string. Yuck.</p>
Ceph - Bug #35906 (Resolved): ceph-disk: is_mounted() returns None for mounted OSDs with Python 3
https://tracker.ceph.com/issues/35906
2018-09-10T11:10:49Z, Tim Serong, tserong@suse.com
<p><code>ceph-disk list --format=json</code> on python 3 gives null for the mount member, even for mounted OSDs, e.g.:</p>
<pre>
# ceph-disk list --format=json|json_pp
...
{
"path" : "/dev/vdg",
"partitions" : [
{
"whoami" : "23",
"is_partition" : true,
"path" : "/dev/vdg1",
"ceph_fsid" : "00296336-7bf2-43f1-a48c-24c7212bf478",
"dmcrypt" : {},
"uuid" : "b447f027-f116-47d0-9cd1-ca2348e8e3db",
"block_uuid" : "dfaf6613-f958-497a-9dfb-ad343e897639",
"block_dev" : "/dev/vdg2",
"type" : "data",
*"mount" : null,*
"ptype" : "4fbd7e29-9d25-41b8-afd0-062c0ceff05d",
"magic" : "ceph osd volume v026",
"cluster" : "ceph",
"state" : "prepared",
"fs_type" : "xfs"
},
{
"type" : "block",
"is_partition" : true,
"path" : "/dev/vdg2",
"ptype" : "cafecafe-9b03-4f30-b4c6-b4b80ceff106",
"block_for" : "/dev/vdg1",
"dmcrypt" : {},
"uuid" : "dfaf6613-f958-497a-9dfb-ad343e897639"
}
]
}
...
</pre>
Ceph - Bug #18163 (Resolved): platform.linux_distribution() is deprecated; stop using it
https://tracker.ceph.com/issues/18163
2016-12-07T04:10:56Z, Tim Serong, tserong@suse.com
<p>platform.linux_distribution() is deprecated, so we should stop using it. Notably it uses /etc/SuSE-release on SUSE systems, and the latest SUSE versions don't ship this file; instead they ship /etc/os-release, which platform.linux_distribution() doesn't know about, so it returns ('','','').</p>
<p>AFAICT, platform.linux_distribution() is currently used by ceph-detect-init, which in turn is used by ceph-disk. If ceph-detect-init can't determine the distro because it sees ('','',''), this results in ceph-disk always tagging the init system as sysvinit.</p>
<p>There are also platform.linux_distribution() invocations in qa/workunits/ceph-disk/ceph-disk-no-lockbox and src/ceph-disk/ceph_disk/main.py, but they look like dead code to me.</p>
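<p>As a rough illustration of reading /etc/os-release directly instead, here is a minimal sketch. The function names are hypothetical, and mapping <code>NAME</code>, <code>VERSION_ID</code> and <code>ID</code> onto a three-element tuple is an assumption made for this example, not what ceph-detect-init actually does.</p>

```python
import shlex


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # Values may be shell-quoted; shlex strips the quotes.
        parts = shlex.split(raw)
        info[key] = parts[0] if parts else ""
    return info


def linux_distribution(path="/etc/os-release"):
    """Return (name, version, id); empty strings if the file is unreadable.

    The field mapping here is an assumption for illustration only.
    """
    try:
        with open(path, encoding="utf-8") as handle:
            info = parse_os_release(handle.read())
    except OSError:
        return ("", "", "")
    return (info.get("NAME", ""), info.get("VERSION_ID", ""), info.get("ID", ""))
```

<p>A caller could try this first and fall back to whatever default it already uses when all three fields come back empty.</p>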
<p>See also bug <a class="issue tracker-1 status-3 priority-4 priority-default closed" title="Bug: platform.linux_distribution() is deprecated; stop using it (Resolved)" href="https://tracker.ceph.com/issues/18141">#18141</a></p>
Ceph - Bug #14864 (Resolved): ceph-detect-init requires python-setuptools at runtime
https://tracker.ceph.com/issues/14864
2016-02-25T14:58:49Z, Tim Serong, tserong@suse.com
<p>Testing a reasonably recent ceph-10.0.2 on openSUSE Leap 42.1, my OSDs weren't mounting. I tracked this back to /usr/lib/systemd/system/ceph-disk@.service which invokes <code>flock /var/lock/ceph-disk /usr/sbin/ceph-disk --verbose --log-stdout trigger --sync %f</code>. This in turn results in:</p>
<pre>
ceph-disk: main_trigger: Namespace(dev='/dev/sdb1', func=<function main_trigger at 0x7fa6ebf6b050>, log_stdout=True, prepend_to_path='/usr/bin', prog='ceph-disk', statedir='/var/lib/ceph', sync=True, sysconfdir='/etc/ceph', verbose=True)
ceph-disk: Running command: /sbin/init --version
/sbin/init: unrecognized option '--version'
ceph-disk: get_dm_uuid /dev/sdb1 uuid path is /sys/dev/block/8:17/dm/uuid
ceph-disk: Running command: /usr/sbin/sgdisk -i 1 /dev/sdb
ceph-disk: get_dm_uuid /dev/sdb1 uuid path is /sys/dev/block/8:17/dm/uuid
ceph-disk: Running command: /usr/sbin/sgdisk -i 1 /dev/sdb
ceph-disk: trigger /dev/sdb1 parttype 4fbd7e29-9d25-41b8-afd0-062c0ceff05d uuid 93b72ed5-7d84-4b0b-a227-330fcd22513e
ceph-disk: Running command: /usr/sbin/ceph-disk activate /dev/sdb1
Traceback (most recent call last):
File "/usr/bin/ceph-detect-init", line 5, in <module>
from pkg_resources import load_entry_point
ImportError: No module named pkg_resources
ERROR:ceph-disk:Failed to activate
Traceback (most recent call last):
File "/usr/sbin/ceph-disk", line 4036, in <module>
main(sys.argv[1:])
File "/usr/sbin/ceph-disk", line 3992, in main
main_catch(args.func, args)
File "/usr/sbin/ceph-disk", line 4014, in main_catch
func(args)
File "/usr/sbin/ceph-disk", line 2530, in main_activate
reactivate=args.reactivate,
File "/usr/sbin/ceph-disk", line 2296, in mount_activate
(osd_id, cluster) = activate(path, activate_key_template, init)
File "/usr/sbin/ceph-disk", line 2477, in activate
init = init_get()
File "/usr/sbin/ceph-disk", line 799, in init_get
'--default', 'sysvinit',
File "/usr/sbin/ceph-disk", line 902, in _check_output
raise error
subprocess.CalledProcessError: Command '/usr/bin/ceph-detect-init' returned non-zero exit status 1
</pre>
<p>The important part is:</p>
<pre>
Traceback (most recent call last):
File "/usr/bin/ceph-detect-init", line 5, in <module>
from pkg_resources import load_entry_point
ImportError: No module named pkg_resources
</pre>
<p>This is fixable by installing python-setuptools, suggesting that package needs to be added to the RPM Requires and, I assume, the Debian Depends.</p>