Bug #57750
cephadm fails to upgrade systems not running sudo
Description
cephadm fails to upgrade systems not running sudo.
It appears to have started with this commit:
https://github.com/ceph/ceph/commit/4c84e71f4350e9fd7c12944491b99eff02e8dbb2
The quick and dirty fix was, of course, to install sudo on all hosts in the cluster.
Error output from 'ceph -W cephadm' during upgrade is as follows:
2022-09-29T08:41:49.590515+0200 mgr.censored.pvswyw [ERR] executing refresh((['censored', 'censored', 'censored', 'censored', 'censored', 'censored', 'censored', 'censored', 'censored'],)) failed.
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 143, in _execute_command
    r = await conn.run('sudo true', check=True, timeout=5)
  File "/lib/python3.6/site-packages/asyncssh/connection.py", line 3637, in run
    return await process.wait(check, timeout)
  File "/lib/python3.6/site-packages/asyncssh/process.py", line 1257, in wait
    self.returncode, stdout_data, stderr_data)
asyncssh.process.ProcessError: Process exited with non-zero exit status 127

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/utils.py", line 78, in do_work
    return f(*arg)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 266, in refresh
    self._write_client_files(client_files, host)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1070, in _write_client_files
    self.mgr.ssh.check_execute_command(host, cmd)
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 196, in check_execute_command
    return self.mgr.wait_async(self._check_execute_command(host, cmd, stdin, addr))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 590, in wait_async
    return self.event_loop.get_result(coro)
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 48, in get_result
    return asyncio.run_coroutine_threadsafe(coro, self._loop).result()
  File "/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 183, in _check_execute_command
    out, err, code = await self._execute_command(host, cmd, stdin, addr)
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 151, in _execute_command
    raise OrchestratorError(f'Unable to reach remote host {host}. {str(e)}')
orchestrator._interface.OrchestratorError: Unable to reach remote host censored. Process exited with non-zero exit status 127
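A note on the exit status in the traceback: POSIX shells report status 127 when a command cannot be found, which is consistent with sudo not being installed on the remote host. Below is a minimal local sketch of that behavior, assuming a POSIX 'sh' is available. The helper name 'probe' and the made-up command name are illustrative only, and nothing here uses asyncssh or cephadm code.

```python
import subprocess


def probe(command_line: str) -> int:
    """Run a command line through sh and return its exit status."""
    result = subprocess.run(
        ["sh", "-c", command_line],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


# A command that does not exist stands in for 'sudo' on a host without sudo.
# The name is invented for this sketch.
missing_status = probe("no-such-command-for-bug-57750 true")

# 'true' exists on any POSIX system and exits successfully.
ok_status = probe("true")

print(missing_status, ok_status)
```

So the "Unable to reach remote host" message is probably misleading here: the host was reachable over SSH, but the 'sudo true' probe itself could not be executed.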
Updated by Redouane Kachach Elhichou over 1 year ago
- Status changed from New to Need More Info
I'd say that's the expected behavior. The user you use with cephadm needs passwordless sudo access on all the hosts that form the cluster:
https://docs.ceph.com/en/latest/cephadm/install/#further-information-about-cephadm-bootstrap
Updated by Marcus Nordenberg over 1 year ago
The documentation does not state that one needs sudo at all. It's an option, so one cannot assume that everyone takes this route. Using the root user directly is a valid choice.
The manual clearly shows a bootstrap using the root user directly, no sudo involved.
This particular cluster was on a Debian OS which has a long history of not using sudo by default.
Updated by Adam King over 1 year ago
- Status changed from Need More Info to In Progress
- Assignee set to Adam King
This is a legit bug. However, I think it should have been fixed by https://github.com/ceph/ceph/pull/47898, which appears not to have made the 17.2.4 release.
Updated by Adam King over 1 year ago
That is, assuming this wasn't with a custom (non-root) SSH user. In that case sudo would be required, as cephadm needs sudo privileges to run its commands.
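The distinction drawn here can be sketched as follows. This is a hypothetical illustration only, not the code from the linked pull request or from cephadm's ssh.py. The function name build_remote_command and its parameters are assumptions made for the example.

```python
from typing import List


def build_remote_command(ssh_user: str, cmd: List[str]) -> List[str]:
    """Return the command to run remotely (hypothetical helper).

    root already has the privileges cephadm needs, so nothing is prefixed.
    Any other SSH user is assumed to have passwordless sudo.
    """
    if ssh_user == "root":
        return list(cmd)
    return ["sudo"] + list(cmd)


# With the root user, no sudo binary is needed on the remote host.
root_cmd = build_remote_command("root", ["true"])

# With a custom user (the name "deployer" is made up), sudo is required.
custom_cmd = build_remote_command("deployer", ["true"])

print(root_cmd, custom_cmd)
```

Under this scheme a root-only cluster, such as the Debian one in this report, would never invoke sudo at all.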
Updated by Marcus Nordenberg over 1 year ago
I can confirm that this was with the root user, no custom user involved.
Upgrades prior to 17.x.x worked like a charm.
Thanks for the feedback!