Bug #55664
closed
cephadm: cephadm user/home removed during RPM upgrade
Description
The cephadm RPM package maintains the following files:
/var/lib/cephadm/.ssh/authorized_keys
However, upgrading the RPM on any non-SUSE platform removes the cephadm user and wipes its home directory. This deletes the SSH authorized_keys file and causes the node to appear offline in the orchestrator, e.g.:
# rpm -i cephadm-16.2.7-1.fc35.noarch.rpm
# ls -l /var/lib/cephadm/.ssh
-rw-------. 1 cephadm cephadm 0 May 15 15:10 authorized_keys
# id cephadm
uid=477(cephadm) gid=475(cephadm) groups=475(cephadm)
# rpm -U cephadm-16.2.7-3.fc35.noarch.rpm
userdel: cephadm mail spool (/var/spool/mail/cephadm) not found
userdel: error removing directory /var/lib/cephadm
# ls -l /var/lib/cephadm/.ssh
ls: cannot access '/var/lib/cephadm/.ssh': No such file or directory
# id cephadm
id: ‘cephadm’: no such user
The cause is that the %postun -n cephadm scriptlet does not check whether it is running for an upgrade rather than a removal. Here's a simple fix:
%if ! 0%{?suse_version}
%postun -n cephadm
-userdel -r cephadm || true
-exit 0
+[ $1 -ne 0 ] || userdel cephadm || :
%endif
I also removed the "-r" flag from userdel: if the key file has been modified, rpm retains it as a .rpmsave file; if not, rpm removes the /var/lib/cephadm directory itself.
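For reference, rpm passes %postun the number of instances of the package that will remain installed as $1: 0 on erase, 1 or more on upgrade. The dry-run sketch below mirrors the short-circuit shape of the guard in the fix. postun_decision is a hypothetical helper, not part of the spec file, and it only prints the action it would take.

```shell
#!/bin/sh
# Dry-run model of the patched %postun guard; no user is actually deleted.
# postun_decision is a hypothetical name used only for this illustration.
# $1 = number of package instances remaining after the rpm transaction.
postun_decision() {
    # Same shape as: [ $1 -ne 0 ] || userdel cephadm || :
    [ "$1" -ne 0 ] || { echo "userdel cephadm"; return 0; }
    echo "keep user"
}

postun_decision 1   # upgrade (rpm -U): prints "keep user"
postun_decision 0   # erase (rpm -e): prints "userdel cephadm"
```

The trailing "|| :" in the real scriptlet only forces a zero exit status, so a failing userdel cannot fail the transaction.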
I'll create a pull request for this. Note: since the bug is in %postun, the first upgrade to a package containing the fix will still run the old %postun script from the previously installed version and remove the user (that's unavoidable).
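The reason is that on upgrade rpm runs the %postun of the package being replaced, not the one being installed. A minimal dry-run sketch of that behaviour follows, with hypothetical function names standing in for the two scriptlet versions; the commands are printed, never executed.

```shell
#!/bin/sh
# Dry-run model: which %postun body runs on upgrade. Nothing is deleted.
# old_postun, new_postun and simulate_upgrade are hypothetical names.
old_postun() { echo "userdel -r cephadm"; }                 # unguarded, pre-fix
new_postun() { [ "$1" -ne 0 ] || echo "userdel cephadm"; }  # guarded, post-fix

# simulate_upgrade OUTGOING: run the outgoing package's %postun with $1 = 1,
# as rpm -U does for the version it replaces.
simulate_upgrade() {
    case $1 in
        old) old_postun 1 ;;
        new) new_postun 1 ;;
    esac
}

first=$(simulate_upgrade old)    # unfixed package replaced by fixed one
second=$(simulate_upgrade new)   # fixed package replaced by a later build
echo "first upgrade runs:  ${first:-nothing}"
echo "second upgrade runs: ${second:-nothing}"
```

Under this model only the second and later upgrades keep the user, which matches the note above.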
To work correctly (i.e. retain the SSH keys), the fix also requires the patch in bug #54530, which marks the key file as %config(noreplace).