Bug #17091
closed
ceph-osd-prestart.sh fails confusingly when data directory does not exist
% Done:
0%
Source:
Community (dev)
Tags:
Backport:
jewel
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
David Disseldorp writes:
After rebooting a cluster node, I noticed the following errors in dmesg:
2016-08-03T13:39:47.919203-04:00 teuthida-1 ceph-osd-prestart.sh[41168]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-82/.’: No such file or directory
2016-08-03T13:39:47.919460-04:00 teuthida-1 ceph-osd-prestart.sh[41168]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.919826-04:00 teuthida-1 ceph-osd-prestart.sh[41177]: Error connecting to cluster: ObjectNotFound
2016-08-03T13:39:47.920449-04:00 teuthida-1 ceph-osd-prestart.sh[41189]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-77/.’: No such file or directory
2016-08-03T13:39:47.921214-04:00 teuthida-1 ceph-osd-prestart.sh[41189]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.921560-04:00 teuthida-1 ceph-osd-prestart.sh[41190]: 2016-08-03 13:39:47.921430 7f142b7db700 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-80/keyring: (2) No such file or directory
2016-08-03T13:39:47.921616-04:00 teuthida-1 ceph-osd-prestart.sh[41190]: 2016-08-03 13:39:47.921449 7f142b7db700 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
2016-08-03T13:39:47.921654-04:00 teuthida-1 ceph-osd-prestart.sh[41190]: 2016-08-03 13:39:47.921452 7f142b7db700 0 librados: osd.80 initialization error (2) No such file or directory
2016-08-03T13:39:47.923921-04:00 teuthida-1 ceph-osd-prestart.sh[41190]: Error connecting to cluster: ObjectNotFound
2016-08-03T13:39:47.928441-04:00 teuthida-1 ceph-osd-prestart.sh[41163]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-73/.’: No such file or directory
2016-08-03T13:39:47.928644-04:00 teuthida-1 ceph-osd-prestart.sh[41163]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.929762-04:00 teuthida-1 ceph-osd-prestart.sh[41167]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-83/.’: No such file or directory
2016-08-03T13:39:47.930063-04:00 teuthida-1 ceph-osd-prestart.sh[41167]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.931070-04:00 teuthida-1 ceph-osd-prestart.sh[41166]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-72/.’: No such file or directory
2016-08-03T13:39:47.931281-04:00 teuthida-1 ceph-osd-prestart.sh[41186]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-78/.’: No such file or directory
2016-08-03T13:39:47.931385-04:00 teuthida-1 ceph-osd-prestart.sh[41166]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.931566-04:00 teuthida-1 ceph-osd-prestart.sh[41186]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.936615-04:00 teuthida-1 ceph-osd-prestart.sh[41177]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-79/.’: No such file or directory
2016-08-03T13:39:47.936822-04:00 teuthida-1 ceph-osd-prestart.sh[41169]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-75/.’: No such file or directory
2016-08-03T13:39:47.936874-04:00 teuthida-1 ceph-osd-prestart.sh[41177]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.937066-04:00 teuthida-1 ceph-osd-prestart.sh[41169]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.941347-04:00 teuthida-1 ceph-osd-prestart.sh[41190]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-80/.’: No such file or directory
2016-08-03T13:39:47.941562-04:00 teuthida-1 ceph-osd-prestart.sh[41190]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
These errors are thrown from:
55 # ensure ownership is correct
56 owner=`stat -c %U $data/.`
57 if [ $owner != 'ceph' -a $owner != 'root' ]; then
58     echo "ceph-osd data dir $data is not owned by 'ceph' or 'root'"
59     echo "you must 'chown -R ceph:ceph ...' or similar to fix ownership"
60     exit 1
61 fi
where $data corresponds to a non-existent directory.
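The mechanism behind the "too many arguments" message can be reproduced in isolation. The sketch below is illustrative only: the path /nonexistent/ceph-osd-demo is a made-up placeholder assumed not to exist, and the exact wording of the error varies between shells. When stat fails, $owner is empty, and because it is unquoted it disappears from the argument list, so the test receives `!= ceph -a != root` and cannot parse it. The test then returns a non-zero status, which `if` treats as false, so the `exit 1` branch is not taken.

```shell
#!/bin/sh
# Hypothetical path, assumed not to exist on the machine running this.
data=/nonexistent/ceph-osd-demo

# stat fails, so the command substitution yields an empty string.
owner=$(stat -c %U "$data/." 2>/dev/null) || true

# Unquoted, as on line 57 of the script: the empty $owner vanishes and
# the test is left with `!= ceph -a != root`, which is a usage error.
unquoted_status=0
[ $owner != 'ceph' -a $owner != 'root' ] 2>/dev/null || unquoted_status=$?

# Quoted: the empty string survives as an operand and the test parses.
quoted_status=0
{ [ "$owner" != 'ceph' ] && [ "$owner" != 'root' ]; } || quoted_status=$?

echo "unquoted test exit status: $unquoted_status"
echo "quoted test exit status:   $quoted_status"
```

Quoting alone makes the comparison well-formed, but the resulting message would still blame ownership rather than the missing directory.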
The script is invoked with an invalid osd id, due to a bunch of stale systemd services which haven't been cleaned up following osd removal:
/etc/systemd/system/ceph-osd.target.wants/ceph-osd@72.service:ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i
/etc/systemd/system/ceph-osd.target.wants/ceph-osd@73.service:ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i
...
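One way to spot such leftovers is to compare the instance id in each unit name against the OSD data directories. This is a rough sketch, not a Ceph tool: list_stale_osd_units is a hypothetical helper, and it assumes the cluster is named "ceph" so that data directories follow the ceph-<id> pattern seen in the log above.

```shell
#!/bin/sh
# list_stale_osd_units is a hypothetical helper, not shipped with Ceph.
# $1: directory holding ceph-osd@<id>.service entries
# $2: directory holding the OSD data dirs (assumes cluster name "ceph")
list_stale_osd_units() {
    wants_dir=${1:-/etc/systemd/system/ceph-osd.target.wants}
    osd_root=${2:-/var/lib/ceph/osd}

    for unit in "$wants_dir"/ceph-osd@*.service; do
        # Skip the literal pattern when the glob matches nothing.
        [ -e "$unit" ] || [ -L "$unit" ] || continue

        # Extract the instance id between '@' and '.service'.
        id=${unit##*@}
        id=${id%.service}

        # Report units whose data directory is missing.
        [ -d "$osd_root/ceph-$id" ] || echo "$unit"
    done
    return 0
}
```

Calling `list_stale_osd_units` with no arguments checks the default paths; the printed units are candidates for `systemctl disable`, after confirming the OSD really was removed.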
Regardless of the incorrect config, this script should still fail gracefully, rather than throwing a syntax error.
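A graceful failure could look roughly like the following. This is a sketch under assumptions, not the fix that was merged: check_osd_data_dir is a hypothetical function name, and the "does not exist" message is invented for illustration. It checks for the directory first, quotes every expansion, and replaces the `-a` operator with two separate tests.

```shell
#!/bin/sh
# check_osd_data_dir is a hypothetical helper, not part of the shipped script.
# Returns 0 if the data dir exists and is owned by ceph or root, 1 otherwise.
check_osd_data_dir() {
    data=$1

    # Fail early with a clear message if the directory is missing.
    if [ ! -d "$data" ]; then
        echo "ceph-osd data dir $data does not exist" >&2
        return 1
    fi

    # stat -c is the GNU coreutils form, as used by the original script.
    owner=$(stat -c %U "$data/.") || return 1

    # ensure ownership is correct
    if [ "$owner" != 'ceph' ] && [ "$owner" != 'root' ]; then
        echo "ceph-osd data dir $data is not owned by 'ceph' or 'root'" >&2
        echo "you must 'chown -R ceph:ceph ...' or similar to fix ownership" >&2
        return 1
    fi
    return 0
}
```

In the prestart script this would be used as `check_osd_data_dir "$data" || exit 1`, so a stale unit produces a single explanatory line instead of a stat error followed by a test usage error.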