Bug #17091

ceph-osd-prestart.sh fails confusingly when data directory does not exist

Added by Nathan Cutler about 1 year ago. Updated 4 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 08/22/2016
Due date:
% Done: 0%
Source: Community (dev)
Tags:
Backport: jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
Needs Doc: No

Description

David Disseldorp writes:

After rebooting a cluster node, I noticed the following errors in dmesg:

2016-08-03T13:39:47.919203-04:00 teuthida-1 ceph-osd-prestart.sh[41168]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-82/.’: No such file or directory
2016-08-03T13:39:47.919460-04:00 teuthida-1 ceph-osd-prestart.sh[41168]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.919826-04:00 teuthida-1 ceph-osd-prestart.sh[41177]: Error connecting to cluster: ObjectNotFound
2016-08-03T13:39:47.920449-04:00 teuthida-1 ceph-osd-prestart.sh[41189]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-77/.’: No such file or directory
2016-08-03T13:39:47.921214-04:00 teuthida-1 ceph-osd-prestart.sh[41189]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.921560-04:00 teuthida-1 ceph-osd-prestart.sh[41190]: 2016-08-03 13:39:47.921430 7f142b7db700 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-80/keyring: (2) No such file or directory
2016-08-03T13:39:47.921616-04:00 teuthida-1 ceph-osd-prestart.sh[41190]: 2016-08-03 13:39:47.921449 7f142b7db700 -1 monclient(hunting): ERROR: missing keyring, cannot use cephx for authentication
2016-08-03T13:39:47.921654-04:00 teuthida-1 ceph-osd-prestart.sh[41190]: 2016-08-03 13:39:47.921452 7f142b7db700  0 librados: osd.80 initialization error (2) No such file or directory
2016-08-03T13:39:47.923921-04:00 teuthida-1 ceph-osd-prestart.sh[41190]: Error connecting to cluster: ObjectNotFound
2016-08-03T13:39:47.928441-04:00 teuthida-1 ceph-osd-prestart.sh[41163]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-73/.’: No such file or directory
2016-08-03T13:39:47.928644-04:00 teuthida-1 ceph-osd-prestart.sh[41163]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.929762-04:00 teuthida-1 ceph-osd-prestart.sh[41167]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-83/.’: No such file or directory
2016-08-03T13:39:47.930063-04:00 teuthida-1 ceph-osd-prestart.sh[41167]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.931070-04:00 teuthida-1 ceph-osd-prestart.sh[41166]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-72/.’: No such file or directory
2016-08-03T13:39:47.931281-04:00 teuthida-1 ceph-osd-prestart.sh[41186]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-78/.’: No such file or directory
2016-08-03T13:39:47.931385-04:00 teuthida-1 ceph-osd-prestart.sh[41166]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.931566-04:00 teuthida-1 ceph-osd-prestart.sh[41186]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.936615-04:00 teuthida-1 ceph-osd-prestart.sh[41177]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-79/.’: No such file or directory
2016-08-03T13:39:47.936822-04:00 teuthida-1 ceph-osd-prestart.sh[41169]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-75/.’: No such file or directory
2016-08-03T13:39:47.936874-04:00 teuthida-1 ceph-osd-prestart.sh[41177]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.937066-04:00 teuthida-1 ceph-osd-prestart.sh[41169]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments
2016-08-03T13:39:47.941347-04:00 teuthida-1 ceph-osd-prestart.sh[41190]: stat: cannot stat ‘/var/lib/ceph/osd/ceph-80/.’: No such file or directory
2016-08-03T13:39:47.941562-04:00 teuthida-1 ceph-osd-prestart.sh[41190]: /usr/lib/ceph/ceph-osd-prestart.sh: line 57: [: too many arguments

These errors are thrown from:

 55 # ensure ownership is correct
 56 owner=`stat -c %U $data/.`
 57 if [ $owner != 'ceph' -a $owner != 'root' ]; then
 58     echo "ceph-osd data dir $data is not owned by 'ceph' or 'root'" 
 59     echo "you must 'chown -R ceph:ceph ...' or similar to fix ownership" 
 60     exit 1
 61 fi

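where $data corresponds to a non-existent directory. In that case stat fails and $owner is left empty, so the unquoted $owner expansions on line 57 disappear and [ is handed "!= 'ceph' -a != 'root'", which it rejects with "too many arguments".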

The script is invoked with an invalid osd id, due to a bunch of stale systemd services which haven't been cleaned up following osd removal:

:ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i
:ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i
...

Regardless of the incorrect config, this script should still fail gracefully, rather than throwing a syntax error.
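
A minimal sketch of what failing gracefully could look like, assuming it is enough to test for the directory first and to quote the expansions. This is not necessarily what the merged fix does, and the function name check_data_dir is hypothetical:

```shell
# Hypothetical helper (not the upstream patch): validate an OSD data
# directory before checking who owns it.
check_data_dir() {
    data="$1"

    # Fail early with a clear message when the directory is missing,
    # instead of letting stat fail and leave $owner empty.
    if [ ! -d "$data" ]; then
        echo "ceph-osd data dir $data does not exist" >&2
        return 1
    fi

    # Quote every expansion so an empty or multi-word value cannot
    # change the number of arguments passed to [.
    owner=$(stat -c %U "$data/.") || return 1
    if [ "$owner" != 'ceph' ] && [ "$owner" != 'root' ]; then
        echo "ceph-osd data dir $data is not owned by 'ceph' or 'root'" >&2
        echo "you must 'chown -R ceph:ceph ...' or similar to fix ownership" >&2
        return 1
    fi
    return 0
}
```

Called as check_data_dir "$data" || exit 1, a stale unit pointing at a removed OSD would log one readable message rather than a stat error followed by a shell syntax error.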


Related issues

Copied to devops - Backport #17094: jewel: ceph-osd-prestart.sh fails confusingly when data directory does not exist Resolved

History

#1 Updated by Nathan Cutler about 1 year ago

  • Status changed from New to Need Review
  • Backport set to jewel

#2 Updated by Nathan Cutler about 1 year ago

  • Status changed from Need Review to Pending Backport

#3 Updated by Nathan Cutler about 1 year ago

  • Copied to Backport #17094: jewel: ceph-osd-prestart.sh fails confusingly when data directory does not exist added

#4 Updated by Nathan Cutler 12 months ago

  • Status changed from Pending Backport to Resolved
  • Needs Doc set to No

#5 Updated by Dan Mick 4 months ago
