Bug #21941
mgr systemd startup issue
Status:
Rejected
Priority:
Low
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
1) I am running a test that uses systemd to start the daemons instead of teuthology's old daemon-helper approach.
cluster config:
2017-10-26T22:40:10.438 INFO:teuthology.task.internal:roles: ubuntu@ovh031.front.sepia.ceph.com - ['mon.a', 'mds.a', 'mgr.x', 'osd.0', 'osd.1']
2017-10-26T22:40:10.438 INFO:teuthology.task.internal:roles: ubuntu@ovh001.front.sepia.ceph.com - ['mon.b', 'mgr.y', 'osd.2', 'osd.3']
2017-10-26T22:40:10.439 INFO:teuthology.task.internal:roles: ubuntu@ovh092.front.sepia.ceph.com - ['mds.b', 'osd.4', 'osd.5']
2017-10-26T22:40:10.439 INFO:teuthology.task.internal:roles: ubuntu@ovh013.front.sepia.ceph.com - ['mon.c', 'mds.s']
2017-10-26T22:40:10.439 INFO:teuthology.task.internal:roles: ubuntu@ovh077.front.sepia.ceph.com - ['client.0']
2) The ceph code here starts all the daemons, and each one's status is shown as loaded/running:
2017-10-26T23:04:29.661 INFO:tasks.ceph.mon.a:Restarting daemon using systemd
2017-10-26T23:04:29.661 INFO:teuthology.orchestra.run.ovh031:Running: 'ps -ef | grep "ceph-mon.*--id a" | grep -v grep | awk {\'print $2\'}'
2017-10-26T23:04:29.823 INFO:tasks.ceph.mon.a:starting a non-running daemon
2017-10-26T23:04:29.823 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo systemctl start ceph-mon@a'
2017-10-26T23:04:30.072 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo systemctl show ceph-mon@a | grep -i state'
2017-10-26T23:04:30.201 INFO:teuthology.orchestra.run.ovh031.stdout:LoadState=loaded
2017-10-26T23:04:30.201 INFO:teuthology.orchestra.run.ovh031.stdout:ActiveState=active
2017-10-26T23:04:30.201 INFO:teuthology.orchestra.run.ovh031.stdout:SubState=running
2017-10-26T23:04:30.201 INFO:teuthology.orchestra.run.ovh031.stdout:UnitFileState=disabled
2017-10-26T23:04:30.202 INFO:teuthology.orchestra.run.ovh031.stdout:StateChangeTimestamp=Thu 2017-10-26 23:04:29 UTC
2017-10-26T23:04:30.202 INFO:teuthology.orchestra.run.ovh031.stdout:StateChangeTimestampMonotonic=1285209053
2017-10-26T23:04:30.202 INFO:teuthology.orchestra.run.ovh001:Running: 'which systemctl'
2017-10-26T23:04:30.310 INFO:teuthology.orchestra.run.ovh001.stdout:/bin/systemctl
2017-10-26T23:04:30.311 INFO:tasks.ceph.mon.b:Restarting daemon using systemd
2017-10-26T23:04:30.311 INFO:teuthology.orchestra.run.ovh001:Running: 'ps -ef | grep "ceph-mon.*--id b" | grep -v grep | awk {\'print $2\'}'
2017-10-26T23:04:30.502 INFO:tasks.ceph.mon.b:starting a non-running daemon
2017-10-26T23:04:30.503 INFO:teuthology.orchestra.run.ovh001:Running: 'sudo systemctl start ceph-mon@b'
2017-10-26T23:04:30.802 INFO:teuthology.orchestra.run.ovh001:Running: 'sudo systemctl show ceph-mon@b | grep -i state'
2017-10-26T23:04:30.974 INFO:teuthology.orchestra.run.ovh001.stdout:LoadState=loaded
2017-10-26T23:04:30.975 INFO:teuthology.orchestra.run.ovh001.stdout:ActiveState=active
2017-10-26T23:04:30.975 INFO:teuthology.orchestra.run.ovh001.stdout:SubState=running
2017-10-26T23:04:30.975 INFO:teuthology.orchestra.run.ovh001.stdout:UnitFileState=disabled
2017-10-26T23:04:30.975 INFO:teuthology.orchestra.run.ovh001.stdout:StateChangeTimestamp=Thu 2017-10-26 23:04:30 UTC
2017-10-26T23:04:30.975 INFO:teuthology.orchestra.run.ovh001.stdout:StateChangeTimestampMonotonic=1245133931
2017-10-26T23:04:30.976 INFO:teuthology.orchestra.run.ovh013:Running: 'which systemctl'
2017-10-26T23:04:31.082 INFO:teuthology.orchestra.run.ovh013.stdout:/bin/systemctl
2017-10-26T23:04:31.083 INFO:tasks.ceph.mon.c:Restarting daemon using systemd
2017-10-26T23:04:31.083 INFO:teuthology.orchestra.run.ovh013:Running: 'ps -ef | grep "ceph-mon.*--id c" | grep -v grep | awk {\'print $2\'}'
2017-10-26T23:04:31.261 INFO:tasks.ceph.mon.c:starting a non-running daemon
2017-10-26T23:04:31.262 INFO:teuthology.orchestra.run.ovh013:Running: 'sudo systemctl start ceph-mon@c'
2017-10-26T23:04:31.360 INFO:teuthology.orchestra.run.ovh001:Running: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
2017-10-26T23:04:31.462 INFO:teuthology.orchestra.run.ovh013:Running: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
2017-10-26T23:04:31.508 INFO:teuthology.orchestra.run.ovh013:Running: 'sudo systemctl show ceph-mon@c | grep -i state'
2017-10-26T23:04:31.584 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
2017-10-26T23:04:31.649 INFO:teuthology.orchestra.run.ovh013.stdout:LoadState=loaded
2017-10-26T23:04:31.650 INFO:teuthology.orchestra.run.ovh013.stdout:ActiveState=active
2017-10-26T23:04:31.650 INFO:teuthology.orchestra.run.ovh013.stdout:SubState=running
2017-10-26T23:04:31.650 INFO:teuthology.orchestra.run.ovh013.stdout:UnitFileState=disabled
2017-10-26T23:04:31.650 INFO:teuthology.orchestra.run.ovh013.stdout:StateChangeTimestamp=Thu 2017-10-26 23:04:31 UTC
2017-10-26T23:04:31.650 INFO:teuthology.orchestra.run.ovh013.stdout:StateChangeTimestampMonotonic=1265005766
2017-10-26T23:04:31.650 INFO:tasks.ceph:Starting mgr daemons in cluster ceph...
2017-10-26T23:04:31.651 INFO:tasks.ceph.mgr.x:Restarting daemon using systemd
2017-10-26T23:04:31.651 INFO:teuthology.orchestra.run.ovh031:Running: 'ps -ef | grep "ceph-mgr.*--id x" | grep -v grep | awk {\'print $2\'}'
2017-10-26T23:04:31.685 INFO:teuthology.orchestra.run.ovh077:Running: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
2017-10-26T23:04:31.760 INFO:teuthology.orchestra.run.ovh092:Running: 'sudo logrotate /etc/logrotate.d/ceph-test.conf'
2017-10-26T23:04:31.851 INFO:tasks.ceph.mgr.x:starting a non-running daemon
2017-10-26T23:04:31.852 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo systemctl start ceph-mgr@x'
2017-10-26T23:04:32.018 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo systemctl show ceph-mgr@x | grep -i state'
2017-10-26T23:04:32.190 INFO:teuthology.orchestra.run.ovh031.stdout:LoadState=loaded
2017-10-26T23:04:32.190 INFO:teuthology.orchestra.run.ovh031.stdout:ActiveState=active
2017-10-26T23:04:32.190 INFO:teuthology.orchestra.run.ovh031.stdout:SubState=running
2017-10-26T23:04:32.190 INFO:teuthology.orchestra.run.ovh031.stdout:UnitFileState=disabled
2017-10-26T23:04:32.190 INFO:teuthology.orchestra.run.ovh031.stdout:StateChangeTimestamp=Thu 2017-10-26 23:04:32 UTC
2017-10-26T23:04:32.190 INFO:teuthology.orchestra.run.ovh031.stdout:StateChangeTimestampMonotonic=1287403608
2017-10-26T23:04:32.191 INFO:tasks.ceph.mgr.y:Restarting daemon using systemd
2017-10-26T23:04:32.191 INFO:teuthology.orchestra.run.ovh001:Running: 'ps -ef | grep "ceph-mgr.*--id y" | grep -v grep | awk {\'print $2\'}'
2017-10-26T23:04:32.296 INFO:tasks.ceph.mgr.y:starting a non-running daemon
2017-10-26T23:04:32.296 INFO:teuthology.orchestra.run.ovh001:Running: 'sudo systemctl start ceph-mgr@y'
2017-10-26T23:04:32.457 INFO:teuthology.orchestra.run.ovh001:Running: 'sudo systemctl show ceph-mgr@y | grep -i state'
2017-10-26T23:04:32.621 INFO:teuthology.orchestra.run.ovh001.stdout:LoadState=loaded
2017-10-26T23:04:32.622 INFO:teuthology.orchestra.run.ovh001.stdout:ActiveState=active
2017-10-26T23:04:32.622 INFO:teuthology.orchestra.run.ovh001.stdout:SubState=running
2017-10-26T23:04:32.622 INFO:teuthology.orchestra.run.ovh001.stdout:UnitFileState=disabled
2017-10-26T23:04:32.622 INFO:teuthology.orchestra.run.ovh001.stdout:StateChangeTimestamp=Thu 2017-10-26 23:04:32 UTC
2017-10-26T23:04:32.622 INFO:teuthology.orchestra.run.ovh001.stdout:StateChangeTimestampMonotonic=1247047407
2017-10-26T23:04:32.622 INFO:tasks.ceph:Setting crush tunables to default
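The check above (start the unit, then parse `systemctl show <unit> | grep -i state`) can be sketched as a small helper. This is my own sketch of the logic, not the actual ceph.py code:

```python
def parse_systemctl_state(output):
    """Parse `systemctl show <unit> | grep -i state` output into a dict."""
    state = {}
    for line in output.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            state[key] = value
    return state


def unit_is_running(state):
    """Return True when systemd reports the unit as active/running."""
    return (state.get("ActiveState") == "active"
            and state.get("SubState") == "running")
```

Note that this only reflects systemd's view at the instant of the query: a daemon can report active/running at startup and still die immediately afterwards, which is exactly what makes the mgr failure below invisible at this stage.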
3) In the logs above the mgr status is shown as running, but querying the cluster shows no active mgr:
2017-10-26T23:05:11.332 INFO:tasks.ceph.ceph_manager.ceph:waiting for mgr available
2017-10-26T23:05:11.333 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph mgr dump --format=json'
2017-10-26T23:05:11.781 INFO:teuthology.orchestra.run.ovh031.stdout:
2017-10-26T23:05:11.781 INFO:teuthology.orchestra.run.ovh031.stdout:{"epoch":8,"active_gid":0,"active_name":"","active_addr":"-","available":false,"standbys":[],"modules":["restful","status"],"available_modules":[]}
2017-10-26T23:05:14.782 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph mgr dump --format=json'
2017-10-26T23:05:15.220 INFO:teuthology.orchestra.run.ovh031.stdout:
2017-10-26T23:05:15.220 INFO:teuthology.orchestra.run.ovh031.stdout:{"epoch":8,"active_gid":0,"active_name":"","active_addr":"-","available":false,"standbys":[],"modules":["restful","status"],"available_modules":[]}
2017-10-26T23:05:18.222 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph mgr dump --format=json'
2017-10-26T23:05:18.586 INFO:teuthology.orchestra.run.ovh031.stdout:
2017-10-26T23:05:18.586 INFO:teuthology.orchestra.run.ovh031.stdout:{"epoch":9,"active_gid":0,"active_name":"","active_addr":"-","available":false,"standbys":[],"modules":["restful","status"],"available_modules":[]}
2017-10-26T23:05:21.587 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph mgr dump --format=json'
2017-10-26T23:05:21.983 INFO:teuthology.orchestra.run.ovh031.stdout:
2017-10-26T23:05:21.983 INFO:teuthology.orchestra.run.ovh031.stdout:{"epoch":10,"active_gid":0,"active_name":"","active_addr":"-","available":false,"standbys":[],"modules":["restful","status"],"available_modules":[]}
2017-10-26T23:05:24.984 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph mgr dump --format=json'
2017-10-26T23:05:25.459 INFO:teuthology.orchestra.run.ovh031.stdout:
2017-10-26T23:05:25.459 INFO:teuthology.orchestra.run.ovh031.stdout:{"epoch":10,"active_gid":0,"active_name":"","active_addr":"-","available":false,"standbys":[],"modules":["restful","status"],"available_modules":[]}
2017-10-26T23:05:28.468 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph mgr dump --format=json'
2017-10-26T23:05:28.869 INFO:teuthology.orchestra.run.ovh031.stdout:
2017-10-26T23:05:28.869 INFO:teuthology.orchestra.run.ovh031.stdout:{"epoch":11,"active_gid":0,"active_name":"","active_addr":"-","available":false,"standbys":[],"modules":["restful","status"],"available_modules":[]}
2017-10-26T23:05:31.870 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph mgr dump --format=json'
2017-10-26T23:05:32.297 INFO:teuthology.orchestra.run.ovh031.stdout:
2017-10-26T23:05:32.298 INFO:teuthology.orchestra.run.ovh031.stdout:{"epoch":12,"active_gid":0,"active_name":"","active_addr":"-","available":false,"standbys":[],"modules":["restful","status"],"available_modules":[]}
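The "waiting for mgr available" loop keeps re-running `ceph mgr dump` until the map reports an active daemon. The condition it is polling for can be sketched from the JSON above (a sketch of the check, not the actual ceph_manager code):

```python
import json


def mgr_available(dump_json):
    """Return True once the mgr map reports an active, available mgr.

    In the failing run above, "available" stays false and "active_gid"
    stays 0 even though systemd reported the units as running.
    """
    dump = json.loads(dump_json)
    return bool(dump["available"]) and dump["active_gid"] != 0
```

Fed any of the `mgr dump` outputs in the log, this returns False, which is why the loop never terminates.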
All the other daemons are behaving nicely except the mgr; please let me know if it's a mgr issue.
other daemons:
2017-10-26T23:04:55.735 DEBUG:teuthology.misc:6 of 6 OSDs are up
2017-10-26T23:04:55.735 INFO:tasks.ceph:Creating RBD pool
2017-10-26T23:04:55.735 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo ceph --cluster ceph osd pool create rbd 8'
2017-10-26T23:04:56.846 INFO:teuthology.orchestra.run.ovh031.stderr:pool 'rbd' created
2017-10-26T23:04:56.876 INFO:teuthology.orchestra.run.ovh031:Running: 'sudo ceph --cluster ceph osd pool application enable rbd rbd --yes-i-really-mean-it'
2017-10-26T23:04:58.072 INFO:teuthology.orchestra.run.ovh031.stderr:enabled application 'rbd' on pool 'rbd'
2017-10-26T23:04:58.093 INFO:tasks.ceph:Setting up CephFS filesystem...
Updated by Vasu Kulkarni over 6 years ago
Checking the logs, it looks like something that needs to be fixed in ceph.py; checking further on that:
2017-10-26 23:04:32.466 7fd7f44cc8c0 -1 auth: unable to find a keyring on /var/lib/ceph/mgr/ceph-y/keyring: (13) Permission denied
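The `(13) Permission denied` here means the mgr process, which systemd runs as the `ceph` user rather than as the test user that daemon-helper used, cannot read its keyring file. The mode-bit reasoning behind a quick manual diagnosis can be sketched as an illustrative helper (not part of teuthology):

```python
import os
import stat


def keyring_readable_by(path, uid, gid):
    """Check whether a process with the given uid/gid could read the file,
    based on its ownership and permission bits (illustrative sketch)."""
    st = os.stat(path)
    if st.st_uid == uid:
        return bool(st.st_mode & stat.S_IRUSR)
    if st.st_gid == gid:
        return bool(st.st_mode & stat.S_IRGRP)
    return bool(st.st_mode & stat.S_IROTH)
```

A keyring created mode 0600 and owned by the test user fails this check for the `ceph` uid/gid, which matches the error above.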
Updated by Vasu Kulkarni over 6 years ago
- Status changed from New to Rejected
This was an mgr keyring directory permission issue.