Bug #10389
closed
init-ceph stop may return before daemons are stopped
% Done:
80%
Source:
other
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
init-ceph stop must check for the pidfile once, outside the loop that sends the signal to the daemon. Otherwise, the daemon removes its pidfile while shutting down, and stop can return before the process is dead, because it only checks /proc/$pid if the pidfile exists.
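The intended ordering can be sketched as follows. This is a minimal illustration, not the actual init-ceph code: the function name `stop_daemon_sketch`, its arguments, and the retry limit are all hypothetical. It reads the pid from the pidfile once, before the loop, and then relies only on /proc/$pid to decide whether the daemon is still running.

```shell
#!/bin/sh
# Hypothetical helper for illustration only; not taken from init-ceph.
# Usage: stop_daemon_sketch PIDFILE [SIGNAL] [MAX_TRIES]
stop_daemon_sketch() {
    pidfile=$1
    signal=${2:-TERM}
    max_tries=${3:-30}    # assumed bound so the sketch always terminates

    # Check the pidfile once, before the loop: no pidfile, nothing to stop.
    [ -e "$pidfile" ] || return 0
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return 0

    tries=0
    # From here on only /proc/$pid matters, so the daemon removing its own
    # pidfile during shutdown cannot end the wait early.
    while [ -e "/proc/$pid" ]; do
        if [ "$tries" -ge "$max_tries" ]; then
            return 1    # gave up: the process is still running
        fi
        kill "-$signal" "$pid" 2>/dev/null || true
        tries=$((tries + 1))
        sleep 1
    done
    return 0
}
```

With this ordering, a return status of 0 means either that there was no pidfile to begin with or that /proc/$pid has disappeared, which is what a caller that goes on to open the OSD store needs.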
Reproduced under load with ceph_objectstore_tool.py
=== mon.a ===
Stopping Ceph mon.a on fold...kill 20668...done
=== osd.3 ===
Stopping Ceph osd.3 on fold...kill 30277...kill 30277...done
=== osd.2 ===
Stopping Ceph osd.2 on fold...kill 26469...kill 26469...done
=== osd.1 ===
Stopping Ceph osd.1 on fold...kill 24615...kill 24615...done
=== osd.0 ===
Stopping Ceph osd.0 on fold...kill 22584...kill 22584...done
DEBUG:['1.0', '1.1', '1.2', '1.3']
DEBUG:['2.0s0', '2.0s1', '2.0s2', '2.1s0', '2.1s1', '2.1s2', '2.2s0', '2.2s1', '2.2s2', '2.3s0', '2.3s1', '2.3s2']
DEBUG:['1.0', '1.1']
DEBUG:['2.0s0', '2.0s1', '2.0s2', '2.1s0', '2.1s1', '2.1s2']
DEBUG:1.0
DEBUG:osd0
Test invalid parameters
DEBUG:./ceph-objectstore-tool --data-path ceph_objectstore_tool_dir/dev/osd0 --journal-path ceph_objectstore_tool_dir/dev/osd0.journal --op export --pgid 1.0
INFO:Correctly failed with message "stdout is a tty and no --file filename specified"
DEBUG:./ceph-objectstore-tool --data-path ceph_objectstore_tool_dir/dev/osd0 --journal-path ceph_objectstore_tool_dir/dev/osd0.journal --op export --pgid 1.0 --file -
INFO:Correctly failed with message "stdout is a tty and no --file filename specified"
DEBUG:./ceph-objectstore-tool --data-path ceph_objectstore_tool_dir/dev/osd0 --journal-path ceph_objectstore_tool_dir/dev/osd0.journal --op import --pgid 1.0 --file /tmp/foo.20277
ERROR:Bad message to stderr "OSD has the store locked "
Updated by Loïc Dachary over 9 years ago
- Status changed from 12 to Fix Under Review
- % Done changed from 0 to 80
Updated by Loïc Dachary over 9 years ago
- Status changed from Fix Under Review to Resolved