Bug #1997
teuthology: wait for clean osd shutdown before umount
% Done: 0%
Regression: No
Severity: 3 - minor
Description
/a/master-2012-01-27_13:29:47/9361
2012-01-27T16:23:52.280 INFO:teuthology.task.ceph:Shutting down osd daemons...
2012-01-27T16:23:52.280 INFO:teuthology.task.ceph:Shutting down mon daemons...
2012-01-27T16:23:52.282 INFO:teuthology.task.ceph.mon.a.err:*** Caught signal (Terminated) **
2012-01-27T16:23:52.282 INFO:teuthology.task.ceph.mon.a.err: in thread 7f8993043780. Shutting down.
2012-01-27T16:23:52.295 INFO:teuthology.task.ceph.mon.a:mon.a: Stopped
2012-01-27T16:23:52.328 INFO:teuthology.task.ceph.mon.c.err:*** Caught signal (Terminated) **
2012-01-27T16:23:52.328 INFO:teuthology.task.ceph.mon.c.err: in thread 7f0cbd3cc780. Shutting down.
2012-01-27T16:23:52.341 INFO:teuthology.task.ceph.mon.c:mon.c: Stopped
2012-01-27T16:23:52.342 INFO:teuthology.task.ceph.mon.b.err:*** Caught signal (Terminated) **
2012-01-27T16:23:52.343 INFO:teuthology.task.ceph.mon.b.err: in thread 7fb9c5f48780. Shutting down.
2012-01-27T16:23:52.363 INFO:teuthology.task.ceph.mon.b:mon.b: Stopped
2012-01-27T16:23:52.363 INFO:teuthology.task.ceph:Grabbing cluster log from ubuntu@sepia5.ceph.dreamhost.com mon.a...
2012-01-27T16:23:52.363 DEBUG:teuthology.orchestra.run:Running: 'cat -- /tmp/cephtest/data/mon.a/log'
2012-01-27T16:23:52.385 INFO:teuthology.task.ceph:Checking cluster ceph.log for badness...
2012-01-27T16:23:52.385 DEBUG:teuthology.orchestra.run:Running: "egrep '\\[ERR\\]|\\[WRN\\]|\\[SEC\\]' /tmp/cephtest/data/mon.a/log | egrep -v 'clocks not synchronized' | head -n 1"
2012-01-27T16:23:52.428 INFO:teuthology.task.ceph:Unmounting /tmp/cephtest/data/osd.0.data on ubuntu@sepia5.ceph.dreamhost.com
2012-01-27T16:23:52.428 DEBUG:teuthology.orchestra.run:Running: 'sudo umount -f /tmp/cephtest/data/osd.0.data'
2012-01-27T16:23:52.491 INFO:teuthology.orchestra.run.err:umount2: Device or resource busy
2012-01-27T16:23:52.492 INFO:teuthology.orchestra.run.err:umount: /tmp/cephtest/data/osd.0.data: device is busy.
2012-01-27T16:23:52.492 INFO:teuthology.orchestra.run.err: (In some cases useful info about processes that use
2012-01-27T16:23:52.492 INFO:teuthology.orchestra.run.err: the device is found by lsof(8) or fuser(1))
2012-01-27T16:23:52.492 INFO:teuthology.orchestra.run.err:umount2: Device or resource busy
Note that the log above contains "Shutting down osd daemons..." but none of the per-daemon "Stopped" lines that follow it in successful runs, for example:
2012-01-27T14:57:32.944 INFO:teuthology.task.ceph:Shutting down osd daemons...
2012-01-27T14:57:33.306 INFO:teuthology.task.ceph.osd.1:osd.1: Stopped
2012-01-27T14:57:33.332 INFO:teuthology.task.ceph.osd.0:osd.0: Stopped
Is there a join/wait missing?
We are seeing this pretty often, and it is causing QA failure noise.
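The comparison described above (a shutdown announcement with no matching per-daemon "Stopped" lines) can be checked mechanically. The following is only a rough sketch for triaging saved logs, not part of teuthology: the function name `osds_without_stop_line` and the regular expression are assumptions based solely on the log format quoted in this report.

```python
import re

# Matches lines such as "INFO:teuthology.task.ceph.osd.1:osd.1: Stopped".
# The pattern is inferred from the log excerpts in this report (assumption).
STOPPED_RE = re.compile(r"INFO:teuthology\.task\.ceph\.(osd\.\d+):\1: Stopped")


def osds_without_stop_line(log_text, expected_osds):
    """Return the expected OSD names that never logged '<name>: Stopped'."""
    stopped = {m.group(1) for m in STOPPED_RE.finditer(log_text)}
    return sorted(set(expected_osds) - stopped)
```

Running it over the failing log with `["osd.0", "osd.1"]` as the expected daemons would list both names, while the successful run quoted above would yield an empty list.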
Related issues
Associated revisions
ceph: sync before unmounting btrfs devices
There may still be writes in flight, since the osds may not have
shutdown cleanly. This should prevent EBUSY when unmounting.
Fixes: #1997
History
#1 Updated by Sage Weil almost 12 years ago
- Assignee set to Josh Durgin
#2 Updated by Anonymous almost 12 years ago
- Status changed from New to Duplicate
As far as I can tell, this is exactly #1744, just failing at a different point because of timing differences. The two log entries are missing most likely because something cleared out / never populated ctx.daemons.
#3 Updated by Josh Durgin almost 12 years ago
- Status changed from Duplicate to Resolved
This was different from #1744 - daemons are shut down without waiting for I/O to complete, which causes this issue when trying to unmount. e7672b64334bbb60b83d01ad6ef093d1a1e16692 fixes this by running sync before unmounting.
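The ordering that the fix relies on (flush outstanding writes, then unmount) can be sketched as below. This is an illustration only and not the code from the referenced commit: teuthology runs commands on remote hosts through its own orchestra layer, whereas this sketch runs them locally, and the function name `sync_then_umount` and the injectable `run` parameter are hypothetical.

```python
import subprocess


def sync_then_umount(mount_points, run=subprocess.check_call):
    """Flush dirty data with sync, then force-unmount each path in order.

    `run` is any callable taking an argv list; it defaults to
    subprocess.check_call, which raises if a command exits non-zero.
    """
    # Flush buffered writes first, so the unmount is less likely to hit EBUSY.
    run(["sync"])
    for path in mount_points:
        # Same umount invocation as shown in the log in the description.
        run(["sudo", "umount", "-f", path])
```

Passing a recording callable as `run` (for example a list's `append` method) shows the command order without touching any real mounts.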