Bug #1997

closed

teuthology: wait for clean osd shutdown before umount

Added by Sage Weil about 12 years ago. Updated about 12 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
teuthology
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/master-2012-01-27_13:29:47/9361


2012-01-27T16:23:52.280 INFO:teuthology.task.ceph:Shutting down osd daemons...
2012-01-27T16:23:52.280 INFO:teuthology.task.ceph:Shutting down mon daemons...
2012-01-27T16:23:52.282 INFO:teuthology.task.ceph.mon.a.err:*** Caught signal (Terminated) **
2012-01-27T16:23:52.282 INFO:teuthology.task.ceph.mon.a.err: in thread 7f8993043780. Shutting down.
2012-01-27T16:23:52.295 INFO:teuthology.task.ceph.mon.a:mon.a: Stopped
2012-01-27T16:23:52.328 INFO:teuthology.task.ceph.mon.c.err:*** Caught signal (Terminated) **
2012-01-27T16:23:52.328 INFO:teuthology.task.ceph.mon.c.err: in thread 7f0cbd3cc780. Shutting down.
2012-01-27T16:23:52.341 INFO:teuthology.task.ceph.mon.c:mon.c: Stopped
2012-01-27T16:23:52.342 INFO:teuthology.task.ceph.mon.b.err:*** Caught signal (Terminated) **
2012-01-27T16:23:52.343 INFO:teuthology.task.ceph.mon.b.err: in thread 7fb9c5f48780. Shutting down.
2012-01-27T16:23:52.363 INFO:teuthology.task.ceph.mon.b:mon.b: Stopped
2012-01-27T16:23:52.363 INFO:teuthology.task.ceph:Grabbing cluster log from ubuntu@sepia5.ceph.dreamhost.com mon.a...
2012-01-27T16:23:52.363 DEBUG:teuthology.orchestra.run:Running: 'cat -- /tmp/cephtest/data/mon.a/log'
2012-01-27T16:23:52.385 INFO:teuthology.task.ceph:Checking cluster ceph.log for badness...
2012-01-27T16:23:52.385 DEBUG:teuthology.orchestra.run:Running: "egrep '\\[ERR\\]|\\[WRN\\]|\\[SEC\\]' /tmp/cephtest/data/mon.a/log | egrep -v 'clocks not synchronized' | head -n 1" 
2012-01-27T16:23:52.428 INFO:teuthology.task.ceph:Unmounting /tmp/cephtest/data/osd.0.data on ubuntu@sepia5.ceph.dreamhost.com
2012-01-27T16:23:52.428 DEBUG:teuthology.orchestra.run:Running: 'sudo umount -f /tmp/cephtest/data/osd.0.data'
2012-01-27T16:23:52.491 INFO:teuthology.orchestra.run.err:umount2: Device or resource busy
2012-01-27T16:23:52.492 INFO:teuthology.orchestra.run.err:umount: /tmp/cephtest/data/osd.0.data: device is busy.
2012-01-27T16:23:52.492 INFO:teuthology.orchestra.run.err:        (In some cases useful info about processes that use
2012-01-27T16:23:52.492 INFO:teuthology.orchestra.run.err:         the device is found by lsof(8) or fuser(1))
2012-01-27T16:23:52.492 INFO:teuthology.orchestra.run.err:umount2: Device or resource busy

Note that the log shows "Shutting down osd daemons..." but none of the per-daemon "Stopped" lines, such as
2012-01-27T14:57:32.944 INFO:teuthology.task.ceph:Shutting down osd daemons...
2012-01-27T14:57:33.306 INFO:teuthology.task.ceph.osd.1:osd.1: Stopped
2012-01-27T14:57:33.332 INFO:teuthology.task.ceph.osd.0:osd.0: Stopped

that are present in successful runs. Is there a join/wait missing?

We are seeing this pretty often, and it causes QA failure noise.
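
For illustration, the kind of join/wait being asked about could look roughly like the sketch below. This is not teuthology code: stop_and_wait is a hypothetical helper, and it assumes the daemons are local subprocess.Popen handles, whereas teuthology drives remote processes. The point is only that every daemon is waited on before anything touches the mount.

```python
import subprocess
import sys
from typing import Iterable, List


def stop_and_wait(daemons: Iterable[subprocess.Popen], timeout: float = 30.0) -> List[int]:
    """Ask each daemon to terminate, then block until all of them have exited.

    Hypothetical helper for illustration only; teuthology itself manages
    remote processes rather than local Popen handles.
    """
    procs = list(daemons)
    for proc in procs:
        proc.terminate()  # sends SIGTERM on POSIX
    # The join step: do not return until every process has actually exited.
    # Popen.wait raises subprocess.TimeoutExpired if a daemon hangs.
    return [proc.wait(timeout=timeout) for proc in procs]


if __name__ == "__main__":
    # Stand-in "daemons": two Python processes that just sleep.
    sleeper = [sys.executable, "-c", "import time; time.sleep(60)"]
    daemons = [subprocess.Popen(sleeper) for _ in range(2)]
    stop_and_wait(daemons)
    # Only after this point would it be safe to attempt the umount.
    print("all daemons stopped")
```
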


Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #1744: teuthology: race with daemon shutdown? (Resolved, Josh Durgin, 11/20/2011)

Actions #1

Updated by Sage Weil about 12 years ago

  • Assignee set to Josh Durgin
Actions #2

Updated by Anonymous about 12 years ago

  • Status changed from New to Duplicate

As far as I can tell, this is exactly #1744, just failing at a different point because of timing differences. The two log entries are missing most likely because something cleared out / never populated ctx.daemons.

Actions #3

Updated by Josh Durgin about 12 years ago

  • Status changed from Duplicate to Resolved

This is different from #1744: the daemons are shut down without waiting for I/O to complete, which leaves the device busy when we try to unmount. Commit e7672b64334bbb60b83d01ad6ef093d1a1e16692 fixes this by running sync before unmounting.
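
A minimal sketch of the "sync before unmounting" ordering, for readers who have not looked at the commit. This is not the code from the commit: sync_then_umount and the run parameter are hypothetical names, and the command runner is injected so the ordering can be checked without touching a real mount. The umount command line is the one from the log above.

```python
import subprocess
from typing import Callable, Iterable, List

# A runner takes an argv list and executes it; injected so it can be faked.
Runner = Callable[[List[str]], None]


def _run_local(args: List[str]) -> None:
    """Default runner: execute the command locally and raise on failure."""
    subprocess.check_call(args)


def sync_then_umount(mount_points: Iterable[str], run: Runner = _run_local) -> None:
    """Flush pending writes with sync, then unmount each mount point.

    Hypothetical helper; the actual fix lives in the teuthology ceph task.
    """
    run(["sync"])  # flush outstanding I/O before unmounting
    for mnt in mount_points:
        run(["sudo", "umount", "-f", mnt])


if __name__ == "__main__":
    # Dry run: record the commands instead of executing them.
    recorded: List[List[str]] = []
    sync_then_umount(["/tmp/cephtest/data/osd.0.data"], run=recorded.append)
    for cmd in recorded:
        print(" ".join(cmd))
```

Passing a recording function as run, as in the dry run above, shows that sync is always issued before the first umount.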
