Bug #8576

teuthology: nfs tests failing on umount

Added by Greg Farnum about 5 years ago. Updated over 4 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: NFS (Linux Kernel)
Target version: -
Start date: 06/10/2014
Due date:
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:


Related issues

Duplicated by fs - Bug #8749: knfs: EBUSY on umount Duplicate 07/05/2014

History

#1 Updated by Greg Farnum about 5 years ago

teuthology-2014-06-09_23:02:09-knfs-master-testing-basic-plana/303822/
teuthology-2014-06-09_23:02:09-knfs-master-testing-basic-plana/303825/

I looked a little more closely, and this doesn't appear to be teuthology being dumb. As far as I can tell, it has already cleaned up after NFS before it tries to umount cephfs.

#2 Updated by Sage Weil about 5 years ago

  • Priority changed from Normal to High

#7 Updated by Sage Weil almost 5 years ago

  • Status changed from New to Need More Info

#8 Updated by Greg Farnum almost 5 years ago

2014-09-09T10:14:15.908 INFO:tasks.workunit.client.1.plana29.stdout:6/999: dwrite d4/d5/d93/d6c/fe8 [0,1048576] 0
2014-09-09T10:14:15.908 INFO:teuthology.orchestra.run.plana29:Running: 'sudo rm -rf -- /home/ubuntu/cephtest/mnt.1/client.1/tmp'
2014-09-09T10:15:21.823 INFO:tasks.workunit:Stopping ['suites/fsstress.sh'] on client.1...
2014-09-09T10:15:21.823 INFO:teuthology.orchestra.run.plana29:Running: 'rm -rf -- /home/ubuntu/cephtest/workunits.list /home/ubuntu/cephtest/workunit.client.1'
2014-09-09T10:15:21.842 DEBUG:teuthology.parallel:result is None
2014-09-09T10:15:21.842 INFO:teuthology.orchestra.run.plana29:Running: 'sudo rm -rf -- /home/ubuntu/cephtest/mnt.1/client.1'
2014-09-09T10:15:22.029 INFO:tasks.workunit:Deleted dir /home/ubuntu/cephtest/mnt.1/client.1
2014-09-09T10:15:22.070 DEBUG:teuthology.run_tasks:Unwinding manager nfs
2014-09-09T10:15:22.070 INFO:teuthology.task.nfs:Unmounting nfs clients...
2014-09-09T10:15:22.070 DEBUG:teuthology.task.nfs:Unmounting nfs client client.1...
2014-09-09T10:15:22.070 INFO:teuthology.orchestra.run.plana29:Running: 'sudo umount /home/ubuntu/cephtest/mnt.1'
2014-09-09T10:15:22.186 INFO:teuthology.orchestra.run.plana29:Running: 'rmdir -- /home/ubuntu/cephtest/mnt.1'
2014-09-09T10:15:22.241 DEBUG:teuthology.run_tasks:Unwinding manager knfsd
2014-09-09T10:15:22.241 INFO:teuthology.task.knfsd:Unexporting nfs server...
2014-09-09T10:15:22.241 DEBUG:teuthology.task.knfsd:Unexporting client client.0...
2014-09-09T10:15:22.242 INFO:teuthology.orchestra.run.plana40:Running: 'sudo exportfs -au'
2014-09-09T10:15:22.289 DEBUG:teuthology.run_tasks:Unwinding manager kclient
2014-09-09T10:15:22.289 INFO:tasks.kclient:Unmounting kernel clients...
2014-09-09T10:15:22.289 INFO:teuthology.orchestra.run.plana40:Running: 'sudo umount /home/ubuntu/cephtest/mnt.0'
2014-09-09T10:15:22.381 INFO:teuthology.orchestra.run.plana40.stderr:umount: /home/ubuntu/cephtest/mnt.0: device is busy.
2014-09-09T10:15:22.382 INFO:teuthology.orchestra.run.plana40.stderr:        (In some cases useful info about processes that use
2014-09-09T10:15:22.382 INFO:teuthology.orchestra.run.plana40.stderr:         the device is found by lsof(8) or fuser(1))
2014-09-09T10:15:22.383 ERROR:teuthology.run_tasks:Manager failed: kclient
Traceback (most recent call last):

NFS was mounted async, and we're trying to unmount Ceph 47 ms after we remove the NFS export. From a quick skim I can't find whether exportfs blocks until everything is flushed. Maybe we just need to wait a bit? Does anybody know what the rules are, or a good way to check?
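
For illustration only, the "wait a bit" idea could be sketched as a bounded retry around umount. This is not what teuthology does; the function name, the attempt count, and the delay are hypothetical, and the `run` and `sleep` parameters exist only so the loop can be exercised without touching a real mount. A retry like this could also hide a real reference leak, so it is a diagnostic aid rather than a fix.

```python
import subprocess
import time


def umount_with_retry(mountpoint, attempts=5, delay=1.0,
                      run=subprocess.call, sleep=time.sleep):
    """Try `sudo umount` up to `attempts` times (hypothetical helper).

    Returns True as soon as umount exits 0, or False if every
    attempt fails (for example with "device is busy").
    """
    for attempt in range(attempts):
        if run(["sudo", "umount", mountpoint]) == 0:
            return True
        # Sleep only if another attempt will follow.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

If the unmount succeeds only after several attempts, that would suggest the server is still releasing references after `exportfs -au` returns.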

#9 Updated by Greg Farnum almost 5 years ago

  • Assignee set to Greg Farnum

Been playing around with this some.

#10 Updated by Greg Farnum almost 5 years ago

  • Assignee deleted (Greg Farnum)

#11 Updated by Greg Farnum almost 5 years ago

Is there any chance that just running a sync on the node before "exportfs -au" might prevent this? I'm hesitant to just add it in case it masks something else we should be worried about, but since I haven't seen this failure when testing locally, I don't know how else to check it besides pushing it to teuthology. :/

#12 Updated by Greg Farnum almost 5 years ago

  • Status changed from Need More Info to Testing

Trying the sync on Sage's go-ahead. :)
commit:56223ce98b659fe7b25b55161ef8163495f438fc in teuthology.

#13 Updated by Greg Farnum almost 5 years ago

  • Status changed from Testing to Need More Info

#14 Updated by Zheng Yan almost 5 years ago

I notice that if I execute 'service nfs stop' first, umounting cephfs always succeeds. 'service nfs stop' runs two commands: 'rpc.nfsd 0' and 'exportfs -f'.

#15 Updated by Greg Farnum almost 5 years ago

  • Status changed from Need More Info to Testing

teuthology commit:4f2957c42d0f76a399cb26c660ede9243c095779 runs those commands as well as the previous ones.
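
A minimal sketch of the resulting teardown, assuming the commands run in the order they were discussed in this thread (sync, unexport, stop nfsd, flush the export table, then umount). The actual ordering and invocation in the teuthology commit are not shown here, so the function names and the use of `sudo` on each step are assumptions.

```python
import subprocess


def knfsd_teardown_commands(mountpoint):
    """Return the teardown commands in an assumed order (illustrative only)."""
    return [
        ["sync"],                        # flush dirty data before unexporting
        ["sudo", "exportfs", "-au"],     # unexport all directories
        ["sudo", "rpc.nfsd", "0"],       # stop all nfsd threads
        ["sudo", "exportfs", "-f"],      # flush the kernel export table
        ["sudo", "umount", mountpoint],  # finally unmount cephfs
    ]


def run_teardown(mountpoint, run=subprocess.check_call):
    """Run each command in order; `run` is injectable for testing."""
    for cmd in knfsd_teardown_commands(mountpoint):
        run(cmd)
```

With the default runner, `subprocess.check_call` raises on the first failing command, so a busy mount would still surface as an error rather than being skipped.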

#16 Updated by Greg Farnum over 4 years ago

  • Status changed from Testing to Resolved

Haven't seen this since we made those changes!
