Bug #561

closed

snaptest-2 doesn't execute properly

Added by Greg Farnum over 13 years ago. Updated over 7 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

90%


Description

Checked it on cfuse and kclient:

    create snaptest-2.sh/.snap/snap-subdir-test
mkdir: cannot create directory `snaptest-2.sh/.snap/snap-subdir-test': Not a directory
        create snaptest-2.sh/30 file after the snapshot
./snaptest-2.sh: line 30: snaptest-2.sh/30: Not a directory
        create snaptest-2.sh/31 file after the snapshot
./snaptest-2.sh: line 30: snaptest-2.sh/31: Not a directory
        create snaptest-2.sh/32 file after the snapshot
./snaptest-2.sh: line 30: snaptest-2.sh/32: Not a directory
        create snaptest-2.sh/33 file after the snapshot
./snaptest-2.sh: line 30: snaptest-2.sh/33: Not a directory
        create snaptest-2.sh/34 file after the snapshot
./snaptest-2.sh: line 30: snaptest-2.sh/34: Not a directory
        create snaptest-2.sh/35 file after the snapshot
./snaptest-2.sh: line 30: snaptest-2.sh/35: Not a directory
        create snaptest-2.sh/36 file after the snapshot
./snaptest-2.sh: line 30: snaptest-2.sh/36: Not a directory
        create snaptest-2.sh/37 file after the snapshot
./snaptest-2.sh: line 30: snaptest-2.sh/37: Not a directory
        create snaptest-2.sh/38 file after the snapshot
./snaptest-2.sh: line 30: snaptest-2.sh/38: Not a directory
        create snaptest-2.sh/39 file after the snapshot
./snaptest-2.sh: line 30: snaptest-2.sh/39: Not a directory

I kept the client and server-side logs from the cfuse run.

Actions #1

Updated by Greg Farnum over 13 years ago

Okay, it looks like this may be an issue with the test rather than with Ceph. I copied the script into the root of the ceph mount and ran it there, so when it did an ls it got itself included in the list and then tried to make a snapshot under itself. But it's a file, not a dir!
Running on cfuse also exhibits other issues (starting with "Directory not empty") that I didn't see with the kclient, but these may be similar problems. Will investigate further.
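
For illustration, here is a minimal sketch of how a test script could avoid picking itself up as a snapshot target. The helper name list_dirs is hypothetical and is not taken from snaptest-2.sh; it only assumes that the script walks the entries of the directory it runs in.

```shell
#!/bin/sh
# Hypothetical helper (not part of snaptest-2.sh): print only the real
# subdirectories of the directory given as $1, so that regular files such
# as the test script itself are never treated as snapshot targets.
list_dirs() {
    for entry in "$1"/*; do
        # -d is true for directories; the -L check skips symlinks to them.
        if [ -d "$entry" ] && [ ! -L "$entry" ]; then
            basename "$entry"
        fi
    done
}
```

A loop such as `for d in $(list_dirs .)` would then skip snaptest-2.sh even when the script sits in the root of the mount. Running the test from a dedicated, empty working directory would presumably avoid the problem as well.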

Actions #2

Updated by Sage Weil over 13 years ago

  • Priority changed from Normal to Immediate
Actions #3

Updated by Sage Weil over 13 years ago

  • Priority changed from Immediate to Normal
Actions #4

Updated by Sage Weil over 13 years ago

  • Estimated time set to 1:00 h
  • Source set to 2
Actions #5

Updated by Greg Farnum over 13 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 90

I think I may have finally nailed this problem, or at least found a band-aid, by more aggressively removing the I_COMPLETE flag from inodes whenever the directory contents are modified.
I'm re-running the test right now, as I forgot to capture one of the log files needed for a proper check.

Unfortunately, in addition to #570, I found another cmds assert failure when I looked over the last completed test; it will need to be diagnosed. I'll make a ticket once I can see it with proper logging.

Actions #6

Updated by Greg Farnum over 13 years ago

I ran the test again and didn't get an mds crash. There was one issue remaining:

Delete all the files and directories ...
rm: cannot remove `./100/_test3_1/10': Read-only file system
rm: cannot remove `./100/_test3_1/11': Read-only file system
rm: cannot remove `./100/_test3_1/12': Read-only file system
rm: cannot remove `./100/_test3_1/13': Read-only file system
rm: cannot remove `./100/_test3_1/14': Read-only file system
rm: cannot remove `./100/_test3_1/15': Read-only file system
rm: cannot remove `./100/_test3_1/16': Read-only file system
rm: cannot remove `./100/_test3_1/17': Read-only file system
rm: cannot remove `./100/_test3_1/18': Read-only file system
rm: cannot remove `./100/_test3_1/19': Read-only file system
rm: cannot remove `./100/_test3_1/20': Read-only file system
rm: cannot remove `./100/_test3_1/30': Read-only file system
rm: cannot remove `./100/_test3_1/31': Read-only file system
rm: cannot remove `./100/_test3_1/32': Read-only file system
rm: cannot remove `./100/_test3_1/33': Read-only file system
rm: cannot remove `./100/_test3_1/34': Read-only file system
rm: cannot remove `./100/_test3_1/35': Read-only file system
rm: cannot remove `./100/_test3_1/36': Read-only file system
rm: cannot remove `./100/_test3_1/37': Read-only file system
rm: cannot remove `./100/_test3_1/38': Read-only file system
rm: cannot remove `./100/_test3_1/39': Read-only file system

When I shut down the system and then re-mounted, I found that directory 100 was still there with its original files (1-10 plus whatever else is supposed to be there). An rm -r on it worked fine. The client log from that run is in pudgy:~gregf/logs/cfuse_snaptest2/client_read_only_error.log

I pushed that code to the rc branch, updated, and started another run. The client log is at ~gregf/logs/cfuse_snaptest2/new_client.log; the snaptest-2 output is in snaptest-2.log; and the server logs are in ~gregf/ceph/src/out. We'll see if there are still issues.

Actions #7

Updated by Sage Weil over 13 years ago

  • Target version changed from v0.23 to v0.23.1
Actions #8

Updated by Sage Weil over 13 years ago

  • Status changed from In Progress to Resolved

Figured this out. LSSNAP was adding the snap dentries to the cache under the parent dir instead of under the hidden .snap snapdir. commit:c5b2d28bc7fd4ced676b89360742c233dc045fa3
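
As a rough sketch of how the symptom could be checked from the client side: after listing .snap, a snapshot name should be reachable only under .snap and should not appear as an entry of the parent directory. The function name snap_listed_correctly is hypothetical, and the check assumes the snapshot name is not also used by a real file or directory in the parent.

```shell
#!/bin/sh
# Hypothetical check (not from the Ceph tree): $1 is a directory, $2 a
# snapshot name assumed to be unused by any real entry in that directory.
# Returns 0 if the snapshot is visible under .snap only, 1 otherwise.
snap_listed_correctly() {
    dir=$1
    snap=$2
    # List the snapdir first so the client populates its cache.
    ls "$dir/.snap" >/dev/null 2>&1
    # The snapshot should appear inside the hidden .snap directory ...
    [ -d "$dir/.snap/$snap" ] || return 1
    # ... and must not show up as an entry of the parent directory.
    if [ -e "$dir/$snap" ]; then
        return 1
    fi
    return 0
}
```

On a Ceph mount this could be called as `snap_listed_correctly ./100 snap-subdir-test` right after creating the snapshot.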

Actions #9

Updated by John Spray over 7 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (1)
  • Target version deleted (v0.23.1)

Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.
