Bug #1774 (closed)

client: files become inaccessible in large directories (with snapshots?)

Added by Alexandre Oliva over 12 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Regression: No
Severity: 3 - minor

Description

Taking snapshots of certain directories within ceph that hold backups of root filesystems of my openmoko phone causes some files to disappear. After some experimentation, I found out the issue doesn't only happen to files in the snapshots; sometimes I also fail to access files in the original directories. From the observed behavior, I'm guessing it has to do with some border condition in the mds: the information is there, but it's not retrieved when the file happens to fall at some specific offset within the directory, or some such. The evidence is that adding or removing files (and letting the mds commit the changes from its log, then starting a fresh mds) makes the faulty file vary; but once the directory holds exactly the contents of the originally backed-up image, the files that fail are always the same, though different ones in 3 different backup images with different sets of packages installed.

The faulty directory, in these 3 cases, has always been /var/lib/opkg/info, which holds multiple files per installed package, such as file lists, control scripts and more. File names are built from the package name plus a suffix indicating their function, so we end up with long names, and lots of them. When I take a snapshot, we apparently cross a threshold, and then files that end up precisely at the border start to fail.
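
For illustration, a directory with a similar shape can be generated like this; a minimal sketch, where the file count, name lengths, and suffixes are my guesses (not measured from the images), just large enough to span several readdir batches:

    #!/bin/bash
    # Hypothetical generator mimicking /var/lib/opkg/info: many files
    # with long package-style names. Count and suffixes are assumptions.
    dir=info-test
    mkdir -p "$dir"
    for i in $(seq -w 1 2000); do
        for suffix in list control postinst prerm; do
            touch "$dir/some-fairly-long-package-name-$i.$suffix"
        done
    done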

I attach level 20 debug dumps from the mds. It's surely not a coincidence that the 3 files find says it can't stat (i.e., they appear as dir entries, but stat/read/write fails) are exactly the ones that appear in the snapid offset messages in the mds logs:

    $ for d in .link/{Om2008.8-orig,shr-testing-2010-03+,shr-testing2011.1-2011-03-17}/usr/lib/opkg/info; do ../../gen-list $d > /dev/null; done
    find: `.link/Om2008.8-orig/usr/lib/opkg/info/qtopia-phone-x11-composer-genericcomposer.list': No such file or directory
    find: `.link/shr-testing-2010-03+/usr/lib/opkg/info/update-modules.postinst': No such file or directory
    find: `.link/shr-testing2011.1-2011-03-17/usr/lib/opkg/info/task-shr-minimal-apps.control': No such file or directory
    $ grep "snapid 22 offset '[^']" ~/mds-baddir.log
    2011-12-01 00:37:16.520212 7f2cde2e2700 mds.0.server snapid 22 offset 'qtopia-phone-x11-composer-genericcomposer.list'
    2011-12-01 00:37:26.925145 7f2cde2e2700 mds.0.server snapid 22 offset 'update-modules.postinst'
    2011-12-01 00:37:36.710803 7f2cde2e2700 mds.0.server snapid 22 offset 'task-shr-minimal-apps.control'

Neat, eh? I attach the compressed mds log.


Files

mds-baddir.log.xz (851 KB), mds log, Alexandre Oliva, 11/30/2011 07:21 PM
0001-Start-caching-readdir-results-after-readdir_start.patch (1.07 KB), Alexandre Oliva, 01/09/2012 07:59 PM
gen-1774.bz2 (8.03 KB), bash script that tests that the problem is fixed, Alexandre Oliva, 01/11/2012 04:24 PM
#1

Updated by Alexandre Oliva over 12 years ago

Some interesting findings... It appears that the problem has nothing to do with the mds, but with the fuse client. Here's why (a shell sketch of this sequence follows the list):

1. it doesn't occur with the kernel client (I just found this out)

2. the fuse client can access the file perfectly fine before the directory is first read

3. after the directory is first read, the file can no longer be accessed (negative cache hit, I suppose)

4. restarting the mds has no effect

5. remounting the filesystem enables the file to be accessed again, until the directory is next read within that mount
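
A minimal shell sketch of that sequence, assuming a ceph-fuse mount at /mnt/ceph; the mount point and the file path are placeholders, not the originals:

    #!/bin/bash
    # Demonstrates the suspected negative-cache behavior; paths are
    # placeholders. Run against a large (snapshotted) directory.
    f=/mnt/ceph/usr/lib/opkg/info/update-modules.postinst

    stat "$f"                                    # works before the dir is read
    ls /mnt/ceph/usr/lib/opkg/info > /dev/null   # first readdir fills the client cache
    stat "$f"                                    # fails now, served from the bad cache

    fusermount -u /mnt/ceph                      # remounting drops the cache...
    ceph-fuse /mnt/ceph
    stat "$f"                                    # ...and the file is accessible again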

#2

Updated by Sage Weil over 12 years ago

  • Category set to 26
  • Target version set to v0.40

Oh, interesting! With a debug client = 20, debug ms = 1 log from ceph-fuse this should be pretty straightforward to nail down...

#3

Updated by Sage Weil over 12 years ago

  • Status changed from New to Need More Info

Alexandre-

We're heavily focusing on rados for the next couple of weeks, so I don't have time to try to reproduce this. If you can reproduce with ceph-fuse client logs (debug client = 20, debug ms = 1), I can take a look. The '-d' flag to ceph-fuse may also help (libfuse logs). Otherwise it'll need to wait... Sorry!
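
For reference, one way to capture such a log; a sketch that assumes the standard ceph.conf location, an arbitrary mount point, and an arbitrary log path:

    #!/bin/bash
    # Sketch: enable verbose ceph-fuse client logging, then mount in the
    # foreground. Mount point and log path are assumptions.
    {
        echo '[client]'
        echo '    debug client = 20'
        echo '    debug ms = 1'
        echo '    log file = /var/log/ceph/ceph-fuse.log'
    } >> /etc/ceph/ceph.conf
    ceph-fuse -d /mnt/ceph    # -d keeps it in the foreground with libfuse logs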

#4

Updated by Sage Weil over 12 years ago

  • Subject changed from files become inaccessible in large directories (with snapshots?) to client: files become inaccessible in large directories (with snapshots?)
  • Status changed from Need More Info to New

#5

Updated by Sage Weil over 12 years ago

  • Target version deleted (v0.40)

#6

Updated by Alexandre Oliva over 12 years ago

How about a patch instead of logs? :-)

It turned out that the problem occurred while caching the readdir responses from the second set on; we'd delete the readdir_start_name from the cache if it happened to be in the same frag as the first entry. Oops.

The problem would only manifest itself the second time we tried to list the same dir, for then we'd take entries from the (incomplete) cache. The first access would list the correct entries.

#7

Updated by Sage Weil over 12 years ago

  • Status changed from New to Resolved

Yay, that's way better than a log. Merged.

I wanted to make a test case for this but wasn't able to easily reproduce the bug. If you have a simplish workload/test case, please share!

#8

Updated by Alexandre Oliva over 12 years ago

This script (properly adjusted to actually mount and remount the ceph-fuse tree) should be enough to trigger the bug.
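
The attached gen-1774 script is the actual test; the outline below is only a sketch of the same idea, with the mount/remount steps and all paths and counts as assumptions:

    #!/bin/bash
    # Sketch of a reproducer following the findings above; gen-1774.bz2
    # (attached) is the real one. Mount point, names, counts are guesses.
    mnt=/mnt/ceph
    dir=$mnt/baddir
    mkdir -p "$dir"

    # Populate the directory with enough long-named entries to span
    # several readdir batches.
    for i in $(seq -w 1 2000); do
        touch "$dir/quite-a-long-package-style-file-name-$i.list"
    done

    # Take a snapshot (CephFS exposes snapshots via the .snap directory).
    mkdir "$dir/.snap/snap1"

    # List the directory twice; the second listing is served from the
    # client's (possibly incomplete) readdir cache.
    ls "$dir" > /dev/null
    ls "$dir" > /dev/null

    # With the bug present, entries dropped from the cache fail to stat
    # even though the mds still has them.
    find "$dir" -exec stat {} + > /dev/null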

#9

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (26)