Bug #17517
ceph-fuse: does not handle ENOMEM during remount
Status: Resolved
Priority: High
Assignee: -
Category: Correctness/Safety
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): Client, ceph-fuse
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
When running 8 ceph-fuse clients on a single Linode VM with 2GB memory, I observed the clients would die under load with this assertion:
2016-10-05 21:01:32.159767 7fc11de19700 -1 client.15772 tried to remount (to trim kernel dentries) and got error -1
2016-10-05 21:01:32.207265 7fc11de19700 -1 /srv/autobuild-ceph/gitbuilder.git/build/rpmbuild/BUILD/ceph-11.0.0/src/client/Client.cc: In function 'virtual void C_Client_Remount::finish(int)' thread 7fc11de19700 time 2016-10-05 21:01:32.159874
/srv/autobuild-ceph/gitbuilder.git/build/rpmbuild/BUILD/ceph-11.0.0/src/client/Client.cc: 3949: FAILED assert(0 == "failed to remount for kernel dentry trimming")
 ceph version v11.0.0-3132-g1b90ab3 (1b90ab375eaa3dba76c5dda8b504a3c2d77faa58)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x55ae80ccebd5]
 2: (C_Client_Remount::finish(int)+0xdd) [0x55ae80bf9b4d]
 3: (Context::complete(int)+0x9) [0x55ae80bf6b49]
 4: (Finisher::finisher_thread_entry()+0x216) [0x55ae80ccde36]
 5: (()+0x7dc5) [0x7fc12b53edc5]
 6: (clone()+0x6d) [0x7fc12a425ced]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
I tracked this down via strace to a failure to fork (https://github.com/ceph/ceph/blob/cf1785881fda92187b4996a9d839100e8513b61a/src/client/fuse_ll.cc#L891):
17719 clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD, parent_tidptr=0x7fc11de17e50) = -1 ENOMEM (Cannot allocate memory)
(Not really unexpected to get that error!) This may be a situation where we want to retry or at least give a better error message.
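For illustration, here is a minimal sketch of what "retry or at least give a better error message" might look like. It assumes the remount helper process is started with a plain fork(); the real call site in fuse_ll.cc may use a different API. The names ForkResult, fork_with_retry, and describe_remount_failure are hypothetical and do not exist in Ceph. The fork function is injectable so the retry logic can be exercised without actually running out of memory.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <cstring>
#include <functional>
#include <string>
#include <thread>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper type (not part of Ceph): outcome of the fork attempts.
struct ForkResult {
  pid_t pid;     // > 0 in the parent, 0 in the child, -1 on failure
  int err;       // errno of the last failed attempt, 0 on success
  int attempts;  // number of attempts actually made
};

// Retry fork() with exponential backoff, but only when it fails with ENOMEM.
// Any other error is treated as permanent and returned immediately.
// fork_fn defaults to ::fork and can be replaced for testing.
ForkResult fork_with_retry(
    int max_attempts,
    std::chrono::milliseconds initial_delay,
    const std::function<pid_t()>& fork_fn = [] { return ::fork(); })
{
  assert(max_attempts > 0);
  ForkResult r{-1, 0, 0};
  auto delay = initial_delay;
  while (r.attempts < max_attempts) {
    ++r.attempts;
    errno = 0;
    r.pid = fork_fn();
    if (r.pid >= 0) {
      r.err = 0;
      return r;
    }
    r.err = errno;  // capture before any other call can overwrite it
    if (r.err != ENOMEM)
      break;
    if (r.attempts < max_attempts) {
      std::this_thread::sleep_for(delay);
      delay *= 2;  // back off in case memory pressure is transient
    }
  }
  return r;
}

// Build a message that names the failing call and its errno, instead of
// reporting only a bare "got error -1".
std::string describe_remount_failure(const ForkResult& r)
{
  return "remount (to trim kernel dentries) failed: fork: " +
         std::string(std::strerror(r.err)) + " (errno " +
         std::to_string(r.err) + ") after " +
         std::to_string(r.attempts) + " attempt(s)";
}
```

A caller would invoke fork_with_retry(3, std::chrono::milliseconds(100)) and, on failure, log describe_remount_failure(result) before deciding whether to assert. Whether retrying helps at all depends on the memory pressure being short-lived.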
Updated by Patrick Donnelly over 6 years ago
- Related to Bug #22254: client: give more descriptive error message for remount failures added
Updated by Patrick Donnelly about 6 years ago
- Subject changed from ceph-fuse does not handle ENOMEM during remount to ceph-fuse: does not handle ENOMEM during remount
- Status changed from New to Resolved
- Component(FS) Client added
I'm going to mark this as resolved: handling ENOMEM is generally a waste of time, since another operation will fail similarly (or with a segfault).