Bug #14805
closed
Hadoop tests failing with EPERM
Description
Most recent instance:
http://pulpito.ceph.com/teuthology-2016-02-17_18:12:06-hadoop-jewel---basic-mira/
Here's the earliest instance I could find:
http://pulpito.ovh.sepia.ceph.com:8081/teuthology-2016-01-16_22:12:02-hadoop-master---basic-openstack/
Greg & Noah briefly discussed this on the ceph qa list for this run:
http://pulpito.ovh.sepia.ceph.com:8081/teuthology-2016-01-18_22:12:01-hadoop-master---basic-openstack/
Failing on jewel and master consistently, but mixed in with infrastructure issues.
Updated by Zheng Yan about 8 years ago
- Subject changed from Hadoop tests failing with EACCESS to Hadoop tests failing with EPERM
Updated by Zheng Yan about 8 years ago
- Status changed from New to Fix Under Review
I'm having trouble running the test on my local machine, so let's try disabling client_permissions.
Updated by Zheng Yan about 8 years ago
Updated by Greg Farnum about 8 years ago
- Assignee set to Zheng Yan
Do you have any idea what about the client permission checking is busting Hadoop? We want to fix it properly (or at least band-aid it automatically :p), not just swap the qa suite runs.
Updated by Zheng Yan about 8 years ago
The old libcephfs only had a permission check for open. Now it has full permission checks (open, lookup, setattr, ...).
Updated by John Spray about 8 years ago
Maybe this has the same issue as the python libcephfs tests did: they were creating files with mode 0, which used to work.
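To make the mode-0 case concrete, here is a minimal Python sketch of a POSIX-style mode check. It is a simplified model for illustration only, not the libcephfs Client code: the function name `may_access` and its parameters are hypothetical, and treating uid 0 as an unconditional bypass is an assumption that glosses over details such as execute bits and capabilities.

```python
def may_access(mode, owner_uid, owner_gid, uid, gids, want):
    """Simplified POSIX-style permission check (hypothetical helper).

    mode      -- permission bits of the file, e.g. 0o644
    owner_uid -- uid that owns the file
    owner_gid -- gid that owns the file
    uid, gids -- identity of the caller
    want      -- requested access, a string drawn from "rwx"
    """
    # Assumption: root bypasses the mode bits entirely.
    if uid == 0:
        return True

    # Pick the owner, group, or other permission triplet.
    if uid == owner_uid:
        shift = 6
    elif owner_gid in gids:
        shift = 3
    else:
        shift = 0
    granted = (mode >> shift) & 0o7

    # Translate the requested access into permission bits.
    bits = {"r": 4, "w": 2, "x": 1}
    needed = 0
    for ch in set(want):
        needed |= bits[ch]

    return (granted & needed) == needed


# A file created with mode 0 is unreadable even by its non-root owner,
# while a root caller still passes under this model.
owner_can_read = may_access(0o000, 1000, 1000, 1000, [1000], "r")
root_can_read = may_access(0o000, 1000, 1000, 0, [0], "r")
```

Under this model, a client that only checked permissions on open could still create and manipulate a mode-0 file through other calls, whereas a client that checks every operation denies a non-root caller.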
Updated by Greg Farnum about 8 years ago
Wait, can you expand on that John? I wasn't really looking at the python tests, although I know it involved root ownership — didn't you just start running them as sudo?
I think that should generally be VFS-controlled, not something in our Client environment (I mean, I know we're checking mode and uid now, but the VFS is also gating on those first, so if we're disagreeing it's probably a bug?).
Updated by John Spray about 8 years ago
Greg Farnum wrote:
> Wait, can you expand on that John? I wasn't really looking at the python tests, although I know it involved root ownership — didn't you just start running them as sudo?
Yeah, it was a s/nosetests/sudo nosetests/
> I think that should generally be VFS-controlled, not something in our Client environment (I mean, I know we're checking mode and uid now, but the VFS is also gating on those first, so if we're disagreeing it's probably a bug?).
Right, but there's no VFS in the libcephfs python tests or in hadoop.
On this subject, the C libcephfs API has the ll functions (used by ganesha) that enable it to pass through UIDs. But the python API and the 'normal' C API don't have that, and just read the UID from the environment. I'm thinking we maybe want to add uid/gid to the ceph_create function so that users are explicitly picking a UID (since getting it from the environment was not enforced anyway).
Updated by Greg Farnum about 8 years ago
d'oh, right. Okay, I get the problem now. I've run this through a couple of times in my latest integration branch, btw, and it seems fine (although we're still hitting a failure in hadoop sometimes; need to dig into that).
Updated by Greg Farnum about 8 years ago
- Status changed from Fix Under Review to Resolved
Updated by Patrick Donnelly about 5 years ago
- Category deleted (48)
- Labels (FS) Java/Hadoop added