Bug #22986

hadoop-s3a jobs failing with EPERM

Added by Casey Bodley about 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
High
Assignee:
Target version:
-
% Done:

0%

Source:
Tags:
hadoop s3a
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The task modifies /etc/resolv.conf to set up dnsmasq, but something changed and the write to /etc/resolv.conf now fails with EPERM:

Command failed on ovh053 with status 1: 'sudo python -c \'import shutil, sys; shutil.copyfileobj(sys.stdin, file(sys.argv[1], "wb"))\' /etc/resolv.conf'

http://qa-proxy.ceph.com/teuthology/yuriw-2018-02-10_16:42:19-rgw-luminous-distro-basic-smithi/2180605/teuthology.log
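For reference, the failing one-liner copies stdin into the file named by its first argument. Below is a minimal Python 3 sketch of the same write (the original uses the Python 2 file() builtin). The function name write_stream_to is hypothetical and not part of teuthology. It returns the errno rather than assuming one, since this thread shows both EPERM and a "Permission denied" message from cp.

```python
import errno
import shutil
import sys


def write_stream_to(path, stream):
    """Copy a binary stream into `path`.

    Returns None on success, or the errno reported by the OS on failure
    (for example errno.EPERM or errno.EACCES when the target cannot be
    opened for writing).
    """
    try:
        with open(path, "wb") as dest:
            shutil.copyfileobj(stream, dest)
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Mirrors the one-liner: stdin is copied to the path given as argv[1].
    err = write_stream_to(sys.argv[1], sys.stdin.buffer)
    if err is not None:
        sys.exit("write failed: " + errno.errorcode.get(err, str(err)))
```

Run as root against a file that cannot be opened for writing, this would be expected to exit with the symbolic errno name instead of a bare traceback.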


Related issues

Related to rgw - Bug #23454: rgw/s3atests failure: fs contract testBlockReadZeroByteFile/testSeekZeroByteFile/testOpenReadZeroByteFile Won't Fix 03/23/2018

History

#1 Updated by Vasu Kulkarni about 6 years ago

  • Assignee set to Vasu Kulkarni

#2 Updated by Vasu Kulkarni about 6 years ago

  • Priority changed from Normal to High

#3 Updated by Vasu Kulkarni about 6 years ago

Running on 7.3 to see why copying to /etc/resolv.conf is failing for 7.4

http://pulpito.ceph.com/vasu-2018-02-22_03:54:33-rgw-master-distro-basic-ovh/

#4 Updated by Yehuda Sadeh about 6 years ago

  • Status changed from New to In Progress

#6 Updated by Casey Bodley about 6 years ago

  • Status changed from In Progress to 7

#7 Updated by Vasu Kulkarni about 6 years ago

Not sure why this started failing..

http://pulpito.ceph.com/vasu-2018-03-02_02:00:23-rgw-luminous-distro-basic-ovh/

Command works fine when run manually: 'sudo radosgw-admin user create --uid s3a --display-name=s3a cephtests --access-key=EGAQRD2ULOIFKFSKCT4F --secret-key=zi816w1vZKfaSM85Cl0BxXTwSLyN7zB4RbTswrGb --email='

My only suspicion is that the keys on that node might be incorrect.

On another run I hit a cp failure, which is very odd, since the same command had worked for a couple of runs before:

http://pulpito.ceph.com/vasu-2018-03-02_02:01:27-rgw-luminous-distro-basic-ovh/
'sudo cp /home/ubuntu/cephtest/resolv.bak /etc/resolv.conf'

Still looking..

#8 Updated by Vasu Kulkarni about 6 years ago

CentOS doesn't allow overwriting resolv.conf; I guess something changed in the network manager setup. I need to check with David Galloway on that, but on Ubuntu it's fine.

#9 Updated by Vasu Kulkarni about 6 years ago

  • Assignee changed from Vasu Kulkarni to David Galloway

David,

Do you know why overwriting resolv.conf fails on CentOS but works on Ubuntu?

http://pulpito.ceph.com/vasu-2018-03-02_02:01:27-rgw-luminous-distro-basic-ovh/
'sudo cp /home/ubuntu/cephtest/resolv.bak /etc/resolv.conf'

#10 Updated by David Galloway about 6 years ago

  • Assignee changed from David Galloway to Vasu Kulkarni

2018-03-02T03:03:30.596 INFO:teuthology.orchestra.run.ovh052.stderr:cp: cannot create regular file ‘/etc/resolv.conf’: Permission denied

[dgalloway@ovh009 ~]$ lsattr /etc/resolv.conf
----i----------- /etc/resolv.conf

[root@ovh009 ~]# cat /etc/resolv.conf 
# Immutable copy of resolv.conf for Sepia use only
search front.sepia.ceph.com
domain front.sepia.ceph.com
nameserver 8.8.8.8

We made /etc/resolv.conf immutable on OVH nodes because of all the DNS failures their OpenStack instances have when using their own nameservers. Making it immutable is hacky, but it is easier than fighting with all the daemons that try to update it (cloud-init, NetworkManager, etc.).

You could have the test run 'sudo chattr -i /etc/resolv.conf' before copying the resolv.tmp file into place.
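A rough sketch of that suggestion in Python follows. The helper names (is_immutable, overwrite_commands, run_all) are hypothetical and are not taken from teuthology or from the eventual fix. Putting the immutable flag back with 'chattr +i' afterwards is an assumption on top of the suggestion above, made so the node is left the way the lab set it up.

```python
import subprocess


def is_immutable(lsattr_line):
    """Return True if an lsattr output line has the 'i' (immutable) flag.

    Only the first whitespace-separated field (the flags) is inspected,
    so an 'i' in the file name does not matter.
    """
    parts = lsattr_line.split(None, 1)
    return bool(parts) and "i" in parts[0]


def overwrite_commands(src, dest="/etc/resolv.conf", restore=True):
    """Build the command sequence: clear the flag, copy, optionally re-set it."""
    cmds = [
        ["sudo", "chattr", "-i", dest],  # drop the immutable flag
        ["sudo", "cp", src, dest],       # the copy that was failing
    ]
    if restore:
        # Assumption: restore the flag so the node keeps its lab configuration.
        cmds.append(["sudo", "chattr", "+i", dest])
    return cmds


def run_all(cmds, runner=subprocess.check_call):
    """Run each command in order; `runner` can be swapped out for a dry run."""
    for cmd in cmds:
        runner(cmd)
```

For example, run_all(overwrite_commands("/home/ubuntu/cephtest/resolv.bak"), runner=print) prints the three commands without executing anything.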

#11 Updated by Vasu Kulkarni about 6 years ago

Thanks David, that helps.

#12 Updated by Casey Bodley almost 6 years ago

  • Related to Bug #23454: rgw/s3atests failure: fs contract testBlockReadZeroByteFile/testSeekZeroByteFile/testOpenReadZeroByteFile added

#13 Updated by Casey Bodley almost 6 years ago

  • Related to Bug #23531: s3a/2.8.0 fs.contract failure Seek/Rename/ComplexDirActions/RecursiveRootListing/EmptyRootDirNonRecursive added

#14 Updated by Yuri Weinstein almost 6 years ago

Vasu Kulkarni wrote:

Testing with: https://github.com/ceph/ceph/pull/20678

merged

#15 Updated by Casey Bodley almost 6 years ago

  • Status changed from 7 to Resolved

#16 Updated by Vasu Kulkarni almost 6 years ago

  • Related to deleted (Bug #23531: s3a/2.8.0 fs.contract failure Seek/Rename/ComplexDirActions/RecursiveRootListing/EmptyRootDirNonRecursive)
