Ceph : Issues
https://tracker.ceph.com/
2019-05-29T12:15:02Z

ceph-volume - Bug #40062 (New): check name of branch to set number of osds
https://tracker.ceph.com/issues/40062
2019-05-29T12:15:02Z - Alfredo Deza (adeza@redhat.com)
<p>This line in conftest.py:</p>
<pre>
if ceph_dev_branch in ['luminous', 'mimic']:
num_osd_ports = 2
</pre>
<p>depends on the exact name of the branch, but it should be less strict and use something like:</p>
<pre>
if 'luminous' in ceph_dev_branch or 'mimic' in ceph_dev_branch:
num_osd_ports = 2
</pre>
<p>Sometimes tests are triggered from a specially named branch; Yuri does this to test his integration branches, for example.</p>
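<p>A minimal sketch of the suggested relaxation (not the committed fix): match the release name anywhere in the branch name, so that a branch such as "wip-yuri-luminous-testing" would still select two OSD ports.</p>
<pre>
# Sketch only: the releases that use 2 OSD ports are taken from the snippet
# above; everything else about conftest.py is assumed.
if any(release in ceph_dev_branch for release in ('luminous', 'mimic')):
    num_osd_ports = 2
</pre>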

ceph-volume - Bug #24793 (New): ceph-volume fails to zap a device (wipefs problem)
https://tracker.ceph.com/issues/24793
2018-07-06T13:35:37Z - Sébastien Han (seb@redhat.com)
<pre>
[root@ceph-osd0 /]# ps fauaxf
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
ceph 1616 0.1 3.8 845844 19340 ? Ssl 12:31 0:06 ceph-osd -i 1 --setuser ceph --setgroup disk
root 1229 0.0 0.3 11820 1692 pts/4 Ss 12:04 0:00 bash
root 3355 0.0 0.3 51704 1700 pts/4 R+ 13:31 0:00 \_ ps fauaxf
root 836 0.0 0.3 11820 1544 pts/3 Ss+ 08:53 0:00 bash
root 5 0.0 0.1 11820 556 pts/1 Ss+ Jul05 0:00 bash
root 1 0.0 0.0 4360 24 ? Ss Jul05 0:00 sleep 365d
[root@ceph-osd0 /]# kill 1616
[root@ceph-osd0 /]# umount /var/lib/ceph/osd/ceph-1
[root@ceph-osd0 /]# ceph-volume lvm zap /dev/sdb
--> Zapping: /dev/sdb
Running command: /usr/sbin/wipefs --all /dev/sdb
stderr: wipefs: error: /dev/sdb: probing initialization failed: Device or resource busy
--> RuntimeError: command returned non-zero exit status: 1
[root@ceph-osd0 /]# df
Filesystem 1K-blocks Used Available Use% Mounted on
overlay 39269648 1541404 37728244 4% /
tmpfs 250012 0 250012 0% /sys/fs/cgroup
devtmpfs 241300 0 241300 0% /dev
shm 65536 0 65536 0% /dev/shm
/dev/mapper/VolGroup00-LogVol00 39269648 1541404 37728244 4% /etc/ceph
tmpfs 250012 29424 220588 12% /run/lvm/lvmetad.socket
[root@ceph-osd0 /]# ceph-volume lvm zap /dev/sdb
--> Zapping: /dev/sdb
Running command: /usr/sbin/wipefs --all /dev/sdb
stderr: wipefs: error: /dev/sdb: probing initialization failed: Device or resource busy
--> RuntimeError: command returned non-zero exit status: 1
[root@ceph-osd0 /]# ps faux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1229 0.0 0.3 11820 1700 pts/4 Ss 12:04 0:00 bash
root 3381 0.0 0.3 51704 1688 pts/4 R+ 13:32 0:00 \_ ps faux
root 836 0.0 0.3 11820 1544 pts/3 Ss+ 08:53 0:00 bash
root 5 0.0 0.1 11820 556 pts/1 Ss+ Jul05 0:00 bash
root 1 0.0 0.0 4360 24 ? Ss Jul05 0:00 sleep 365d
[root@ceph-osd0 /]# fuser /dev/sdb
</pre>
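<p>After the OSD process is killed and the filesystem unmounted, wipefs still reports "Device or resource busy". A small diagnostic sketch like the following (hypothetical, not part of ceph-volume; the device name is hard-coded for illustration) can help spot device-mapper/LVM holders or leftover mounts that ps and fuser do not make obvious:</p>
<pre>
# List device-mapper/LVM holders and any remaining mounts for a block device.
import os

def holders(dev='sdb'):
    holder_dir = '/sys/block/{}/holders'.format(dev)
    held_by = os.listdir(holder_dir) if os.path.isdir(holder_dir) else []
    with open('/proc/mounts') as mounts:
        mounted = [line.split()[1] for line in mounts
                   if line.startswith('/dev/{}'.format(dev))]
    return held_by, mounted

print(holders('sdb'))
</pre>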

teuthology - Bug #16244 (New): koji/koji_task require koji package which isn't in standard RHEL
https://tracker.ceph.com/issues/16244
2016-06-12T14:51:47Z - Ilya Dryomov
<p>teuthology/task/kernel.py:</p>
<pre>
# FIXME: this install should probably happen somewhere else
# but I'm not sure where, so we'll leave it here for now.
install_package('koji', role_remote)
</pre>
<p>This install needs to be wrapped in an enable/disable of EPEL 7, or something along those lines.</p>
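<p>A rough sketch of that wrapping, meant to slot in where the snippet above lives (assumptions: the EPEL repo id is "epel", yum-config-manager is available on the host, and role_remote.run()/install_package() behave as elsewhere in teuthology):</p>
<pre>
# Enable EPEL only for the koji install, then disable it again.
role_remote.run(args=['sudo', 'yum-config-manager', '--enable', 'epel'])
try:
    install_package('koji', role_remote)
finally:
    role_remote.run(args=['sudo', 'yum-config-manager', '--disable', 'epel'])
</pre>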

teuthology - Bug #14240 (In Progress): teuthology-suite should check presence/translate to sha1 b...
https://tracker.ceph.com/issues/14240
2016-01-05T19:38:34Z - Dan Mick (dmick@redhat.com)
<p>teuthology-suite does several things based on the 'default distro/version', including at least:</p>
<p>1) translating branch/tag to sha1<br />2) checking for presence of a requested branch/tag/sha1</p>
<p>This leads to confusing situations where, say, the centos gitbuilder thinks 'testing' is sha1a but the ubuntu gitbuilder thinks it's sha1b, and so an otherwise-proper job fails or installs the wrong version.</p>
<p>There's no reason the gitbuilders should have to be in sync.</p>
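<p>One way to avoid the mismatch (a sketch of the idea, not the current teuthology-suite code; the repo URL and helper name are assumptions) is to resolve the branch to a single sha1 once, from the canonical repo, and hand that sha1 to every gitbuilder query:</p>
<pre>
# Resolve a branch name to one sha1 so all distro gitbuilders are asked
# about the same commit.
import subprocess

def branch_to_sha1(branch, repo='https://github.com/ceph/ceph.git'):
    out = subprocess.check_output(
        ['git', 'ls-remote', repo, 'refs/heads/' + branch])
    return out.split()[0].decode() if out else None
</pre>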

teuthology - Bug #13159 (In Progress): move teuthology docs hosting to docs.ceph.com
https://tracker.ceph.com/issues/13159
2015-09-18T14:37:19Z - Andrew Schoen (aschoen@redhat.com)
<p>The teuthology docs used to live at <a href="http://ceph.com/teuthology/docs">http://ceph.com/teuthology/docs</a>.</p>
<p>We've since switched all of our docs over to docs.ceph.com.</p>
<p>We need to change the teuthology-docs job in ceph-build to build and push the teuthology docs to the new location.</p>

teuthology - Bug #13012 (New): Could not retrieve mirrorlist
https://tracker.ceph.com/issues/13012
2015-09-09T15:50:39Z - Andrew Schoen (aschoen@redhat.com)
<p>In <a href="https://tracker.ceph.com/issues/12778">#12778</a> (epel repo issues) we increased the yum timeout and it seems to have helped, but this looks like a similar if not identical failure. Do we increase the timeout further, or is this an acceptable failure, given it's the only one I've seen since closing <a href="https://tracker.ceph.com/issues/12778">#12778</a>?</p>
<p>Log: <a href="http://qa-proxy.ceph.com/teuthology/ubuntu-2015-09-09_07:03:51-fs-greg-fs-testing-testing-basic-multi/1047266/teuthology.log">http://qa-proxy.ceph.com/teuthology/ubuntu-2015-09-09_07:03:51-fs-greg-fs-testing-testing-basic-multi/1047266/teuthology.log</a></p>
<pre>
2015-09-09T07:17:28.467 INFO:teuthology.orchestra.run.burnupi13.stdout:Processing triggers for libc-bin (2.19-0ubuntu6.1) ...
2015-09-09T07:17:28.761 DEBUG:teuthology.parallel:result is None
2015-09-09T07:18:59.164 INFO:teuthology.orchestra.run.plana53.stderr:
2015-09-09T07:18:59.164 INFO:teuthology.orchestra.run.plana53.stderr:
2015-09-09T07:18:59.164 INFO:teuthology.orchestra.run.plana53.stderr: One of the configured repositories failed (Unknown),
2015-09-09T07:18:59.164 INFO:teuthology.orchestra.run.plana53.stderr: and yum doesn't have enough cached data to continue. At this point the only
2015-09-09T07:18:59.165 INFO:teuthology.orchestra.run.plana53.stderr: safe thing yum can do is fail. There are a few ways to work "fix" this:
2015-09-09T07:18:59.165 INFO:teuthology.orchestra.run.plana53.stderr:
2015-09-09T07:18:59.165 INFO:teuthology.orchestra.run.plana53.stderr: 1. Contact the upstream for the repository and get them to fix the problem.
2015-09-09T07:18:59.165 INFO:teuthology.orchestra.run.plana53.stderr:
2015-09-09T07:18:59.165 INFO:teuthology.orchestra.run.plana53.stderr: 2. Reconfigure the baseurl/etc. for the repository, to point to a working
2015-09-09T07:18:59.165 INFO:teuthology.orchestra.run.plana53.stderr: upstream. This is most often useful if you are using a newer
2015-09-09T07:18:59.165 INFO:teuthology.orchestra.run.plana53.stderr: distribution release than is supported by the repository (and the
2015-09-09T07:18:59.165 INFO:teuthology.orchestra.run.plana53.stderr: packages for the previous distribution release still work).
2015-09-09T07:18:59.166 INFO:teuthology.orchestra.run.plana53.stderr:
2015-09-09T07:18:59.166 INFO:teuthology.orchestra.run.plana53.stderr: 3. Disable the repository, so yum won't use it by default. Yum will then
2015-09-09T07:18:59.166 INFO:teuthology.orchestra.run.plana53.stderr: just ignore the repository until you permanently enable it again or use
2015-09-09T07:18:59.166 INFO:teuthology.orchestra.run.plana53.stderr: --enablerepo for temporary usage:
2015-09-09T07:18:59.166 INFO:teuthology.orchestra.run.plana53.stderr:
2015-09-09T07:18:59.166 INFO:teuthology.orchestra.run.plana53.stderr: yum-config-manager --disable <repoid>
2015-09-09T07:18:59.166 INFO:teuthology.orchestra.run.plana53.stderr:
2015-09-09T07:18:59.167 INFO:teuthology.orchestra.run.plana53.stderr: 4. Configure the failing repository to be skipped, if it is unavailable.
2015-09-09T07:18:59.167 INFO:teuthology.orchestra.run.plana53.stderr: Note that yum will try to contact the repo. when it runs most commands,
2015-09-09T07:18:59.167 INFO:teuthology.orchestra.run.plana53.stderr: so will have to try and fail each time (and thus. yum will be be much
2015-09-09T07:18:59.167 INFO:teuthology.orchestra.run.plana53.stderr: slower). If it is a very temporary problem though, this is often a nice
2015-09-09T07:18:59.168 INFO:teuthology.orchestra.run.plana53.stderr: compromise:
2015-09-09T07:18:59.168 INFO:teuthology.orchestra.run.plana53.stderr:
2015-09-09T07:18:59.168 INFO:teuthology.orchestra.run.plana53.stderr: yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true
2015-09-09T07:18:59.168 INFO:teuthology.orchestra.run.plana53.stderr:
2015-09-09T07:18:59.168 INFO:teuthology.orchestra.run.plana53.stderr:Cannot find a valid baseurl for repo: updates/7/x86_64
2015-09-09T07:18:59.169 INFO:teuthology.orchestra.run.plana53.stdout:Could not retrieve mirrorlist http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=updates&infra=stock error was
2015-09-09T07:18:59.169 INFO:teuthology.orchestra.run.plana53.stdout:14: curl#56 - "Recv failure: Connection reset by peer"
2015-09-09T07:18:59.190 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 82, in __exit__
for result in self:
File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 101, in next
resurrect_traceback(result)
File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 19, in capture_traceback
return func(*args, **kwargs)
File "/home/teuthworker/src/teuthology_master/teuthology/task/install.py", line 281, in _update_rpm_package_list_and_install
remote.run(args=['sudo', 'yum', 'install', cpack, '-y'])
File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/remote.py", line 156, in run
r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 378, in run
r.wait()
File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 114, in wait
label=self.label)
CommandFailedError: Command failed on plana53 with status 1: 'sudo yum install ceph-debuginfo -y'
</pre>
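<p>One possible mitigation (a sketch only, not what the install task currently does; the retry policy and exception handling are assumptions): retry the yum install a few times when a mirror resets the connection, instead of, or in addition to, raising the timeout again:</p>
<pre>
# Retry transient repo failures before giving up. In teuthology the failure
# surfaces as CommandFailedError from remote.run(), as in the traceback above.
import time

def yum_install_with_retries(remote, package, attempts=3, delay=30):
    for attempt in range(1, attempts + 1):
        try:
            remote.run(args=['sudo', 'yum', 'install', package, '-y'])
            return
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay)
</pre>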

teuthology - Bug #12669 (Fix Under Review): remote.os.codename returns empty string on debian
https://tracker.ceph.com/issues/12669
2015-08-11T17:14:45Z - Andrew Schoen (aschoen@redhat.com)
<p>I think this is probably a bug or an oversight. We noticed it because of <a href="https://tracker.ceph.com/issues/12668">#12668</a> (the install task fails to correctly check gitbuilder for Debian).</p>
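<p>For illustration, one way a codename could be derived on Debian-family hosts (a hypothetical fallback, not the actual fix under review; VERSION_CODENAME is only present in newer /etc/os-release files):</p>
<pre>
# Parse a codename out of os-release text; returns '' when it is absent,
# which is exactly the symptom reported here.
def debian_codename(os_release_text):
    for line in os_release_text.splitlines():
        if line.startswith('VERSION_CODENAME='):
            return line.split('=', 1)[1].strip().strip('"')
    return ''
</pre>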

teuthology - Fix #12349 (Need More Info): revisit packages/common_packages
https://tracker.ceph.com/issues/12349
2015-07-16T04:27:06Z - Dan Mick (dmick@redhat.com)
<p>While reviewing the docs for, and trying to understand, packages vs. common_packages, I wondered whether it would be possible to use the same variable name, "packages", at various levels of the tree and have the more-specific levels append to the list. That way each level defines which packages it knows to be necessary, without the artificial split between "high-level" packages in common_packages and "low-level" ones in packages (and without the implication that there are only two levels of owners of package lists).</p>
<p>Something like "packages: {{packages|list}} + {{additional_packages|list}}" <br />was suggested. It's not clear if Ansible allows the RHS to refer to the prior value of a variable in order to reassign it to the LHS.</p> teuthology - Fix #12348 (New): clean up modify_fstabhttps://tracker.ceph.com/issues/123482015-07-16T04:20:38ZDan Mickdmick@redhat.com
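<p>The list-concatenation part of that expression is plain Jinja2 and easy to check in isolation (a sketch using two differently named variables; whether Ansible permits a variable to reference its own prior value is the open question above and is not settled by this):</p>
<pre>
# Requires the jinja2 package; the variable names and package lists here are
# made up for illustration.
from jinja2 import Template

t = Template("{{ (packages | list) + (additional_packages | list) }}")
print(t.render(packages=['wget', 'git'], additional_packages=['koji']))
# prints: ['wget', 'git', 'koji']
</pre>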

teuthology - Fix #12348 (New): clean up modify_fstab
https://tracker.ceph.com/issues/12348
2015-07-16T04:20:38Z - Dan Mick (dmick@redhat.com)
<p>I don't see any reason why modify_fstab can't run on any machine. True, it's not <strong>necessary</strong> to preserve fs mount options on VMs that will be destroyed before they are rebooted, but someone might reboot a VM without destroying it, and editing fstab should not be harmful, so the special case is unnecessary.</p>
<p>Figure out why this was deemed necessary and justify it better or remove it.</p>
<p>Other outstanding weirdnesses around this code:</p>
<p>1) it's crazy Perl; if the only reason we need the Perl is that we need the mountpoint and the UUID, as the comment in roles/testnode/tasks/apt_systems.yml claims, those should be available through facts (see the sketch below).</p>
<p>2) why is this for apt systems only? If it's necessary, surely it's necessary regardless of the package type of the system?</p>
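<p>On the facts point: the ansible_mounts fact gathered by the setup module already carries the mount point, device, fstype, options and uuid per filesystem, so no Perl should be required. A small illustration (the sample data below is made up):</p>
<pre>
# Walk ansible_mounts-style data and emit fstab-like fields.
facts = {'ansible_mounts': [
    {'mount': '/', 'device': '/dev/sda1', 'fstype': 'ext4',
     'uuid': 'abcd-1234', 'options': 'rw,relatime'},
]}

for m in facts['ansible_mounts']:
    print('UUID={uuid} {mount} {fstype} {options}'.format(**m))
</pre>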

teuthology - Fix #12347 (New): is lab_domain necessary?
https://tracker.ceph.com/issues/12347
2015-07-16T04:16:56Z - Dan Mick (dmick@redhat.com)
<p>testnode has a lab_domain var that controls whether the domain is stripped from the hostname, and that seems to be relevant only to Red Hat systems. None of that makes sense, which means either we don't understand something, or it really is that weird and should be commented so that future reviewers don't have the same problem of being unable to remember why this exists.</p>

teuthology - Fix #12346 (New): revisit start_rpcbind: is it necessary to configure by hand?
https://tracker.ceph.com/issues/12346
2015-07-16T04:13:59Z - Dan Mick (dmick@redhat.com)
<pre>
# some distros need to start rpcbind before
# trying to use nfs while others don't.
start_rpcbind: true
</pre>
<p>It seems, on its face, that it should be possible to start rpcbind only if it's not already running and avoid having to configure this per-distro. andrewschoen couldn't remember the details when documenting the var, so this task is to determine whether the var is really necessary, or whether we can sense this condition automatically.</p>
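<p>Sensing it automatically could be as simple as the following sketch (systemd hosts and the service name "rpcbind" are assumptions; this does not reflect what the testnode role currently does):</p>
<pre>
# Start rpcbind only when it is not already active.
import subprocess

def ensure_rpcbind():
    active = subprocess.call(
        ['systemctl', 'is-active', '--quiet', 'rpcbind']) == 0
    if not active:
        subprocess.check_call(['sudo', 'systemctl', 'start', 'rpcbind'])
</pre>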

teuthology - Bug #11248 (New): ceph-deploy suite needs to run ceph-deploy before install
https://tracker.ceph.com/issues/11248
2015-03-26T21:53:49Z - Andrew Schoen (aschoen@redhat.com)
<p>We are now verifying that the installed version of ceph is what was asked for in the yaml. This check happens in the install task. However, when install is used with ceph-deploy, the install task does not install ceph; ceph-deploy does.</p>
<p>We need to change the yaml fragments in the tasks to run ceph-deploy before install.</p>
<p>This bug was noticed on this job:</p>
<p><a class="external" href="http://qa-proxy.ceph.com/teuthology/teuthology-2015-03-25_01:10:02-ceph-deploy-firefly-distro-basic-vps/820583/teuthology.log">http://qa-proxy.ceph.com/teuthology/teuthology-2015-03-25_01:10:02-ceph-deploy-firefly-distro-basic-vps/820583/teuthology.log</a></p> teuthology - Bug #11215 (New): The tgt task should install the packages it needs.https://tracker.ceph.com/issues/112152015-03-23T23:54:28ZAndrew Schoenaschoen@redhat.com

teuthology - Bug #11215 (New): The tgt task should install the packages it needs.
https://tracker.ceph.com/issues/11215
2015-03-23T23:54:28Z - Andrew Schoen (aschoen@redhat.com)
<p>We'd like to stop installing scsi-target-utils and iscsi-initiator-utils on rhel 6.x because scsi-target-utils requires librbd1 and librados. This makes our ansible playbooks non-idempotent: whenever we install scsi-target-utils it pulls in librbd1 and librados, yet the same playbook also ensures librbd1 and librados are absent. Uninstalling them removes scsi-target-utils, which then makes the playbook reinstall scsi-target-utils, causing a loop that results in 2 changed plays every time we run it.</p>
<p>See this commit to see when they were added to ceph-qa-chef:</p>
<p><a class="external" href="https://github.com/ceph/ceph-qa-chef/commit/9f7c34275e47e314e39a8b199829e019e807e297">https://github.com/ceph/ceph-qa-chef/commit/9f7c34275e47e314e39a8b199829e019e807e297</a></p> teuthology - Bug #9006 (New): fix _get_config_value_for_remote in install task https://tracker.ceph.com/issues/90062014-08-04T18:15:08ZTamilarasi muthamizhantamil.muthamizhan@inktank.com
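<p>A sketch of the requested direction (hypothetical; the helper name and the use of remote.run() with yum are assumptions based on how other teuthology tasks run commands): have the tgt task install its own dependencies when it starts, instead of baking them into the base image via ansible:</p>
<pre>
# Install the packages the tgt task needs on the remote it runs against.
TGT_PACKAGES = ['scsi-target-utils', 'iscsi-initiator-utils']

def install_tgt_deps(remote):
    for pkg in TGT_PACKAGES:
        remote.run(args=['sudo', 'yum', 'install', '-y', pkg])
</pre>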

teuthology - Bug #9006 (New): fix _get_config_value_for_remote in install task
https://tracker.ceph.com/issues/9006
2014-08-04T18:15:08Z - Tamilarasi Muthamizhan (tamil.muthamizhan@inktank.com)
<p>In the install task, the function _get_config_value_for_remote() claims to pick the value to install based on the role or on 'all', but it doesn't work as expected.</p>
<p>Maybe we never hit this issue before because nowhere in the teuthology tests do we use this combination.</p>
<pre>
def _get_config_value_for_remote(ctx, remote, config, key):
"""
Look through config, and attempt to determine the "best" value to use for a
given key. For example, given:
config = {
'all':
{'branch': 'master'},
'branch': 'next'
}
_get_config_value_for_remote(ctx, remote, config, 'branch')
would return 'master'.
:param ctx: the argparse.Namespace object
:param remote: the teuthology.orchestra.remote.Remote object
:param config: the config dict
:param key: the name of the value to retrieve
"""
roles = ctx.cluster.remotes[remote]
if 'all' in config:
return config['all'].get(key)
elif roles:
for role in roles:
if role in config and key in config[role]:
return config[role].get(key)
return config.get(key)
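
# Illustration (not part of install.py): exercise the function above with a
# faked ctx to show the documented intent that an 'all' section applies to
# every remote. The hostnames echo the sample config.yaml below.
class FakeCluster(object):
    remotes = {'mira023': ['mon.a', 'osd.0'], 'mira042': ['client.0']}

class FakeCtx(object):
    cluster = FakeCluster()

demo_config = {'all': {'branch': 'dumpling'}}
for host in FakeCtx.cluster.remotes:
    print(host, _get_config_value_for_remote(FakeCtx, host, demo_config, 'branch'))
# expected: both hosts report 'dumpling'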
</pre>
<p>Sample config.yaml I used to test:</p>
<pre>
roles:
- [mon.a, mds.a, osd.0, osd.1, osd.2]
- [client.0]
targets:
ubuntu@mira023.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCn68M4okUoKZ+8Mz6Izfh+v7IQdmoXLvO0AA2uR8dBlZobVvJ9NXvXsoIP3SkweWyUf3RIeglkCMIdZN/2Wp2xorN925RFOM9v3Iyr+FMlIZGrQZSVQUhtL7vQpwrQ0FUygk54LLgmPt8uvi0u8p+pCUKDl5bUSt+IoEbwoRK4uNToL3+WUs/O/i51/OT62X3btZgYcZD/jbhQiicHD1U9Tm8DmBxn95spn6SNTDuCoviuqxmrG+CoNPEPxAi4FxAX4vZbiOxP6Kb6UuP5XMVw1lxeLeuwr9QT8Kzw9SIPwhdDZm4bWZmbu5ZCg5TbcabBhoxNc67lIiOfQBKj60Qf
ubuntu@mira042.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQD5mVeceNoTxd9OiOWol4C4XNAXgmBn6ddzHTPr12cZWOWuoBXtDHM3wm+23v55LSXkglU6/zVAJPxipai0CzWgfjC/0cIo7eBps9KvgnSYKEyOT47GlLIWOl84dWqWbDq1lhU4MMIk+7keq5Q6LDf/Xo8+I0NGxntqH5Q8jk32Kkv4ms6GuL7GlMWr8BrfQ5V3QhXeKIV7EQiMSpMT+N4XqO0bEN3v7u0a0jckGuxnrsyCLHfWfJWs28DGWos6VnFiXV1k0rXTQYVXNzlPjFl2KqPZo9Pp+IosHakvsy60XAa4pZYGZd7dvBTOcijZ8aOvqq3rLQxf6ZBuTW9nCGet
tasks:
- chef:
- install:
all:
branch: dumpling
</pre>