Activity
From 07/03/2016 to 08/01/2016
08/01/2016
- 10:54 PM Bug #16826: SSH Error: data could not be sent to the remote host. Make sure this host can be reac...
- https://github.com/ceph/ceph-cm-ansible/pull/275
- 06:29 PM Bug #16826: SSH Error: data could not be sent to the remote host. Make sure this host can be reac...
- Potentially relevant upstream issues:
https://github.com/ansible/ansible/issues/13876
https://github.com/ansible/an...
- 06:28 PM Bug #16826: SSH Error: data could not be sent to the remote host. Make sure this host can be reac...
- Attempts at getting ansible to give us more info:
https://github.com/ceph/teuthology/pull/919
- 05:24 PM Bug #16859 (Resolved): sepia access
- Additional pubkey has been added and pushed to teuthology.front. Any test runs created after this point will automat...
- 05:22 PM Feature #16858 (Need More Info): Requesting Lab Access
- Hi Ryne,
You should have access to the Sepia lab now. Please verify you're able to connect to the vpn and ssh ryn...
07/31/2016
- 01:59 AM Bug #16875 (Closed): apama001 load spikes and OSD 65 dies
- Output from dmesg when issue occurs...
07/29/2016
- 08:19 PM Bug #16860: 'sudo yum install ceph-radosgw -y' fails jobs in rados run
- See suspected https://github.com/ceph/ceph/pull/7338
- 07:55 PM Bug #16860 (Closed): 'sudo yum install ceph-radosgw -y' fails jobs in rados run
- Run: http://pulpito.ceph.com/yuriw-2016-07-29_08:07:15-rados-wip-yuri-testing2_2016_7_29-distro-basic-smithi/
Jobs: ...
- 07:25 PM Bug #16859 (Resolved): sepia access
- (Original request at http://tracker.ceph.com/issues/16031)
Please add (or replace, only if necessary please) this ...
- 05:03 PM Feature #16858 (Resolved): Requesting Lab Access
- 1. only requesting access to schedule jobs (and view their results)
2. username : ryneli@redhat
3. public SSH key:
...
- 03:21 AM Cleanup #16825: vpm061 has odd downburst error
- Running a loop to visit all 25 vmhosts and refresh all defined pools.
- 03:12 AM Cleanup #16825: vpm061 has odd downburst error
- Yuri Weinstein wrote:
> See more in http://pulpito.ceph.com/teuthology-2016-07-26_08:18:30-smoke-master-distro-basic...
- 02:40 AM Cleanup #16825: vpm061 has odd downburst error
- The problem was a corrupted storage pool for vpm061: libvirt was reporting that it contained many images, but it only...
07/28/2016
- 11:41 PM Bug #16853 (Resolved): ssh-key change on teuthology host
- My ssh key has changed on the teuthology host; I end up using ubuntu for locking/reserving nodes.
Please add the bel...
- 05:24 PM Bug #16826: SSH Error: data could not be sent to the remote host. Make sure this host can be reac...
- Just to be clear, we're pretty sure ssh is not connecting/reconnecting at the time of the failures, so the above is j...
- 08:13 AM Support #16843 (Resolved): Request for sepia lab access
- Hi,
I'm requesting access for scheduling jobs in the sepia lab.
Username: rdias
Public key:...
07/27/2016
- 02:24 PM Bug #16826: SSH Error: data could not be sent to the remote host. Make sure this host can be reac...
- Dan Mick wrote:
> It looks like the default ansible.cfg ssh connection timeout is 10s, which seems pretty short, esp...
- 02:11 AM Bug #16826: SSH Error: data could not be sent to the remote host. Make sure this host can be reac...
- Looks like ansible-playbook takes a -T/--timeout option, which might be an easy way to play with this.
- 02:10 AM Bug #16826: SSH Error: data could not be sent to the remote host. Make sure this host can be reac...
- It looks like the default ansible.cfg ssh connection timeout is 10s, which seems pretty short, especially if the orig...
- 02:03 AM Bug #16826: SSH Error: data could not be sent to the remote host. Make sure this host can be reac...
- David Galloway wrote:
> * With the load so high, the ansible run prior to the actual teuthology test takes an unacce...
- 01:52 AM Bug #16826 (Resolved): SSH Error: data could not be sent to the remote host. Make sure this host ...
- I've noticed an abnormally high number of jobs failing due to ssh failures during ceph-cm-ansible runs. I haven't be...
- 02:28 AM Cleanup #16825: vpm061 has odd downburst error
- See more in http://pulpito.ceph.com/teuthology-2016-07-26_08:18:30-smoke-master-distro-basic-vps/
334828, 334829, ...
- 01:54 AM Cleanup #16825: vpm061 has odd downburst error
- I think I remember seeing this error during my mira drive party last week. It may be lab-wide and not just limited t...
- 12:31 AM Cleanup #16825 (Resolved): vpm061 has odd downburst error
- ...
07/26/2016
- 06:48 PM Support #16713: Requesting lab access
- Radoslaw,
You're all set. Please verify you can connect to the VPN and ssh rzarzynski@teuthology.front.sepia.ceph...
- 02:50 PM Bug #16816 (Resolved): teuthology-logs.public.ceph.com not reachable
- The virtual machine was shut down; it has been rebooted and the service is back. Thanks for noticing!
- 02:05 PM Bug #16816 (Resolved): teuthology-logs.public.ceph.com not reachable
- teuthology-openstack has a feature called ...
07/25/2016
- 08:16 PM Bug #14840 (In Progress): mira091 is not accessible
- This system's got at least 1 bad DIMM according to SEL. Will have lab team diagnose and replace.
I've marked down...
- 03:59 PM Bug #16810 (Resolved): re-image smithi044
- This node seems to be acting up on every run; please re-image.
07/22/2016
- 06:00 PM Support #16713 (In Progress): Requesting lab access
- 06:00 PM Tasks #15389 (Resolved): read error on mira055 (osd.74):sdf1
- Drive replaced and new drive added to cluster
- 05:50 PM Bug #14478 (Resolved): mira089 MCE, bad processor?
- Machine passed last 3 jobs. Total job stats don't appear abnormal
- 05:47 PM Bug #14546: mira033 kernel panic from MCE
- Tested DIMMs and didn't find a bad one. If MCEs persist, will retire machine.
- 05:45 PM Bug #16238 (Resolved): Input/output error on mira023
- Machine reimaged. All drives present as healthy.
- 05:43 PM Bug #16326 (Resolved): mira033, mira052 fried memory
- Tested all 8 DIMMs individually and found 1 bad. Replaced with spare and reimaged hosts.
- 05:42 PM Feature #16669 (Resolved): Sepia Status page
- http://status.sepia.ceph.com/
- 05:40 PM Bug #16720 (Resolved): mira038 losing ssh connectivity after reboot
- Reimaged and released
- 05:08 PM Bug #15147: mira095 RAID6 degraded
- Drive 5 failing
- 03:50 PM Cleanup #14528: Track down usage and purpose of mira{123..126} aka dubia{001..004}
- jenkins.front can be repurposed
7OCT2016 - Just shut it down.
07/21/2016
- 10:10 PM Cleanup #14528: Track down usage and purpose of mira{123..126} aka dubia{001..004}
- Replaced drive 4 in jenkins. Its RAID is degraded and underlying data may be lost. I can't get smart data from the ...
- 09:16 PM Feature #15563 (Resolved): reduce email noise from nightlies crontab entries
- 05:32 PM Bug #16765 (Resolved): can't nuke mira046
- Stuck at :...
07/19/2016
- 04:58 PM Bug #16728: Cannot find a valid baseurl for repo: base/7/x86_6
- Here's another failure: http://qa-proxy.ceph.com/teuthology/teuthology-2016-07-18_18:10:02-upgrade:infernalis-inferna...
- 12:10 AM Bug #16728: Cannot find a valid baseurl for repo: base/7/x86_6
- Cannot find a valid baseurl for repo: base/7/x86_6 is the failure marker. Updating title to reflect.
07/18/2016
- 10:54 PM Bug #16728: Cannot find a valid baseurl for repo: base/7/x86_6
- related: https://github.com/ceph/ceph-cm-ansible/pull/268
- 10:36 PM Bug #16728: Cannot find a valid baseurl for repo: base/7/x86_6
- Actual issue is a yum failure. We're working on increasing loglevel for yum transactions so we can debug what's actu...
- 10:30 PM Bug #16728 (Can't reproduce): Cannot find a valid baseurl for repo: base/7/x86_6
- Six of the jobs in this run failed in this way, for example:
http://pulpito.ceph.com/teuthology-2016-07-13_02:10:02-...
- 09:58 PM Bug #16724 (Resolved): ceph-ansible suite: failed jobs due to git clone failed error
- The ceph-ansible repo wasn't mirrored on git.ceph.com. I've added it and it's now mirrored.
http://git.ceph.com/?...
- 09:46 PM Bug #16724 (Resolved): ceph-ansible suite: failed jobs due to git clone failed error
- the ceph-ansible suite has failed jobs due to "git clone failed error"
pasting below the excerpt from teuthology.l...
- 08:45 PM Bug #16719 (Resolved): smith006.ipmi.sepia.ceph.com is in bad state
- You had a typo in your ipmitool command.
I reimaged the host anyway since it was in a weird state.
- 05:21 PM Bug #16719 (Resolved): smith006.ipmi.sepia.ceph.com is in bad state
- ipmitool -H smith006.ipmi.sepia.ceph.com -I XXXXXX power cycle
Address lookup for smith006.ipmi.sepia.ceph.com faile...
- 05:34 PM Bug #16720 (Resolved): mira038 losing ssh connectivity after reboot
- marked it down
nuke/stale has this error:...
- 04:13 PM Support #16713 (Resolved): Requesting lab access
- 1. Access type: scheduling jobs.
2. Username:...
- 03:38 PM Bug #16711 (Duplicate): sudo yum install -y kernel fails
- http://pulpito.ceph.com/teuthology-2016-07-17_04:20:03-upgrade:jewel-x-master-distro-basic-vps/319488/...
- 02:55 PM Bug #15297 (Closed): kernel yum install task failed due to apparent dns failure on gitbuilder.cep...
- 07:00 AM Bug #15297: kernel yum install task failed due to apparent dns failure on gitbuilder.ceph.com
07/12/2016
- 10:01 PM Feature #16669: Sepia Status page
- One option I intend to install and test out: https://cachethq.io/
- 09:28 PM Feature #16669 (Resolved): Sepia Status page
- gmeno mentioned it'd be nice to have a (non-nagios) Lab Status / uptime page to track lab, queue, suite, etc. statuse...
- 03:11 PM Bug #11571 (Closed): "Bad hostname 'magnaXXX.front.sepia.ceph.com'" error in rgw-firefly-distro-b...
- Whatever caused this is almost certainly fixed now. Closing since it's been over a year.
07/07/2016
- 07:39 PM Bug #16615 (Resolved): Failed to download remote objects and refs: fatal: shallow file was change...
- 06:45 PM Bug #16615 (Fix Under Review): Failed to download remote objects and refs: fatal: shallow file wa...
- https://github.com/ceph/ceph-cm-ansible/pull/264
- 06:42 PM Bug #16615 (In Progress): Failed to download remote objects and refs: fatal: shallow file was cha...
- 06:40 PM Bug #16615: Failed to download remote objects and refs: fatal: shallow file was changed during fetch
- The only thing I can think of doing about this is not to use a shallow clone. Looks like it might not make much of a ...
- 06:34 PM Bug #16615 (Resolved): Failed to download remote objects and refs: fatal: shallow file was change...
- Jobs are sporadically failing due to $subject.
See http://pulpito.ceph.com/cbodley-2016-07-07_11:21:33-rgw-wip-rgw...