Bug #11528

closed

mira015 has bad disks 1,3,6

Added by Yuri Weinstein almost 9 years ago. Updated about 8 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
ceph-deploy
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/teuthology-2015-05-04_08:01:12-ceph-deploy-giant-distro-basic-vps/
Jobs: ['874956', '874957', '874958', '874960', '874961', '874963', '874965', '874966', '874967', '874968', '874969', '874970', '874971']
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-05-04_08:01:12-ceph-deploy-giant-distro-basic-vps/874956/

2015-05-04T09:05:53.771 INFO:teuthology.provision:Provisioning a ubuntu 12.04 vps
2015-05-04T09:05:54.064 INFO:teuthology.provision:Downburst failed on ubuntu@vpm091.front.sepia.ceph.com: libvir: RPC error : Cannot recv data: Read from socket failed: Connection reset by peer
: Connection reset by peer
Traceback (most recent call last):
  File "/home/ubuntu/src/downburst/virtualenv/bin/downburst", line 9, in <module>
    load_entry_point('downburst==0.0.1', 'console_scripts', 'downburst')()
  File "/home/ubuntu/src/downburst/downburst/cli.py", line 59, in main
    return args.func(args)
  File "/home/ubuntu/src/downburst/downburst/create.py", line 22, in create
    conn = libvirt.open(args.connect)
  File "/usr/lib/python2.7/dist-packages/libvirt.py", line 236, in open
    if ret is None:raise libvirtError('virConnectOpen() failed')
libvirt.libvirtError: Cannot recv data: Read from socket failed: Connection reset by peer
: Connection reset by peer
2015-05-04T09:05:54.064 ERROR:teuthology.lock:Unable to create virtual machine: ubuntu@vpm091.front.sepia.ceph.com
2015-05-04T09:05:54.407 ERROR:teuthology.provision:Error destroying vpm091.front.sepia.ceph.com: libvir: RPC error : Cannot recv data: Read from socket failed: Connection reset by peer
: Connection reset by peer
Traceback (most recent call last):
  File "/home/ubuntu/src/downburst/virtualenv/bin/downburst", line 9, in <module>
    load_entry_point('downburst==0.0.1', 'console_scripts', 'downburst')()
  File "/home/ubuntu/src/downburst/downburst/cli.py", line 59, in main
    return args.func(args)
  File "/home/ubuntu/src/downburst/downburst/destroy.py", line 42, in destroy
    conn = libvirt.open(args.connect)
  File "/usr/lib/python2.7/dist-packages/libvirt.py", line 236, in open
    if ret is None:raise libvirtError('virConnectOpen() failed')
libvirt.libvirtError: Cannot recv data: Read from socket failed: Connection reset by peer
: Connection reset by peer

2015-05-04T09:05:54.408 ERROR:teuthology.lock:downburst destroy failed for vpm091.front.sepia.ceph.com
Actions #1

Updated by Yuri Weinstein almost 9 years ago

FYI: I rescheduled a run and it worked; however, several other runs failed due to similar errors.

Actions #2

Updated by Yuri Weinstein almost 9 years ago

  • Priority changed from Urgent to Normal
Actions #3

Updated by Yuri Weinstein almost 9 years ago

For example, in this job:
http://qa-proxy.ceph.com/teuthology/teuthology-2015-05-04_09:43:38-ceph-deploy-giant-distro-basic-vps/874975/teuthology.log
similar errors still appear, but eventually a different VM was locked, FYI.

Actions #4

Updated by Zack Cerza almost 9 years ago

  • Project changed from devops to sepia
  • Subject changed from "Downburst failed on ...: libvir: RPC error" in ceph-deploy-giant-distro-basic-vps run to mira015 is broken

mira015's port 22 is closed. I'm power cycling it and in the meantime I marked its vms down.
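
For anyone who wants to repeat the "is port 22 closed?" check from a script, here is a minimal sketch using only the Python standard library. The function name `port_is_open` and the 3-second timeout are assumptions made for illustration; this is not how teuthology or downburst probes hosts.

```python
import socket


def port_is_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical usage (the hostname is whatever resolves in your environment):
#   port_is_open("mira015")
```

A False result only says the TCP connect failed; it does not distinguish a host that is down from one that is filtering the port.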

Actions #5

Updated by Zack Cerza almost 9 years ago

ubuntu@mira015:/var/log$ /usr/libexec/smart.pl
Drive 1 has 875 reallocated sectors;     Drive 1 has 4 uncorrect sectors;     Drive 1 has 16 pending sectors;     Drive 1 has 16 pending sectors;     Drive 3 has 247 reallocated sectors;     Drive 6 has 207 uncorrect sectors
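
The affected drive numbers can be pulled out of that output mechanically. A minimal sketch, assuming the "Drive N has M <kind> sectors" format shown above is stable; the names `SMART_OUTPUT`, `PATTERN`, and `summarize` are made up for this example and are not part of smart.pl.

```python
import re
from collections import defaultdict

# Output copied verbatim from the smart.pl run above.
SMART_OUTPUT = (
    "Drive 1 has 875 reallocated sectors;     "
    "Drive 1 has 4 uncorrect sectors;     "
    "Drive 1 has 16 pending sectors;     "
    "Drive 1 has 16 pending sectors;     "
    "Drive 3 has 247 reallocated sectors;     "
    "Drive 6 has 207 uncorrect sectors"
)

# Assumed message format: "Drive <n> has <count> <kind> sectors"
PATTERN = re.compile(r"Drive (\d+) has (\d+) (\w+) sectors")


def summarize(output):
    """Map drive number -> {counter kind: sector count}."""
    drives = defaultdict(dict)
    for drive, count, kind in PATTERN.findall(output):
        # Repeated entries (such as the duplicated pending line) overwrite.
        drives[int(drive)][kind] = int(count)
    return dict(drives)


summary = summarize(SMART_OUTPUT)
bad_drives = sorted(summary)
print(bad_drives)
```

Every drive that smart.pl mentions is treated as suspect here; no threshold is applied to the counts.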
Actions #6

Updated by Zack Cerza almost 9 years ago

  • Subject changed from mira015 is broken to mira015 has bad disks

Marked mira015 down with a useful description

Actions #7

Updated by Dan Mick almost 9 years ago

  • Assignee set to Dan Mick
  • Regression set to No
Actions #8

Updated by Dan Mick almost 9 years ago

  • Subject changed from mira015 has bad disks to mira015 has bad disks 1,3,6
Actions #9

Updated by Dan Mick almost 9 years ago

  • Status changed from New to In Progress
Actions #10

Updated by Dan Mick over 8 years ago

I don't have a separate record of when, but mira015's disks all look happy now, and it's just waiting for vmhost setup.

Actions #11

Updated by Dan Mick over 8 years ago

vpm089..096.

Actions #12

Updated by David Galloway about 8 years ago

  • Status changed from In Progress to Resolved

The drives were replaced, the system was reimaged, and the VPSes are functional.
