Bug #8040

closed

'Bad file descriptor' in "kernel.py", line 533, in wait_for_reboot

Added by Yuri Weinstein about 10 years ago. Updated over 9 years ago.

Status: Rejected
Priority: Normal
Assignee: Sandon Van Ness
Category: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

I see several errors like this one, likely due to insufficient wait time after reboot.
Probably low priority, but errors in the logs are unwelcome!

Seen in http://pulpito.front.sepia.ceph.com/teuthology-2014-04-07_22:35:16-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps/

One example; the logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-04-07_22:35:16-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps/177692/

2014-04-08T02:12:57.359 DEBUG:teuthology.orchestra.run:Running [10.214.138.165]: 'sudo lsb_release -is'
2014-04-08T02:12:57.460 DEBUG:teuthology.misc:System to be installed: Ubuntu
2014-04-08T02:12:57.460 DEBUG:teuthology.orchestra.run:Running [10.214.138.165]: 'sudo apt-get -y install linux-image-current-generic'
2014-04-08T02:13:07.140 DEBUG:teuthology.orchestra.run:Running [10.214.138.165]: 'dpkg -s linux-image-current-generic'
2014-04-08T02:13:08.198 DEBUG:teuthology.orchestra.run:Running [10.214.138.165]: 'dpkg -s linux-image-generic-lts-raring'
2014-04-08T02:13:08.223 INFO:teuthology.task.kernel:Checking client client.0 for new kernel version...
2014-04-08T02:13:08.223 ERROR:teuthology.task.kernel:Saw exception
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-firefly/teuthology/task/kernel.py", line 533, in wait_for_reboot
    assert not need_to_install_distro(ctx, client), \
  File "/home/teuthworker/teuthology-firefly/teuthology/task/kernel.py", line 556, in need_to_install_distro
    system_type = teuthology.get_system_type(role_remote)
  File "/home/teuthworker/teuthology-firefly/teuthology/misc.py", line 1146, in get_system_type
    stdout=StringIO(),
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/remote.py", line 106, in run
    r = self._runner(client=self.ssh, **kwargs)
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/run.py", line 267, in run
    r = execute(client, args)
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/run.py", line 79, in execute
    (host, port) = client.get_transport().getpeername()
  File "/usr/lib/python2.7/dist-packages/paramiko/transport.py", line 1371, in getpeername
    return gp()
  File "<string>", line 1, in getpeername
  File "/usr/lib/python2.7/dist-packages/gevent/socket.py", line 268, in _dummy
    raise error(EBADF, 'Bad file descriptor')
error: [Errno 9] Bad file descriptor
2014-04-08T02:13:09.223 INFO:teuthology.misc:Re-opening connections...
archive_path: /var/lib/teuthworker/archive/teuthology-2014-04-07_22:35:16-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps/177692
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml
  2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/rados_api_tests.yaml
  6-next-mon/monb.yaml 7-workload/radosbench.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml
  rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/ubuntu_12.04.yaml}
email: null
job_id: '177692'
kernel:
  kdb: true
  sha1: distro
last_in_suite: false
machine_type: vps
name: teuthology-2014-04-07_22:35:16-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps
nuke-on-error: true
os_type: ubuntu
os_version: '12.04'
overrides:
  admin_socket:
    branch: firefly
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: 010dff12c38882238591bb042f8e497a1f7ba020
  ceph-deploy:
    branch:
      dev: firefly
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 010dff12c38882238591bb042f8e497a1f7ba020
  s3tests:
    branch: master
  workunit:
    sha1: 010dff12c38882238591bb042f8e497a1f7ba020
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mon.b
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - osd.3
  - osd.4
  - osd.5
  - mon.c
- - client.0
tasks:
- chef: null
- clock.check: null
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0: null
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    thrash_primary_affinity: false
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- radosbench:
    clients:
    - client.0
    time: 1800
- install.upgrade:
    mon.c: null
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rbd/test_librbd_python.sh
- rgw:
    client.0:
      idle_timeout: 120
- swift:
    client.0:
      rgw_server: client.0
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
teuthology_branch: firefly
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.vps.17024
Actions #1

Updated by Loïc Dachary almost 10 years ago

Appears to be non-fatal in http://qa-proxy.ceph.com/teuthology/loic-2014-07-02_23:05:05-upgrade:firefly-x:stress-split-firefly-testing-basic-vps/338941/teuthology.log

2014-07-02T22:41:43.880 INFO:teuthology.orchestra.run.vpm056:Running: 'uname -r'
2014-07-02T22:41:43.887 INFO:teuthology.task.kernel:Checking client client.0 for new kernel version...
2014-07-02T22:41:43.888 INFO:teuthology.task.kernel:Checking kernel version of client.0, want 3.16.0-rc2-ceph-00019-g8102ce7...
2014-07-02T22:41:43.888 ERROR:teuthology.task.kernel:Saw exception
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-master/teuthology/task/kernel.py", line 571, in wait_for_reboot
    assert not need_to_install(ctx, client, need_install[client]), \
  File "/home/teuthworker/teuthology-master/teuthology/task/kernel.py", line 149, in need_to_install
    stdout=uname_fp,
  File "/home/teuthworker/teuthology-master/teuthology/orchestra/cluster.py", line 64, in run
    return [remote.run(**kwargs) for remote in remotes]
  File "/home/teuthworker/teuthology-master/teuthology/orchestra/remote.py", line 114, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/teuthology-master/teuthology/orchestra/run.py", line 356, in run
    (host, port) = client.get_transport().getpeername()
  File "/usr/lib/python2.7/dist-packages/paramiko/transport.py", line 1371, in getpeername
    return gp()
  File "<string>", line 1, in getpeername
  File "/usr/lib/python2.7/dist-packages/gevent/socket.py", line 268, in _dummy
    raise error(EBADF, 'Bad file descriptor')
error: [Errno 9] Bad file descriptor
2014-07-02T22:41:44.887 INFO:teuthology.misc:Re-opening connections...
2014-07-02T22:41:44.887 INFO:teuthology.misc:trying to connect to ubuntu@vpm055.front.sepia.ceph.com
2014-07-02T22:41:44.887 INFO:teuthology.orchestra.connection:{'username': u'ubuntu', 'hostname': u'vpm055.front.sepia.ceph.com', 'timeout': 60}
2014-07-02T22:41:44.995 INFO:teuthology.orchestra.run.vpm055:Running: 'true'
2014-07-02T22:41:45.001 INFO:teuthology.misc:trying to connect to ubuntu@vpm056.front.sepia.ceph.com
2014-07-02T22:41:45.002 INFO:teuthology.orchestra.connection:{'username': u'ubuntu', 'hostname': u'vpm056.front.sepia.ceph.com', 'timeout': 60}
2014-07-02T22:41:45.110 INFO:teuthology.orchestra.run.vpm056:Running: 'true'
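
The sequence in the log above (a command fails with EBADF on a stale connection, then "Re-opening connections..." is logged and a trivial 'true' command succeeds) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not teuthology's actual wait_for_reboot: the `check` and `reconnect` callables, the parameter names, and the default timeout are all hypothetical. The idea is to treat "Bad file descriptor" as an expected "host not back yet" signal and retry quietly instead of logging a traceback.

```python
import errno
import time


def wait_for_reboot(check, reconnect, timeout=300.0, interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or the timeout elapses.

    check and reconnect are hypothetical callables supplied by the caller:
    check() runs a command on the remote host and returns True once the
    host is in the wanted state; reconnect() re-opens the connection.
    Returns True on success, False if the timeout expired.
    """
    deadline = clock() + timeout
    while True:
        try:
            if check():
                return True
        except OSError as exc:
            # EBADF means the old socket was closed by the reboot.
            # Anything else is unexpected and is re-raised.
            if exc.errno != errno.EBADF:
                raise
            reconnect()
        if clock() >= deadline:
            return False
        sleep(interval)
```

With this shape, only errors other than EBADF would surface as exceptions during the reboot window; whether that is the right trade-off for teuthology is a judgment call for its maintainers.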

Actions #2

Updated by Sage Weil almost 10 years ago

  • Project changed from Ceph to teuthology
  • Source changed from other to Q/A
Actions #3

Updated by Ian Colle over 9 years ago

  • Assignee set to Sandon Van Ness
Actions #4

Updated by Sage Weil over 9 years ago

  • Status changed from New to Rejected

Non-fatal most of the time.

The original run does not seem to be present in /a. If this happens again, reopen.
