Bug #5405

closed

ceph-deploy: transient pushy exception on install

Added by Sage Weil almost 11 years ago. Updated over 10 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: ceph-deploy
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

flab:ceph-deploy 09:22 AM $ ./ceph-deploy install mira09{4,5,6}
OK
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/sage/src/ceph-deploy/virtualenv/local/lib/python2.7/site-packages/pushy-0.5.1-py2.7.egg/pushy/protocol/baseconnection.py", line 268, in serve_forever
    self.__handle(m)
  File "/home/sage/src/ceph-deploy/virtualenv/local/lib/python2.7/site-packages/pushy-0.5.1-py2.7.egg/pushy/protocol/baseconnection.py", line 653, in __handle
    self.__send_message(MessageType.exception, e)
  File "/home/sage/src/ceph-deploy/virtualenv/local/lib/python2.7/site-packages/pushy-0.5.1-py2.7.egg/pushy/protocol/baseconnection.py", line 560, in __send_message
    self.__ostream.send_message(m)
  File "/home/sage/src/ceph-deploy/virtualenv/local/lib/python2.7/site-packages/pushy-0.5.1-py2.7.egg/pushy/protocol/baseconnection.py", line 97, in send_message
    self.__file.write(bytes_)
ValueError: I/O operation on closed file

flab:ceph-deploy 09:22 AM $ ./ceph-deploy install mira09{4,5,6}
Actions #1

Updated by Tamilarasi muthamizhan almost 11 years ago

On CentOS 6.4:

[ubuntu@burnupi63 ceph-deploy]$ ./ceph-deploy install burnupi63
##################################################
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/ubuntu/cdep/ceph-deploy/virtualenv/lib/python2.6/site-packages/pushy-0.5.1-py2.6.egg/pushy/protocol/baseconnection.py", line 268, in serve_forever
    self.__handle(m)
  File "/home/ubuntu/cdep/ceph-deploy/virtualenv/lib/python2.6/site-packages/pushy-0.5.1-py2.6.egg/pushy/protocol/baseconnection.py", line 653, in __handle
    self.__send_message(MessageType.exception, e)
  File "/home/ubuntu/cdep/ceph-deploy/virtualenv/lib/python2.6/site-packages/pushy-0.5.1-py2.6.egg/pushy/protocol/baseconnection.py", line 560, in __send_message
    self.__ostream.send_message(m)
  File "/home/ubuntu/cdep/ceph-deploy/virtualenv/lib/python2.6/site-packages/pushy-0.5.1-py2.6.egg/pushy/protocol/baseconnection.py", line 97, in send_message
    self.__file.write(bytes_)
ValueError: I/O operation on closed file

Actions #2

Updated by Ian Colle almost 11 years ago

  • Priority changed from High to Urgent
Actions #3

Updated by Alfredo Deza almost 11 years ago

This is very hard to replicate, but after adding a bit of verbosity to pushy, I managed to narrow it down to the PushyClient closing the connection before the thread has finished.

The `Duplicate sources.list entry` warnings may or may not be relevant, but it is clear that the thread is still trying to write when the `close()` method is called.

$ ceph-deploy -v install node{1,2}
Installing stable version cuttlefish on cluster ceph hosts node1 node2
args.host: ['node1', 'node2']
hostname is: node1
Detecting platform for host node1 ...
compiling source
compiling source
Distro Ubuntu release 12.04 codename precise
compiling source
Installing on host node1 ...
W: Duplicate sources.list entry http://ceph.com/debian-cuttlefish/ precise/main amd64 Packages (/var/lib/apt/lists/ceph.com_debian-cuttlefish_dists_precise_main_binary-amd64_Packages)
W: Duplicate sources.list entry http://ceph.com/debian-cuttlefish/ precise/main i386 Packages (/var/lib/apt/lists/ceph.com_debian-cuttlefish_dists_precise_main_binary-i386_Packages)
W: You may want to run apt-get update to correct these problems
closing the connection
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 505, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/Users/alfredo/.virtualenvs/ceph-deploy/lib/python2.7/site-packages/pushy-0.5.1-py2.7.egg/pushy/protocol/baseconnection.py", line 268, in serve_forever
    self.__handle(m)
  File "/Users/alfredo/.virtualenvs/ceph-deploy/lib/python2.7/site-packages/pushy-0.5.1-py2.7.egg/pushy/protocol/baseconnection.py", line 653, in __handle
    self.__send_message(MessageType.exception, e)
  File "/Users/alfredo/.virtualenvs/ceph-deploy/lib/python2.7/site-packages/pushy-0.5.1-py2.7.egg/pushy/protocol/baseconnection.py", line 560, in __send_message
    self.__ostream.send_message(m)
  File "/Users/alfredo/.virtualenvs/ceph-deploy/lib/python2.7/site-packages/pushy-0.5.1-py2.7.egg/pushy/protocol/baseconnection.py", line 97, in send_message
    self.__file.write(bytes_)
ValueError: I/O operation on closed file
Actions #4

Updated by Alfredo Deza almost 11 years ago

I've opened a pull request for pushy (https://github.com/axw/pushy/pull/42) that should solve this problem.

What happens (as detailed in the pull request) is that the `close()` method closes the connection first and then joins the threads. Sometimes a thread still wants to write to the file descriptor at that point, and it cannot, because closing the connection closes the file as well.
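
The ordering problem can be sketched in isolation. This is a minimal illustration, not pushy's code: `demo_close_before_join`, the `io.BytesIO` stream and the `threading.Event` are all assumptions made for the demo, and the event only exists to force the otherwise transient race to happen every time.

```python
import io
import threading


def demo_close_before_join():
    """Close a stream while a worker thread still has a write pending."""
    stream = io.BytesIO()          # stand-in for the connection's file object
    closed = threading.Event()
    errors = []

    def worker():
        # Wait until the main thread has closed the stream, so the
        # late write happens deterministically in this demo.
        closed.wait()
        try:
            stream.write(b"late message")
        except ValueError as exc:
            errors.append(exc)

    thread = threading.Thread(target=worker)
    thread.start()
    stream.close()                 # close first ...
    closed.set()
    thread.join()                  # ... and only then join the thread
    return errors


errors = demo_close_before_join()
print(errors)
```

Joining the thread before closing the stream would avoid the late write in this sketch; whether that ordering is practical inside pushy is not something this thread establishes.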

The BaseConnection class already catches IOError, but it does not account for the related ValueError that is raised in this case.
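
To illustrate why a handler for IOError alone lets this error through, here is a small sketch. The `send` helper and its `handled` parameter are hypothetical and are not part of pushy; the only facts relied on are that writing to a closed file raises ValueError and that ValueError is not a subclass of IOError.

```python
import io


def send(stream, payload, handled=(IOError,)):
    """Write payload; return False if one of the handled errors occurs."""
    try:
        stream.write(payload)
        return True
    except handled:
        return False


closed_stream = io.BytesIO()
closed_stream.close()

# With only IOError handled, the closed-file ValueError escapes.
try:
    send(closed_stream, b"x")
    escaped = False
except ValueError:
    escaped = True

# Handling ValueError as well turns the late write into a quiet failure.
handled_ok = send(closed_stream, b"x", handled=(IOError, ValueError)) is False

print(escaped, handled_ok)
```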

In the meantime, we can sub-class and fix this in ceph-deploy until the pull request gets merged and a release gets cut.
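
A rough sketch of what such a subclass workaround could look like follows. `StandInConnection` is a hypothetical stand-in, not pushy's real BaseConnection: the tracebacks above show that the real class uses private double-underscore methods, so the actual override point in ceph-deploy would differ.

```python
import io


class StandInConnection(object):
    """Hypothetical stand-in for a connection class (not pushy's real API)."""

    def __init__(self, stream):
        self.stream = stream

    def send_message(self, payload):
        self.stream.write(payload)

    def close(self):
        self.stream.close()


class TolerantConnection(StandInConnection):
    """Subclass that ignores writes attempted after close()."""

    def send_message(self, payload):
        try:
            StandInConnection.send_message(self, payload)
        except ValueError:
            # The stream was already closed; drop the late message.
            # Caveat: this also hides any other ValueError raised by write().
            return False
        return True


conn = TolerantConnection(io.BytesIO())
sent_before = conn.send_message(b"hello")
conn.close()
sent_after = conn.send_message(b"late")
print(sent_before, sent_after)
```

The subclass keeps the change local to the caller, which fits a temporary fix that is meant to be dropped once a fixed release is available.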

Actions #5

Updated by Sage Weil almost 11 years ago

  • Assignee set to Anonymous
Actions #6

Updated by Sage Weil almost 11 years ago

  • Status changed from New to 15

Once this hits upstream, we need to build a package.

Actions #7

Updated by Alfredo Deza almost 11 years ago

  • Assignee changed from Anonymous to Alfredo Deza
Actions #8

Updated by Alfredo Deza almost 11 years ago

  • Assignee changed from Alfredo Deza to Anonymous
Actions #9

Updated by Sage Weil over 10 years ago

  • Status changed from 15 to Resolved