Bug #41660

ceph-volume lvm activate --all fails if stderr is not a terminal

Added by Paul Emmerich over 4 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Immediate
Assignee:
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

root@ct-2a ~ $ ceph-volume lvm activate --all > stdout 2> stderr
root@ct-2a ~ $ cat stdout
root@ct-2a ~ $ cat stderr
--> OSD ID 14 FSID 2f8651bb-d404-44bf-b4d2-67c1aa3d5be1 process is active. Skipping activation
--> Activating OSD ID 37 FSID 43f3e81c-de7c-4730-b0c5-331154368e35
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-37
Running command: /bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-c8fbfe66-dcbb-4fb3-ade5-c583e72e6625/osd-block-43f3e81c-de7c-4730-b0c5-331154368e35 --path /var/lib/ceph/osd/ceph-37 --no-mon-config
Running command: /bin/ln -snf /dev/ceph-c8fbfe66-dcbb-4fb3-ade5-c583e72e6625/osd-block-43f3e81c-de7c-4730-b0c5-331154368e35 /var/lib/ceph/osd/ceph-37/block
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-37/block
Running command: /bin/chown -R ceph:ceph /dev/dm-3
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-37
Running command: /bin/systemctl enable ceph-volume@lvm-37-43f3e81c-de7c-4730-b0c5-331154368e35
Running command: /bin/systemctl enable --runtime ceph-osd@37
Running command: /bin/systemctl start ceph-osd@37
--> ceph-volume lvm activate successful for osd ID: 37
--> Activating OSD ID 40 FSID b88cd1a7-10b4-4b1a-845e-db271f6a8971
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-40
Running command: /bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-3d649f22-e127-408d-851a-944097eb6dc7/osd-block-b88cd1a7-10b4-4b1a-845e-db271f6a8971 --path /var/lib/ceph/osd/ceph-40 --no-mon-config
Running command: /bin/ln -snf /dev/ceph-3d649f22-e127-408d-851a-944097eb6dc7/osd-block-b88cd1a7-10b4-4b1a-845e-db271f6a8971 /var/lib/ceph/osd/ceph-40/block
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-40/block
Running command: /bin/chown -R ceph:ceph /dev/dm-4
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-40
Running command: /bin/systemctl enable ceph-volume@lvm-40-b88cd1a7-10b4-4b1a-845e-db271f6a8971
Running command: /bin/systemctl enable --runtime ceph-osd@40
-->  UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 87: ordinal not in range(128)

This always happens when activating the second OSD, so re-running the command enough times works around it.

Running ceph-volume lvm activate --all without the redirects works fine.

This has happened since 14.2.3.

History

#1 Updated by Alfredo Deza over 4 years ago

Following up from the email thread...

Is it possible that the locale is set to something other than
en_US.UTF-8? I was able to replicate some failures with LC_ALL=C.

Another thing I would try is to enable debug (or show/paste the
traceback from the logs) so that tracebacks are immediately available in the output:

CEPH_VOLUME_DEBUG=1 ceph-volume lvm activate --all
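To check the locale question above from Python's side, a minimal sketch like the following prints the encodings the interpreter picked up. The helper name report_encodings is hypothetical and not part of ceph-volume; it only uses the standard locale and sys modules, and the values it reports depend on the Python version and environment it runs in.

```python
import locale
import sys


def report_encodings():
    """Collect the encoding-related settings the interpreter is using."""
    return {
        # Encoding derived from the locale (LC_ALL, LC_CTYPE, LANG).
        "preferred": locale.getpreferredencoding(False),
        # Encoding attached to stderr; may be None on some streams.
        "stderr": getattr(sys.stderr, "encoding", None),
        # False when stderr is redirected to a file or a pipe.
        "stderr_is_tty": bool(getattr(sys.stderr, "isatty", lambda: False)()),
    }


if __name__ == "__main__":
    for key, value in sorted(report_encodings().items()):
        print("{}: {}".format(key, value))
```

Running it once normally and once with LC_ALL=C and 2> stderr would show whether the two environments differ.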

#2 Updated by Paul Emmerich over 4 years ago

Forgot to add the stack trace:

[2019-09-04 19:46:34,601][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/main.py", line 148, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 205, in dispatch
    instance.main()
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/main.py", line 40, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 205, in dispatch
    instance.main()
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 339, in main
    self.activate_all(args)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 232, in activate_all
    self.activate(args, osd_id=osd_id, osd_fsid=osd_fsid)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 265, in activate
    activate_bluestore(lvs, no_systemd=args.no_systemd)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 192, in activate_bluestore
    systemctl.enable_volume(osd_id, osd_fsid, 'lvm')
  File "/usr/lib/python2.7/dist-packages/ceph_volume/systemd/systemctl.py", line 82, in enable_volume
    return enable(volume_unit % (device_type, id_, fsid))
  File "/usr/lib/python2.7/dist-packages/ceph_volume/systemd/systemctl.py", line 22, in enable
    process.run(['systemctl', 'enable', unit])
  File "/usr/lib/python2.7/dist-packages/ceph_volume/process.py", line 137, in run
    log_descriptors(reads, process, terminal_logging)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/process.py", line 59, in log_descriptors
    log_output(descriptor_name, message, terminal_logging, True)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/process.py", line 32, in log_output
    getattr(terminal, descriptor)(message)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 130, in stderr
    return _Write(prefix=yellow(' stderr: ')).raw(msg)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 117, in raw
    self.write(string)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 120, in write
    self._writer.write(self.prefix + line + self.suffix)
  File "/usr/lib/python2.7/codecs.py", line 369, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 133: ordinal not in range(128)

#3 Updated by Paul Emmerich over 4 years ago

The locale was en_US.UTF-8. I'm pretty sure the script where we initially triggered this runs with C.

#4 Updated by Paul Emmerich over 4 years ago

The message that causes it to fail is:

stderr: Created symlink /run/systemd/system/ceph-osd.target.wants/ceph-osd@41.service → /lib/systemd/system/ceph-osd.service.@

That arrow is encoded in UTF-8 as \xe2\x86\x92

The relevant commit that triggers this is probably https://github.com/ceph/ceph/commit/b8d6dcbe9f803c96c0af68da54f1262e9b6a9e77
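The byte sequence quoted above can be checked with a short sketch. It is written for Python 3, where the decode step has to be spelled out; in the Python 2.7 tracebacks in this report the ascii decode happens implicitly. The helper decode_as is a hypothetical name used only for illustration.

```python
# U+2192 RIGHTWARDS ARROW, the character systemctl prints in its symlink message.
ARROW = "\u2192"

# Its UTF-8 encoding is the three bytes quoted in the comment above.
raw = ARROW.encode("utf-8")
assert raw == b"\xe2\x86\x92"


def decode_as(raw_bytes, encoding):
    """Try to decode raw_bytes and report the failure reason instead of raising."""
    try:
        return raw_bytes.decode(encoding)
    except UnicodeDecodeError as exc:
        return "failed: {}".format(exc.reason)


print(decode_as(raw, "utf-8"))
print(decode_as(raw, "ascii"))
```

The first byte, 0xe2, is outside the 7-bit ASCII range, which is why the ascii codec rejects it.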

#5 Updated by Paul Emmerich over 4 years ago

An easier way to reproduce this, without needing to activate any OSDs: add this line somewhere in the lvm list subcommand

process.run(['echo', '→'])

root@ct-4b ~ $ ceph-volume lvm list
Running command: /bin/echo →
 stdout: →
root@ct-4b ~ $ ceph-volume lvm list 2> stderr
root@ct-4b ~ $ cat stderr
Running command: /bin/echo →
-->  UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 9: ordinal not in range(128)
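The difference between the two runs above is only whether stderr is attached to a terminal. A minimal sketch of that condition, assuming Python 3 and using a child interpreter instead of ceph-volume itself (the names CHILD and child_sees_tty_on_stderr are hypothetical):

```python
import subprocess
import sys

# The child reports on stdout whether its own stderr is a terminal.
CHILD = "import sys; print(sys.stderr.isatty())"


def child_sees_tty_on_stderr():
    """Run a child interpreter with stderr sent to a pipe, like a 2> redirect."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # stands in for the shell redirect
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip() == "True"


print(child_sees_tty_on_stderr())
```

A pipe is never a terminal, so any code path that branches on isatty() takes the non-terminal branch in the redirected case.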

#6 Updated by Alfredo Deza over 4 years ago

  • Status changed from New to 12
  • Assignee set to Alfredo Deza
  • Priority changed from Normal to Immediate

Thanks Paul, I will follow up with a fix (or hopefully a workaround) asap

#7 Updated by Alfredo Deza over 4 years ago

  • Status changed from 12 to In Progress

Paul, can you try this patch?

diff --git a/src/ceph-volume/ceph_volume/terminal.py b/src/ceph-volume/ceph_volume/terminal.py
index aaff47962d..cb727cf3f5 100644
--- a/src/ceph-volume/ceph_volume/terminal.py
+++ b/src/ceph-volume/ceph_volume/terminal.py
@@ -82,9 +82,10 @@ class _Write(object):
     def __init__(self, _writer=None, prefix='', suffix='', flush=False):
         # we can't set sys.stderr as the default for _writer. Otherwise
         # pytest's capturing gets confused
-        if not _writer:
-            _writer = sys.stderr
-        self._writer = _Write._unicode_output_stream(_writer)
+        if not sys.__stderr__.isatty():
+            self._writer = sys.stderr
+        else:
+            self._writer = _Write._unicode_output_stream(sys.stderr)
         sys.stderr = self._writer
         self.suffix = suffix
         self.prefix = prefix
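For comparison, here is a minimal Python 3 sketch of a writer that tolerates both bytes and text by normalizing everything to UTF-8 before it reaches the underlying stream. This is not the patch above and not the change that was eventually merged; SafeWriter and its parameters are hypothetical names used only to illustrate the idea.

```python
import io


class SafeWriter(object):
    """Hypothetical wrapper: accepts bytes or str and always emits UTF-8."""

    def __init__(self, binary_stream, prefix=""):
        self._stream = binary_stream
        self._prefix = prefix

    def write(self, message):
        # Subprocess output arrives as bytes; decode it explicitly so that
        # no implicit ascii conversion can happen later.
        if isinstance(message, bytes):
            message = message.decode("utf-8", "replace")
        line = self._prefix + message
        # Encode once, on the way out, replacing anything unencodable.
        self._stream.write(line.encode("utf-8", "replace"))
        self._stream.flush()


buf = io.BytesIO()
SafeWriter(buf, prefix=" stderr: ").write(b"Created symlink a \xe2\x86\x92 b\n")
print(buf.getvalue())
```

Because the encoding is fixed at the point of writing, the result does not depend on whether the destination is a terminal, a file, or a pipe.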

#8 Updated by Paul Emmerich over 4 years ago

No, that leads to a new error message:

Running command: /bin/systemctl enable --runtime ceph-osd@48
Traceback (most recent call last):
  File "/usr/sbin/ceph-volume", line 11, in <module>
    load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
  File "/usr/lib/python2.7/dist-packages/ceph_volume/main.py", line 39, in __init__
    self.main(self.argv)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/main.py", line 149, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 210, in dispatch
    instance.main()
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/main.py", line 40, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 210, in dispatch
    instance.main()
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 339, in main
    self.activate_all(args)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 232, in activate_all
    self.activate(args, osd_id=osd_id, osd_fsid=osd_fsid)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 265, in activate
    activate_bluestore(lvs, no_systemd=args.no_systemd)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 195, in activate_bluestore
    systemctl.enable_osd(osd_id)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/systemd/systemctl.py", line 70, in enable_osd
    return enable(osd_unit % id_, runtime=True)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/systemd/systemctl.py", line 20, in enable
    process.run(['systemctl', 'enable', '--runtime', unit])
  File "/usr/lib/python2.7/dist-packages/ceph_volume/process.py", line 139, in run
    log_descriptors(reads, process, terminal_logging)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/process.py", line 61, in log_descriptors
    log_output(descriptor_name, message, terminal_logging, True)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/process.py", line 32, in log_output
    getattr(terminal, descriptor)(message)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 135, in stderr
    return _Write(prefix=yellow(' stderr: ')).raw(msg)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 121, in raw
    self.write(string)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 125, in write
    self._writer.write(msg.decode("utf-8"))
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2192' in position 87: ordinal not in range(128)

(Line numbers in the stack trace might not match; I have some random prints in there.)

#9 Updated by Alfredo Deza over 4 years ago

I don't see this line anywhere:

    self._writer.write(msg.decode("utf-8"))

Is this one of your local edits? Is it possible for you to remove all your edits and apply the patch cleanly?

#10 Updated by Paul Emmerich over 4 years ago

Sorry, yes, there were still some random things I had tried in there.

Yes, the patch fixes it.

#11 Updated by Blaine Gardner over 4 years ago

Could this be related to an issue we're seeing in Rook on `ceph-volume lvm prepare`?

https://github.com/rook/rook/issues/3795

#12 Updated by Paul Emmerich over 4 years ago

This probably breaks every deployment tool, yes.

#13 Updated by Alfredo Deza over 4 years ago

  • Status changed from In Progress to Fix Under Review

PR to master https://github.com/ceph/ceph/pull/30274

Paul: the PR is a bit different from what I initially suggested as the fix. It would mean a lot if you could try it out and check that it all works for you.

#14 Updated by Nathan Cutler over 4 years ago

  • Pull request ID set to 30274

#15 Updated by Alfredo Deza over 4 years ago

  • Status changed from Fix Under Review to Resolved

#16 Updated by Alfredo Deza over 4 years ago

14.2.4 has been released with this fix
