Bug #53060

closed

Unable to load libceph_snappy.so due to undefined symbol _ZTIN6snappy6SourceE in snappy 1.1.9

Added by Tim Serong over 2 years ago. Updated 10 months ago.

Status: Closed
Priority: Normal
Assignee:
Category: OSD
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

If you try to run Ceph with snappy 1.1.9 installed, ceph status will show HEALTH_WARN, and tell you that your OSDs "have broken BlueStore compression". ceph health detail will tell you that each of your OSDs is "unable to load:snappy". The OSD logs will show something like this:

Oct 27 08:55:33 node1 ceph-osd[561817]: load failed dlopen(): "/usr/lib64/ceph/compressor/libceph_snappy.so: undefined symbol: _ZTIN6snappy6SourceE" or "/usr/lib64/ceph/libceph_snappy.so: cannot open shared object file: No such file or directory" 
Oct 27 08:55:33 node1 ceph-osd[561817]: create cannot load compressor of type snappy

This is because RTTI was disabled in snappy 1.1.9, so the typeinfo for the snappy::Source class - which Ceph's SnappyCompressor creates a subclass of - isn't included in libsnappy.so. Ceph still builds just fine, because the compressors are built as shared libraries. The problem only manifests when our snappy plugin is dlopen()ed at runtime, and then the linker kicks in and can't find that missing symbol.
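For anyone puzzled by the mangled name in the log: under the Itanium C++ ABI, the `_ZTI` prefix marks a typeinfo object, and `N6snappy6SourceE` is the nested name snappy::Source. The following is a minimal Python sketch, not part of Ceph or snappy, that pulls the symbol out of the log line quoted above and decodes it. The helper names `undefined_symbol` and `demangle_typeinfo` are hypothetical, and the decoder only handles this simple length-prefixed form.

```python
import re

# The OSD log message quoted above (timestamp and process prefix omitted).
LOG_LINE = (
    'load failed dlopen(): "/usr/lib64/ceph/compressor/libceph_snappy.so: '
    'undefined symbol: _ZTIN6snappy6SourceE" or '
    '"/usr/lib64/ceph/libceph_snappy.so: cannot open shared object file: '
    'No such file or directory"'
)


def undefined_symbol(log_line):
    """Return the symbol named after 'undefined symbol:', or None."""
    match = re.search(r"undefined symbol: ([A-Za-z0-9_$.]+)", log_line)
    return match.group(1) if match else None


def demangle_typeinfo(symbol):
    """Decode a typeinfo symbol of the form _ZTI N <len><name>... E.

    Hypothetical helper: templates, substitutions, and every other
    mangling construct are rejected rather than decoded.
    """
    prefix = "_ZTIN"
    if not symbol.startswith(prefix) or not symbol.endswith("E"):
        raise ValueError(f"unsupported symbol: {symbol}")
    body = symbol[len(prefix):-1]
    parts = []
    i = 0
    while i < len(body):
        # Read the decimal length that precedes each name component.
        j = i
        while j < len(body) and body[j].isdigit():
            j += 1
        if j == i:
            raise ValueError(f"unsupported symbol: {symbol}")
        length = int(body[i:j])
        name = body[j:j + length]
        if len(name) != length:
            raise ValueError(f"truncated symbol: {symbol}")
        parts.append(name)
        i = j + length
    if not parts:
        raise ValueError(f"unsupported symbol: {symbol}")
    return "typeinfo for " + "::".join(parts)


symbol = undefined_symbol(LOG_LINE)
print(symbol)                     # _ZTIN6snappy6SourceE
print(demangle_typeinfo(symbol))  # typeinfo for snappy::Source
```

On a real system, `c++filt` gives the same answer for arbitrary symbols; the sketch only shows how the name in this particular log line is put together.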

This would ideally be fixed by getting RTTI re-enabled in snappy, so I've gone ahead and opened https://github.com/google/snappy/pull/144

Actions #1

Updated by Tim Serong over 2 years ago

Upstream snappy has rejected my PR to re-enable RTTI (see https://github.com/google/snappy/pull/144#issuecomment-968371042)

This means we're stuck with either:

1) Somehow figuring out how to build at least the compression plugin bits of ceph without RTTI? Maybe? I've no idea if this is possible/viable/sensible.

2) Getting patches into the various linux distros to re-enable RTTI for snappy versions >= 1.1.9.

It turns out we had the exact same issue with leveldb - upstream disabled RTTI in leveldb 1.23 (see https://github.com/google/leveldb/releases/tag/1.23), and openSUSE, Fedora, Arch (and I assume others, I only checked those three quickly) have since added patches to the downstream leveldb packages to re-enable RTTI. On the assumption this was going to be the way to go, I've got the appropriate change into openSUSE for snappy: https://build.opensuse.org/request/show/929633. I don't know what the procedure is for other distros offhand, but hopefully this information is helpful for whoever's able to take on that work.

Actions #2

Updated by chris denice about 1 year ago

In the past, the snappy folks changed a type in their API without even bumping the major version. Now they refuse to merge a tiny change to the build configuration that would simply allow us, and others, to dlopen() the library. It looks like they are open source for advertising reasons only but still code like they work for MS.

Best move: stop using snappy...

Actions #3

Updated by Adam Emerson 11 months ago

  • Status changed from New to Need More Info
  • Assignee set to Adam Emerson

Hello,

Are you using distro packages? It looks like Fedora, Debian, Arch, and openSUSE are re-enabling RTTI in snappy downstream, so this issue might be resolved by that.

Actions #4

Updated by Ranjan Ghosh 11 months ago

Yeah, but Ubuntu (one of the most popular distros) just ignores my ticket:

https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1998636

We now have Ceph that's completely broken with default packages for over half a year on Ubuntu :(
It's crazy. They advertise Ceph on Ubuntu (https://ubuntu.com/ceph) and then they leave it broken for months and nobody cares.
I know the Ceph folks can't do much about it but overall this is really a lowlight.
I wonder how I can get out of this situation because it gets particularly ugly to pin the APT package for so long.
Is there a way to easily switch off BlueStore compression? Is there any disadvantage to doing so?

Actions #5

Updated by Adam Emerson 11 months ago

I hadn't even thought to look up Ubuntu, I just thought they'd pull in Debian's change.

(We do our testing against Ubuntu LTS and it looks like Jammy's version of the package is before the RTTI change.)

I wonder if you could file a bug against Snappy directly.

It looks like there are bluestore options for setting whether to use compression and which algorithm to use, but they can all be overridden by per-pool settings.

I'm on the RGW team, so I don't feel qualified to say too much, but in the upstream call when we discussed this, someone on the bluestore team mentioned they had a way to migrate between compression algorithms. So you might want to ask in irc/slack or on the mailing list for more on how.

It looks like Lunar Lobster has the fixed version of snappy; at least its changelog mentions re-enabling RTTI.
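As a rough illustration of the bluestore and per-pool compression settings mentioned above, the Python sketch below only builds and prints the `ceph` commands; it does not run them. It assumes the OSD-level options `bluestore_compression_algorithm` and `bluestore_compression_mode` and the per-pool keys `compression_algorithm` and `compression_mode`. The pool name `mypool`, the chosen values, and the helper name are hypothetical. Whether changing these settings clears the HEALTH_WARN is not established anywhere in this ticket.

```python
import shlex


def compression_commands(algorithm, mode, pools=()):
    """Build (but do not run) ceph commands that set compression options.

    Hypothetical helper: the OSD-wide settings come first, then the
    per-pool settings, which take precedence over the OSD-wide ones.
    """
    commands = [
        ["ceph", "config", "set", "osd",
         "bluestore_compression_algorithm", algorithm],
        ["ceph", "config", "set", "osd",
         "bluestore_compression_mode", mode],
    ]
    for pool in pools:
        commands.append(
            ["ceph", "osd", "pool", "set", pool,
             "compression_algorithm", algorithm])
        commands.append(
            ["ceph", "osd", "pool", "set", pool,
             "compression_mode", mode])
    return commands


# Print the commands for review instead of executing them.
for command in compression_commands("zstd", "passive", pools=["mypool"]):
    print(shlex.join(command))
```

Printing rather than executing keeps the sketch safe to run anywhere; anyone applying it to a real cluster should check the option names against the documentation for their Ceph release first.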

Actions #6

Updated by Adam Emerson 10 months ago

Did you have a chance to see if Lunar Lobster works? From what I can tell, Jammy and Lunar should both have a working libsnappy, and only Kinetic has the version without RTTI.

Actions #7

Updated by Ranjan Ghosh 10 months ago

@Adam DC949 Emerson: Thank you so much! Really helpful. I can confirm that Lunar Lobster works. In case anyone is wondering: you don't need to make it work before upgrading to Lunar. If you upgraded to Kinetic and you see that error message, just do another "do-release-upgrade" to upgrade to Lunar and things will magically work again.

Nevertheless, Canonical didn't give a good impression on this at all. No reaction whatsoever; a whole supported(!) distro version for 6(!) months with completely broken Ceph support; leaving users after an upgrade without any explanation why suddenly their Ceph installation doesn't work; finally fixing it silently without any announcement and without any reaction to my bug ticket there => Not very professional, not funny :-(

Actions #8

Updated by Adam Emerson 10 months ago

  • Status changed from Need More Info to Closed

All right. Thanks to everyone who reported for the information.

As a resolution for anyone finding this:

To the best of my knowledge, all major distributions have re-enabled RTTI, so Ceph built against system packages of snappy should work.

The one exception is Ubuntu 22.10 (Kinetic Kudu), which has snappy without RTTI.

For that, the easiest fix would be either upgrading to Lunar Lobster or, as shown in the Ubuntu bug report (https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1998636), installing an older version of snappy.
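To check whether a given machine is affected, one option is to ask the installed libsnappy whether it still exports the typeinfo symbol named in the error. The sketch below uses Python's ctypes for this. The library name `libsnappy.so.1` is an assumption about the soname on your system, and the helper name `exports_symbol` is hypothetical.

```python
import ctypes

# The symbol the Ceph snappy plugin failed to resolve in the OSD log.
TYPEINFO_SYMBOL = "_ZTIN6snappy6SourceE"


def exports_symbol(library, symbol=TYPEINFO_SYMBOL):
    """Report whether `library` exports `symbol`.

    Returns True or False if the library loads, and None if it cannot
    be loaded at all (for example because it is not installed).
    """
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        return None
    try:
        # in_dll looks up a data symbol and raises ValueError if missing.
        ctypes.c_void_p.in_dll(lib, symbol)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    # "libsnappy.so.1" is an assumed soname; adjust for your distro.
    result = exports_symbol("libsnappy.so.1")
    if result is None:
        print("could not load libsnappy")
    elif result:
        print("typeinfo for snappy::Source is exported (RTTI enabled)")
    else:
        print("typeinfo for snappy::Source is missing (RTTI disabled)")
```

A missing symbol here would be consistent with the dlopen() failure described in this ticket, while a present symbol suggests the snappy package was built with RTTI re-enabled.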
