Bug #40831

compression segfaults with zstd 1.3.8 and incompatibilities with zstd 1.4.0

Added by Thore Bödecker 3 months ago. Updated 3 months ago.

Status: New
Priority: Normal
Assignee:
Category: -
Target version:
Start date:
Due date:
% Done: 0%
Source: Community (dev)
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature:

Description

Hey,

I'm currently working on packaging ceph 14.2.1 for Arch Linux. There are still some kinks to work out; once that is done, bumping to the freshly released 14.2.2 hopefully won't be a big deal.

During my countless attempts at building ceph with varying cmake options I have discovered some issues with the zstd compression code.
If I'm not mistaken, the ceph 14.2.1 tarball comes with a bundled zstd 1.3.2 in src/zstd.
The build process for Arch Linux is usually done in a chroot cloned freshly from a minimal base system before each build.
After the clone, all build dependencies (makedeps) listed in the PKGBUILD are installed in the chroot so that it is properly prepared with all requirements.

Some time ago, around the upstream release of zstd 1.3.8, I started seeing segfaults in the unittest_compression test.
These segfaults occur when zstd 1.3.8, as packaged in the official Arch Linux repository, is installed in the chroot.

I have now found some time to debug this further and manually built Arch Linux packages for various zstd versions, with the following results
(all findings are based on the 14.2.1 tarball):

zstd 1.3.3: works
zstd 1.3.4: works
zstd 1.3.5: works
zstd 1.3.6: works
zstd 1.3.7: works
zstd 1.3.8: segfaults
zstd 1.4.0: fails with unexpected return codes and results (EINVAL, probably due to a changed zstd API)
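
For anyone scripting around these results, here is a minimal sketch that encodes the table above. It relies on zstd packing its version as MAJOR*100*100 + MINOR*100 + RELEASE (the value of ZSTD_VERSION_NUMBER in zstd.h). The names zstd_version, Outcome and reported_outcome are hypothetical and not part of ceph or zstd, and the outcomes only restate what was observed with the 14.2.1 tarball.

```cpp
#include <cassert>

// zstd packs its version as MAJOR*100*100 + MINOR*100 + RELEASE,
// which is the value of ZSTD_VERSION_NUMBER in zstd.h.
inline unsigned zstd_version(unsigned major, unsigned minor, unsigned release) {
    assert(minor < 100 && release < 100);  // each field occupies two decimal digits
    return major * 100 * 100 + minor * 100 + release;
}

// Hypothetical summary of the outcomes reported in this ticket.
enum class Outcome { Works, Segfaults, Einval, Untested };

// Maps a packed zstd version to the result observed with the ceph 14.2.1
// tarball. Versions that were not tried in this report are Untested.
inline Outcome reported_outcome(unsigned packed_version) {
    if (packed_version >= zstd_version(1, 3, 3) &&
        packed_version <= zstd_version(1, 3, 7)) {
        return Outcome::Works;
    }
    if (packed_version == zstd_version(1, 3, 8)) {
        return Outcome::Segfaults;
    }
    if (packed_version == zstd_version(1, 4, 0)) {
        return Outcome::Einval;
    }
    return Outcome::Untested;
}
```

A packaging script could call reported_outcome(zstd_version(1, 3, 8)) to decide whether to skip unittest_compression for a given system zstd.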

While I was unable to find out why zstd 1.3.8 segfaults (it might even be a buggy release), I managed to put together a patch for zstd 1.4.0 compatibility, and it would be great if someone could give some feedback on it.

Also, I'm not quite sure how this could happen at all when there is a bundled/vendored zstd 1.3.2 inside the ceph 14.2.1 tarball. It seems the #include directives for zstd do not point to the vendored zstd for all components.
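
If the suspicion about mixed headers and libraries is right, it could be confirmed by comparing the version compiled in from the headers with the version the loaded library reports. The following is only a sketch of that comparison: version_string and describe_mismatch are hypothetical helpers, and they take plain integers so the example carries no zstd dependency. In a real check one would pass ZSTD_VERSION_NUMBER (from zstd.h, fixed at compile time) and the result of ZSTD_versionNumber() (from the library, at run time).

```cpp
#include <cassert>
#include <string>

// Decode zstd's packed version number (MAJOR*100*100 + MINOR*100 + RELEASE)
// into the usual "MAJOR.MINOR.RELEASE" form.
inline std::string version_string(unsigned packed_version) {
    return std::to_string(packed_version / 10000) + "." +
           std::to_string((packed_version / 100) % 100) + "." +
           std::to_string(packed_version % 100);
}

// Hypothetical helper: returns an empty string when the headers and the
// library agree, otherwise a short diagnostic message.
// header_version  stands in for ZSTD_VERSION_NUMBER (compile time).
// runtime_version stands in for ZSTD_versionNumber() (run time).
inline std::string describe_mismatch(unsigned header_version,
                                     unsigned runtime_version) {
    if (header_version == runtime_version) {
        return "";
    }
    return "built against zstd " + version_string(header_version) +
           " headers but running with zstd " + version_string(runtime_version);
}
```

For example, describe_mismatch(10302, 10308) would flag a component compiled against the bundled 1.3.2 headers but linked to a system 1.3.8 library. Whether that is what actually happens in the 14.2.1 build is an assumption that still needs checking.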

This leads me to a request for a WITH_SYSTEM_ZSTD cmake option, like those that already exist for boost, npm and gtest. If implemented like the other WITH_SYSTEM_* options, it should hopefully avoid unintended interaction with the build environment of the OS and use either the bundled/vendored zstd or the system one, but never a mix of both.

In order to test the zstd 1.4.0 release a bit further and make sure that it is used everywhere throughout the ceph build, I replaced the src/zstd directory of the 14.2.1 tarball with an upstream zstd 1.4.0 tarball, and also had zstd 1.4.0 installed from the Arch Linux repositories.
With my patch the build completes fine and passes the unittest_compression test, leading me to believe it should work.
Was my procedure for validating this correct? (Feedback welcome)

I have attached the ctest log outputs for zstd 1.3.8 and zstd 1.4.0 builds together with my zstd 1.4.0 compatibility patch.

Let me know if I did anything wrong, or if you're missing any further details.

Cheers,
Thore

use-system-zstd-and-fix-zstd-1.4.0-compatbility.patch (1.37 KB) Thore Bödecker, 07/19/2019 11:17 AM

unittest_compression_zstd-1.3.8.log.gz (3.75 KB) Thore Bödecker, 07/19/2019 11:17 AM

unittest_compression_zstd-1.4.0.log.gz (95.9 KB) Thore Bödecker, 07/19/2019 11:17 AM

History

#1 Updated by Brad Hubbard 3 months ago

  • Assignee set to Brad Hubbard
  • Source set to Community (dev)

#2 Updated by Patrick Donnelly 3 months ago

  • Project changed from Ceph to RADOS
  • Target version set to v15.0.0
  • Start date deleted (07/19/2019)
  • Backport set to nautilus
