Bug #52900

segfault on FIPS enabled server as result of EVP_md5 disabled in openssl

Added by Mark Kogan 12 months ago. Updated 15 days ago.

Status: Pending Backport
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source: Q/A
Tags: backport_processed
Backport: pacific octopus
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Reproduces in a vstart environment on:

  $ cat /etc/redhat-release 
  Red Hat Enterprise Linux release 8.4 (Ootpa)

with FIPS enabled:

  $ sudo fips-mode-setup --enable
  $ sudo systemctl reboot

  $ sysctl crypto.fips_enabled
  crypto.fips_enabled = 1
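
For anyone who wants to make the same check from code rather than from the shell, here is a minimal sketch. It assumes the sysctl value is exposed at /proc/sys/crypto/fips_enabled (the usual procfs path for crypto.fips_enabled). The function name `fips_enabled` is hypothetical and not part of Ceph.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not a Ceph API): returns true when the kernel
// reports FIPS mode, i.e. the file contains the integer 1.
// A missing or unreadable file is treated as "FIPS off".
bool fips_enabled(const std::string& path = "/proc/sys/crypto/fips_enabled") {
    std::ifstream in(path);
    int value = 0;
    return (in >> value) && value == 1;
}
```

Calling `fips_enabled()` with no argument reads the default path. The path parameter exists only so the check can be pointed at a test file.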

with the following flow:

./bin/radosgw-admin realm create --rgw-realm=gold
./bin/radosgw-admin zonegroup create --master --rgw-realm=gold --rgw-zonegroup=us --endpoints=http://127.0.0.1:8000
./bin/radosgw-admin zone create --master --endpoints=http://127.0.0.1:8000 --rgw-realm=gold --rgw-zonegroup=us --rgw-zone=us-east

# segfaults every time:
./bin/radosgw-admin period update --commit --rgw-realm=gold --rgw-zonegroup=us --rgw-zone=us-east
...
    -6> 2021-10-11T09:36:12.361+0000 7fa039ffb700 10 monclient: _finish_command 4 = mon:22 unparseable JSON {"prefix": "osd pool set", "pool": "us-east.rgw.meta", "var": "recovery_priority": "5"}
    -5> 2021-10-11T09:36:12.361+0000 7fa102870c00  5 note: GC not initialized
    -4> 2021-10-11T09:36:12.362+0000 7fa102870c00  5 asok(0x55d0eaef2350) register_command sync trace show hook 0x55d0eaf148e0
    -3> 2021-10-11T09:36:12.362+0000 7fa102870c00  5 asok(0x55d0eaef2350) register_command sync trace history hook 0x55d0eaf148e0
    -2> 2021-10-11T09:36:12.362+0000 7fa102870c00  5 asok(0x55d0eaef2350) register_command sync trace active hook 0x55d0eaf148e0
    -1> 2021-10-11T09:36:12.362+0000 7fa102870c00  5 asok(0x55d0eaef2350) register_command sync trace active_short hook 0x55d0eaf148e0
    0> 2021-10-11T09:36:12.367+0000 7fa102870c00 -1 *** Caught signal (Segmentation fault) **
    in thread 7fa102870c00 thread_name:radosgw-admin

    ceph version 16.2.0-325-g0e34bb74700 (0e34bb74700060ebfaa22d99b7d2cdc037b28a57) pacific (stable)
    1: /lib64/libpthread.so.0(+0x12b20) [0x7fa0f7b1fb20]
    NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Call stack in gdb:

multi-thre Thread 0x7ffff7fd82 In: ceph::crypto::ssl::OpenSSLDigest::Update  L201  PC: 0x7ffff4db8cac
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff4db8cac in ceph::crypto::ssl::OpenSSLDigest::Update (this=0x7ffffffd80c0, input=0x2ba2f70 "a122d7e2-650d-4488-b232-2fb1782e1342", length=36) at ../src/common/ceph_crypto.cc:201
#2  0x0000000001f95e9f in gen_short_zone_id (zone_id="a122d7e2-650d-4488-b232-2fb1782e1342") at ../src/rgw/rgw_zone.cc:1908
#3  0x0000000001f8b18a in RGWPeriodMap::update (this=0x7ffffffd9008, zonegroup=..., cct=0x29dd4c0) at ../src/rgw/rgw_zone.cc:1948
#4  0x0000000001f8c9ce in RGWPeriod::update (this=0x7ffffffd8fa8, dpp=0x2948d88 <dpp()::global_dpp>, y=...) at ../src/rgw/rgw_zone.cc:1383
#5  0x00000000012b02e3 in update_period (realm_id="", realm_name="gold", period_id="", period_epoch="", commit=true, remote="", url="", opt_region=std::optional<std::string> [no contained value], access="",
    secret="", formatter=0x2b8e750, force=false) at ../src/rgw/rgw_admin.cc:1776
#6  0x0000000001287f0f in main (argc=7, argv=0x7fffffffdec8) at ../src/rgw/rgw_admin.cc:5933
(gdb) f 1
#1  0x00007ffff4db8cac in ceph::crypto::ssl::OpenSSLDigest::Update (this=0x7ffffffd80c0, input=0x2ba2f70 "a122d7e2-650d-4488-b232-2fb1782e1342", length=36) at ../src/common/ceph_crypto.cc:201

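src/rgw/rgw_zone.cc hashes the zone ID with MD5 ('EVP_md5'), which is forbidden in FIPS mode. Frame #0 at address 0x0 suggests the digest was never initialized, so Update() ended up calling through a null function pointer.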


...
    static uint32_t gen_short_zone_id(const std::string zone_id)
    {
      unsigned char md5[CEPH_CRYPTO_MD5_DIGESTSIZE];
      MD5 hash;
...
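
The backtrace (frame #0 at 0x0, called from OpenSSLDigest::Update) is consistent with a digest whose initialization was refused and whose update callback was left null. The following is only a toy model of that failure pattern, not OpenSSL's or Ceph's actual code. `ToyDigestCtx`, `toy_md5_update`, `toy_init_md5` and `toy_update` are hypothetical names, and the assumption that init fails under FIPS is inferred from the trace rather than confirmed.

```cpp
#include <cstddef>
#include <stdexcept>

// Toy stand-in for a digest context; NOT the real EVP_MD_CTX layout.
struct ToyDigestCtx {
    // Callback installed by a successful init; stays null if init fails.
    int (*update)(ToyDigestCtx*, const void*, std::size_t) = nullptr;
    std::size_t bytes_seen = 0;
};

// Placeholder for the real hashing step: only counts the bytes it is given.
int toy_md5_update(ToyDigestCtx* ctx, const void*, std::size_t len) {
    ctx->bytes_seen += len;
    return 1;
}

// Hypothetical init: refuses MD5 when FIPS mode is on, returns 0 and
// leaves ctx.update null (the state assumed to precede the crash).
int toy_init_md5(ToyDigestCtx& ctx, bool fips_mode) {
    if (fips_mode) return 0;
    ctx.update = toy_md5_update;
    return 1;
}

// Guarded update: reports the problem instead of jumping to address 0.
void toy_update(ToyDigestCtx& ctx, const void* data, std::size_t len) {
    if (ctx.update == nullptr)
        throw std::runtime_error("digest not initialized (MD5 refused?)");
    ctx.update(&ctx, data, len);
}
```

In this model, ignoring the return value of `toy_init_md5` and calling `ctx.update` directly would reproduce the jump to a null address. Checking either the return value or the callback turns the segfault into a reportable error.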


Related issues

Duplicated by rgw - Bug #52799: Segmentation Fault in radosgw-admin period update --commit Duplicate
Copied to rgw - Backport #53007: pacific: segfault on FIPS enabled server as result of EVP_md5 disabled in openssl Resolved
Copied to rgw - Backport #53008: octopus: segfault on FIPS enabled server as result of EVP_md5 disabled in openssl Resolved

History

#1 Updated by Mark Kogan 12 months ago

  • Pull request ID set to 43503

Created a PR (https://github.com/ceph/ceph/pull/43503) that activates the FIPS overriding mechanism in 2 locations (radosgw-admin & radosgw).
With it, it is possible to start a multisite environment with mstart/mrun/mrgw and perform s3cmd put/get/list/delete ops without the segfault above.
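
To illustrate what a per-context FIPS override amounts to, here is a rough sketch of the decision logic. It is a toy policy function, not the code from the PR. `kAllowNonFips`, `is_fips_approved` and `digest_permitted` are hypothetical names. OpenSSL headers do define a context flag called EVP_MD_CTX_FLAG_NON_FIPS_ALLOW, but that the PR relies on exactly that flag is an assumption here.

```cpp
#include <cstdint>
#include <string>

// Hypothetical per-context flag, modeled on the idea of a
// "non-FIPS allow" override; the value is arbitrary.
constexpr std::uint32_t kAllowNonFips = 0x1;

// Toy rule: only MD5 is treated as non-approved in this sketch.
bool is_fips_approved(const std::string& algo) {
    return algo != "md5";
}

// Decides whether a digest may be initialized on a given context.
bool digest_permitted(const std::string& algo, bool fips_mode,
                      std::uint32_t ctx_flags) {
    if (!fips_mode) return true;               // no restriction outside FIPS mode
    if (is_fips_approved(algo)) return true;   // approved digests always pass
    return (ctx_flags & kAllowNonFips) != 0;   // otherwise the context must opt in
}
```

Because the opt-in lives on each context in this model, every call site that creates an MD5 context for non-cryptographic use would need to set it. That fits a fix described in terms of individual locations.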

From a quick browse of the code, there seem to be another 29 locations where the same override may be necessary:

git grep CEPH_CRYPTO_MD5_DIGESTSIZE | grep "E\]" | wc -l
29

#2 Updated by Casey Bodley 12 months ago

  • Duplicated by Bug #52799: Segmentation Fault in radosgw-admin period update --commit added

#3 Updated by Casey Bodley 11 months ago

  • Status changed from In Progress to Pending Backport
  • Backport set to pacific octopus

#4 Updated by Backport Bot 11 months ago

  • Copied to Backport #53007: pacific: segfault on FIPS enabled server as result of EVP_md5 disabled in openssl added

#5 Updated by Backport Bot 11 months ago

  • Copied to Backport #53008: octopus: segfault on FIPS enabled server as result of EVP_md5 disabled in openssl added

#6 Updated by Mark Kogan 8 months ago

Backport PRs have been created and noted in the backport tracker issues above.

#7 Updated by Backport Bot about 2 months ago

  • Tags set to backport_processed

#8 Updated by Peter Razumovsky 15 days ago

Running Ceph v15.2.17 with the pull request merged.

Still observing the issue when running RGW on FIPS-enabled nodes:

epoch 3
fsid b829328d-bbeb-49dd-96e0-5391a7edcf68
last_changed 2022-09-12T00:34:16.272186+0000
created 2022-09-12T00:31:57.273110+0000
min_mon_release 15 (octopus)
election_strategy: 1
0: [v2:192.168.1.54:3300/0,v1:192.168.1.54:6789/0] mon.a
1: [v2:192.168.1.55:3300/0,v1:192.168.1.55:6789/0] mon.b
2: [v2:192.168.1.51:3300/0,v1:192.168.1.51:6789/0] mon.c

   -16> 2022-09-13T13:21:50.877+0000 7f654b142380  5 monclient: authenticate success, global_id 408208
   -15> 2022-09-13T13:21:50.877+0000 7f654b142380 10 monclient: _renew_subs
   -14> 2022-09-13T13:21:50.877+0000 7f654b142380 10 monclient: _send_mon_message to mon.b at v2:192.168.1.55:3300/0
   -13> 2022-09-13T13:21:50.877+0000 7f650bfe7700  4 set_mon_vals no callback set
   -12> 2022-09-13T13:21:50.877+0000 7f654b142380 10 monclient: _renew_subs
   -11> 2022-09-13T13:21:50.877+0000 7f654b142380 10 monclient: _send_mon_message to mon.b at v2:192.168.1.55:3300/0
   -10> 2022-09-13T13:21:50.877+0000 7f654b142380  1 librados: init done
    -9> 2022-09-13T13:21:50.877+0000 7f654b142380  5 asok(0x5574de39d760) register_command cr dump hook 0x5574de50b718
    -8> 2022-09-13T13:21:50.877+0000 7f6535e4c700 10 monclient: get_auth_request con 0x5574de510b20 auth_method 0
    -7> 2022-09-13T13:21:50.877+0000 7f650a7e4700  4 mgrc handle_mgr_map Got map version 49
    -6> 2022-09-13T13:21:50.877+0000 7f650a7e4700  4 mgrc handle_mgr_map Active mgr is now [v2:192.168.1.55:6808/3841094010,v1:192.168.1.55:6809/3841094010]
    -5> 2022-09-13T13:21:50.877+0000 7f650a7e4700  4 mgrc reconnect Starting new session with [v2:192.168.1.55:6808/3841094010,v1:192.168.1.55:6809/3841094010]
    -4> 2022-09-13T13:21:50.877+0000 7f6536e4e700 10 monclient: get_auth_request con 0x7f648802af70 auth_method 0
    -3> 2022-09-13T13:21:50.881+0000 7f653664d700 10 monclient: get_auth_request con 0x5574de520f70 auth_method 0
    -2> 2022-09-13T13:21:50.885+0000 7f6535e4c700 10 monclient: get_auth_request con 0x7f6528034580 auth_method 0
    -1> 2022-09-13T13:21:50.889+0000 7f6536e4e700 10 monclient: get_auth_request con 0x7f6520009b80 auth_method 0
     0> 2022-09-13T13:21:50.893+0000 7f654b142380 -1 *** Caught signal (Segmentation fault) **
 in thread 7f654b142380 thread_name:radosgw-admin

 ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
 1: /lib64/libpthread.so.0(+0x12ce0) [0x7f653f2ffce0]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_mirror
   0/ 5 rbd_replay
   0/ 5 rbd_pwl
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 immutable_obj_cache
   0/ 5 client
   1/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 0 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 1 reserver
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 rgw_sync
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 compressor
   1/ 5 bluestore
   1/ 5 bluefs
   1/ 3 bdev
   1/ 5 kstore
   4/ 5 rocksdb
   4/ 5 leveldb
   4/ 5 memdb
   1/ 5 fuse
   2/ 5 mgr
   1/ 5 mgrc
   1/ 5 dpdk
   1/ 5 eventtrace
   1/ 5 prioritycache
   0/ 5 test
   0/ 5 cephfs_mirror
   0/ 5 cephsqlite
  -2/-2 (syslog threshold)
  99/99 (stderr threshold)
--- pthread ID / name mapping for recent threads ---
  140071944472320 / ms_dispatch
  140071969650432 / io_context_pool
  140072556943104 / ms_dispatch
  140072573728512 / io_context_pool
  140072655824640 / io_context_pool
  140072664217344 / fn_anonymous
  140072672610048 / msgr-worker-2
  140072681002752 / msgr-worker-1
  140072689395456 / msgr-worker-0
  140073028035456 / radosgw-admin
  max_recent       500
  max_new          500
  log_file /var/lib/ceph/crash/2022-09-13T13:21:50.898442Z_98e1b482-99b6-43c5-b5b3-947d03fe191f/log
--- end dump of recent events ---
