Bug #36289
Converting Filestore OSD from leveldb to rocksdb backend on CentOS

Added by David Turner over 5 years ago. Updated over 4 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is a continuation of [1] this thread from the ML. The only difference we've found is that I'm using CentOS and they're using Ubuntu, but the [2] steps we've used to convert the filestore omap backend from leveldb to rocksdb have been the same. I end up with a segfault upon restarting the OSD. [3] Here is a full excerpt from the log, from starting the OSD to the end of the dump.

I tried modifying filestore_rocksdb_options by removing compression=kNoCompression as well as setting it to compression=kSnappyCompression. Leaving it at kNoCompression, or removing it, results in the same segfault as in the previous log. Setting it to kSnappyCompression resulted in [4] this being logged and the OSD simply failing to start instead of segfaulting.
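For reference, the override was applied via ceph.conf along these lines (a sketch; the section placement and the surrounding option string are assumptions on my part, not copied from my actual config):

```ini
# /etc/ceph/ceph.conf -- hypothetical excerpt
[osd]
# Default filestore_rocksdb_options includes compression=kNoCompression;
# I tried removing it entirely and also swapping in kSnappyCompression.
filestore rocksdb options = compression=kSnappyCompression
```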

I don't know where to go from here and I'm hoping that using rocksdb will help resolve the backfilling problems I'm seeing which seem to be related to omap compaction. I can test any new steps on an OSD in this same cluster. Thank you for your help.

[1] https://www.spinics.net/lists/ceph-users/msg47920.html

[2] ■ Stop the OSD
■ mv /var/lib/ceph/osd/ceph-/current/omap /var/lib/ceph/osd/ceph-/omap.orig
■ ulimit -n 65535
■ ceph-kvstore-tool leveldb /var/lib/ceph/osd/ceph-/omap.orig store-copy /var/lib/ceph/osd/ceph-/current/omap 10000 rocksdb
■ ceph-osdomap-tool --omap-path /var/lib/ceph/osd/ceph-/current/omap --command check
■ sed -i 's/leveldb/rocksdb/g' /var/lib/ceph/osd/ceph-/superblock
■ chown -R ceph.ceph /var/lib/ceph/osd/ceph-/current/omap
■ rm -rf /var/lib/ceph/osd/ceph-/omap.orig
■ Start the OSD
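The steps above can be consolidated into a single sketch. Note that the `$ID` placeholder for the OSD number and the `set -e` guard are my additions; the original paths had the OSD id elided, and this assumes the OSD daemon is already stopped:

```
#!/bin/bash
# Sketch of the leveldb -> rocksdb omap conversion from [2].
# $ID is a hypothetical placeholder for the OSD number.
set -euo pipefail
ID=1
OSD=/var/lib/ceph/osd/ceph-$ID

mv "$OSD/current/omap" "$OSD/omap.orig"          # keep the old omap as a fallback
ulimit -n 65535                                  # store-copy opens many sst/ldb files
ceph-kvstore-tool leveldb "$OSD/omap.orig" \
    store-copy "$OSD/current/omap" 10000 rocksdb # copy keys into a new rocksdb store
ceph-osdomap-tool --omap-path "$OSD/current/omap" --command check
sed -i 's/leveldb/rocksdb/g' "$OSD/superblock"   # tell the OSD which backend to open
chown -R ceph.ceph "$OSD/current/omap"
rm -rf "$OSD/omap.orig"                          # only after the check passes
```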

[3] https://gist.github.com/drakonstein/fa3ac0ad9b2ec1389c957f95e05b79ed

[4] 2018-10-01 17:10:37.134930 7f1415dfcd80 0 set rocksdb option compression = kSnappyCompression
2018-10-01 17:10:37.134986 7f1415dfcd80 -1 rocksdb: Invalid argument: Compression type Snappy is not linked with the binary.
2018-10-01 17:10:37.135004 7f1415dfcd80 -1 filestore(/var/lib/ceph/osd/ceph-1) mount(1723): Error initializing rocksdb :
2018-10-01 17:10:37.135020 7f1415dfcd80 -1 osd.1 0 OSD:init: unable to mount object store
2018-10-01 17:10:37.135029 7f1415dfcd80 -1 ** ERROR: osd init failed: (1) Operation not permitted

Actions #1

Updated by Greg Farnum over 5 years ago

  • Project changed from Ceph to RADOS
Actions #2

Updated by David Turner over 5 years ago

This seems to be a problem where rocksdb on CentOS doesn't support snappy compression but the ceph-kvstore-tool is converting leveldb to rocksdb with snappy compression. I haven't found a way to convert the backend without snappy compression. On Ubuntu rocksdb seems to support snappy, but migrating to Ubuntu isn't really viable for the scope of this. RHEL might also work for this, but CentOS definitely does not. Is there a way to update rocksdb on CentOS or possibly change ceph-kvstore-tool to allow disabling snappy while it converts the omap db?
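One way to confirm whether the installed binary was actually linked against snappy (my own suggestion, not something from the thread) is to inspect its shared-library dependencies:

```
# Check whether ceph-osd links libsnappy.
# If this prints nothing, Snappy support was not compiled in.
ldd "$(command -v ceph-osd)" | grep -i snappy
```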

Actions #3

Updated by David Turner over 5 years ago

Looking through the ceph/rocksdb repo I don't see how it's possible for rocksdb to be compiled without snappy support. I am seeing blocked requests and OSDs being marked down during omap compaction daily with leveldb. Is there any way we can get some work into making this process viable on CentOS machines? It's a little over my head at the moment.

Actions #4

Updated by Mohammed Naser over 4 years ago

David:

Did you run into a solution for this? We're seeing similar issues but the only possible alternative seems to be rebuilding the entire OSD.

Thanks,
Mohammed

Actions #5

Updated by David Turner over 4 years ago

We had to scrap the idea of changing the backend and went with upgrading the OSDs to Bluestore instead. Our backfilling issue was so bad that we actually moved most of the data to a new cluster first, to reduce how full each OSD was from 60% to 30%. That allowed the backfilling to happen without OSDs constantly flapping.
