Bug #47446

open

No snap trim progress after removing large snapshots

Added by Patrick Salecker over 3 years ago. Updated almost 3 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):

759622c65fcdf70d4f9a64584a98783ae01635168302f736846d707d878d7395

Crash signature (v2):

Description

I'm having issues cleaning up a cluster with a relatively large CephFS (201M objects, 372 TiB) after snapshots were removed. I'm not sure whether this is really a bug or whether we are using CephFS with snapshots far beyond its limits.

After removing snapshots in large directories (>1 million files, >10 TB), snapshot trimming started and overloaded the cluster. All CPU cores of the OSD servers (8 cores with HT for 16 OSDs) were at 100% and CephFS IO was blocked.

While trimming is running, the OSDs log "removing snap head" about every two minutes. This is the "progress" of one OSD over one hour:

osd.0 pg_epoch: 90915 pg[8.5fs0( v 90915'1555675 (90389'1547511,90915'1555675] local-lis/les=90559/90560 n=188923 ec=70330/68831 lis/c=90559/90559 les/c/f=90560/90560/0 sis=90559) [0,15,10]p0(0) r=0 lpr=90559 crt=90915'1555675 lcod 90915'1555673 mlcod 90915'1555673 active+clean+snaptrim trimq=132] removing snap head
osd.0 pg_epoch: 90926 pg[8.5fs0( v 90926'1555837 (90389'1547711,90926'1555837] local-lis/les=90559/90560 n=188809 ec=70330/68831 lis/c=90559/90559 les/c/f=90560/90560/0 sis=90559) [0,15,10]p0(0) r=0 lpr=90559 crt=90926'1555837 lcod 90925'1555835 mlcod 90925'1555835 active+clean+snaptrim trimq=132] removing snap head

I couldn't find an explanation for these values; maybe you can help me interpret them. The trimq value didn't decrease in 24 hours.
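As a side note, one way to watch whether the per-PG trim queues are shrinking over time is to poll the snaptrimq_len counter from "ceph pg dump". This is only a rough sketch; the JSON layout assumed here (pg_map -> pg_stats[] with "pgid" and "snaptrimq_len" fields) matches recent releases but should be verified against your version's output:

import json
import subprocess

# Poll per-PG snap trim queue lengths via the ceph CLI.
# Assumption: `ceph pg dump --format json` exposes pg_map -> pg_stats[]
# entries carrying "pgid" and "snaptrimq_len" fields.
out = subprocess.check_output(["ceph", "pg", "dump", "--format", "json"])
stats = json.loads(out)["pg_map"]["pg_stats"]

queued = [(pg["pgid"], pg.get("snaptrimq_len", 0)) for pg in stats]
queued = [(pgid, qlen) for pgid, qlen in queued if qlen > 0]
queued.sort(key=lambda item: item[1], reverse=True)

for pgid, qlen in queued[:20]:
    print(f"{pgid}: snaptrimq_len={qlen}")
print(f"{len(queued)} PGs with pending snap trims, "
      f"{sum(q for _, q in queued)} queued snaps in total")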

That's what I tried (a sketch of applying these settings at runtime follows the list):
  • Decreasing osd_max_trimming_pgs / osd_pg_max_concurrent_snap_trims from 2 to 1
    • reduces the OSD server load, but has no noticeable effect on the progress
  • Increasing or decreasing osd_snap_trim_sleep / osd_snap_trim_sleep_hdd
    • had no effect
  • Setting nosnaptrim
    • lets the cluster go back to idle and all PGs to snaptrim_wait.
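For reference, a minimal sketch of how the settings above can be applied at runtime via the ceph CLI, driven from Python. The option names are the ones listed above; the sleep value is only an illustrative example, not a recommendation:

import subprocess

def ceph(*args):
    # Thin wrapper around the ceph CLI; assumes an admin keyring is available.
    subprocess.run(["ceph", *args], check=True)

# Throttle snap trimming cluster-wide (example values only):
ceph("config", "set", "osd", "osd_max_trimming_pgs", "1")
ceph("config", "set", "osd", "osd_pg_max_concurrent_snap_trims", "1")
ceph("config", "set", "osd", "osd_snap_trim_sleep_hdd", "5.0")   # example value

# Pause trimming entirely (PGs fall back to snaptrim_wait) and resume later:
ceph("osd", "set", "nosnaptrim")
# ceph("osd", "unset", "nosnaptrim")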

Sometimes OSDs report crashes while trimming is running, maybe once a day, always a different OSD. You can see one report in the Crash signature field.

It would be helpful to get some feedback on this behaviour. I want to understand whether it is just due to the large number of objects in the snapshots or whether something else is wrong.

Am I right that the trim queue is processed sequentially? So if I delete 10 snapshots (or all of them) in the same directory, 10 entries are added to the trimq and each object in the directory gets checked ten times?

Of course any ideas of how to recover from this would be appreciated. Let me know if you need any other information.


Related issues 1 (1 open, 0 closed)

Related to bluestore - Bug #50511: osd: rmdir .snap/snap triggers snaptrim and then crashes various OSDs (Need More Info)

Actions #1

Updated by Neha Ojha about 3 years ago

  • Related to Bug #50511: osd: rmdir .snap/snap triggers snaptrim and then crashes various OSDs added
Actions #2

Updated by Greg Farnum almost 3 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (OSD)
Actions #3

Updated by Igor Fedotov almost 3 years ago

  • Project changed from RADOS to bluestore
Actions #4

Updated by Dan van der Ster almost 3 years ago

Independent of any bluestore crashes or bugs, you should expect CephFS snap trimming to take a long time when there are millions of files to trim.

AFAIU the objects are trimmed at the rate "osd_target_transaction_size / osd_snap_trim_sleep". On a hybrid OSD cluster, this is 30/2.0 = 15 objects per second, i.e. just under 1 day per million files.
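For illustration, a back-of-the-envelope version of that estimate. This is only a sketch: it assumes trimming proceeds at exactly osd_target_transaction_size / osd_snap_trim_sleep objects per second and ignores PG-level concurrency and disk speed:

osd_target_transaction_size = 30      # default
osd_snap_trim_sleep = 2.0             # effective sleep on a hybrid/HDD OSD
objects_per_snapshot = 1_000_000

rate = osd_target_transaction_size / osd_snap_trim_sleep      # ~15 objects/s
hours = objects_per_snapshot / rate / 3600
print(f"~{rate:.0f} objects/s -> ~{hours:.1f} hours per million objects")  # ~18.5 hours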

Actions #5

Updated by Yaarit Hatuka almost 3 years ago

  • Crash signature (v1) updated (diff)

Moving the crash data into a note, since 'Crash signature (v1)' should hold only the value of the 'stack_sig' key:

{
    "archived": "2020-09-13 07:24:17.436885",
    "backtrace": [
        "(()+0x153c0) [0x7f2d1758e3c0]",
        "(pread64()+0x5f) [0x7f2d1758dc0f]",
        "(KernelDevice::read(unsigned long, unsigned long, ceph::buffer::v15_2_0::list*, IOContext*, bool)+0x12f) [0x555a039a738f]",
        "(BlueFS::_read(BlueFS::FileReader*, BlueFS::FileReaderBuffer*, unsigned long, unsigned long, ceph::buffer::v15_2_0::list*, char*)+0x4b4) [0x555a03956c14]",
        "(BlueRocksRandomAccessFile::Prefetch(unsigned long, unsigned long)+0x2f) [0x555a03983c8f]",
        "(rocksdb::RandomAccessFileReader::Prefetch(unsigned long, unsigned long) const+0x53) [0x555a0406f93d]",
        "(rocksdb::BlockBasedTableIterator<rocksdb::DataBlockIter, rocksdb::Slice>::InitDataBlock()+0x20c) [0x555a0407dc32]",
        "(rocksdb::BlockBasedTableIterator<rocksdb::DataBlockIter, rocksdb::Slice>::FindKeyForward()+0x119) [0x555a0407df85]",
        "(rocksdb::BlockBasedTableIterator<rocksdb::DataBlockIter, rocksdb::Slice>::Next()+0x5c) [0x555a0407c99a]",
        "(rocksdb::IteratorWrapperBase<rocksdb::Slice>::Next()+0x58) [0x555a03fafa4a]",
        "(()+0x16fd311) [0x555a03f8d311]",
        "(rocksdb::IteratorWrapperBase<rocksdb::Slice>::Next()+0x58) [0x555a03fafa4a]",
        "(rocksdb::MergingIterator::Next()+0xfd) [0x555a0408f80b]",
        "(rocksdb::DBIter::FindNextUserEntryInternal(bool, bool)+0xbce) [0x555a03eebc12]",
        "(rocksdb::DBIter::FindNextUserEntry(bool, bool)+0x86) [0x555a03ef455a]",
        "(rocksdb::DBIter::Seek(rocksdb::Slice const&)+0x465) [0x555a03eef633]",
        "(rocksdb::ArenaWrappedDBIter::Seek(rocksdb::Slice const&)+0x2b) [0x555a03ef46d5]",
        "(RocksDBStore::RocksDBWholeSpaceIteratorImpl::lower_bound(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xb7) [0x555a03dec367]",
        "(BlueStore::OmapIteratorImpl::OmapIteratorImpl(boost::intrusive_ptr<BlueStore::Collection>, boost::intrusive_ptr<BlueStore::Onode>, std::shared_ptr<KeyValueDB::IteratorImpl>)+0x1c3) [0x555a03847743]",
        "(BlueStore::get_omap_iterator(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t const&)+0x1e1) [0x555a038762a1]",
        "(OSDriver::get_next(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list>*)+0x4b) [0x555a035473eb]",
        "(SnapMapper::get_next_objects_to_trim(snapid_t, unsigned int, std::vector<hobject_t, std::allocator<hobject_t> >*)+0x5fc) [0x555a0354a20c]",
        "(PrimaryLogPG::AwaitAsyncWork::react(PrimaryLogPG::DoSnapWork const&)+0x1c6) [0x555a03487aa6]",
        "(boost::statechart::simple_state<PrimaryLogPG::AwaitAsyncWork, PrimaryLogPG::Trimming, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x109) [0x555a03516d99]",
        "(boost::statechart::state_machine<PrimaryLogPG::SnapTrimmer, PrimaryLogPG::NotTrimming, std::allocator<boost::statechart::none>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x74) [0x555a034de554]",
        "(PrimaryLogPG::snap_trimmer(unsigned int)+0xe8) [0x555a03441f98]",
        "(ceph::osd::scheduler::PGSnapTrim::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x1f) [0x555a035b7b6f]",
        "(OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x4fb) [0x555a03345c9b]",
        "(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x403) [0x555a03a09f23]",
        "(ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x555a03a0cd24]",
        "(()+0x9609) [0x7f2d17582609]",
        "(clone()+0x43) [0x7f2d174a9103]" 
    ],
    "ceph_version": "15.2.3",
    "crash_id": "2020-09-12T07:06:01.592494Z_b980f2a9-4b77-4c26-a4e8-50d663747ea9",
    "entity_name": "osd.4",
    "os_id": "ubuntu",
    "os_name": "Ubuntu",
    "os_version": "20.04.1 LTS (Focal Fossa)",
    "os_version_id": "20.04",
    "process_name": "ceph-osd",
    "stack_sig": "759622c65fcdf70d4f9a64584a98783ae01635168302f736846d707d878d7395",
    "timestamp": "2020-09-12T07:06:01.592494Z",
    "utsname_hostname": "ceph-osd01",
    "utsname_machine": "x86_64",
    "utsname_release": "5.4.0-45-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#49-Ubuntu SMP Wed Aug 26 13:38:52 UTC 2020" 
}
