Bug #21332 (Closed): OSD Caught signal with StupidAllocator on 12.2.0
Description
This was originally posted to the ceph-users mailing list: https://www.spinics.net/lists/ceph-users/msg38646.html
I am seeing OOM issues on some of the OSD nodes I am testing with BlueStore on 12.2.0, so I decided to try the StupidAllocator to see if it has a smaller memory footprint, by setting the following in my ceph.conf:
bluefs_allocator = stupid
bluestore_cache_size_hdd = 1073741824
bluestore_cache_size_ssd = 1073741824
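As a quick sanity check (my own sketch, not part of the original report; the `[global]` section header is assumed, matching the usual ceph.conf layout), the fragment above parses cleanly with Python's `configparser`, and both cache sizes come out to exactly 1 GiB:

```python
import configparser

# Parse the ceph.conf fragment from the report and confirm the
# cache sizes are exactly 1 GiB (1073741824 = 2**30 bytes).
conf_text = """
[global]
bluefs_allocator = stupid
bluestore_cache_size_hdd = 1073741824
bluestore_cache_size_ssd = 1073741824
"""

cfg = configparser.ConfigParser()
cfg.read_string(conf_text)

assert cfg["global"]["bluefs_allocator"] == "stupid"
for key in ("bluestore_cache_size_hdd", "bluestore_cache_size_ssd"):
    size = int(cfg["global"][key])
    print(key, size / 2**30, "GiB")  # both print 1.0 GiB
```

Note that Ceph's own config parsing is more lenient than this (it treats spaces and underscores in option names interchangeably); this sketch only checks the values.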
With these settings I am no longer seeing OOM errors, but on the node with these settings I have seen multiple Aborted messages in my log files:
grep Abort *log
ceph-osd.10.log:2017-09-09 12:39:28.573034 7f2816f45700 -1 *** Caught signal (Aborted) **
ceph-osd.10.log:     0> 2017-09-09 12:39:28.573034 7f2816f45700 -1 *** Caught signal (Aborted) **
ceph-osd.11.log:2017-09-09 11:39:16.835793 7fdcf6b08700 -1 *** Caught signal (Aborted) **
ceph-osd.11.log:     0> 2017-09-09 11:39:16.835793 7fdcf6b08700 -1 *** Caught signal (Aborted) **
ceph-osd.3.log:2017-09-09 07:10:58.565465 7fa2e96c8700 -1 *** Caught signal (Aborted) **
ceph-osd.3.log:2017-09-09 07:49:56.256899 7f89edf90700 -1 *** Caught signal (Aborted) **
ceph-osd.3.log:     0> 2017-09-09 07:49:56.256899 7f89edf90700 -1 *** Caught signal (Aborted) **
ceph-osd.3.log:2017-09-09 08:13:16.919887 7f82f315e700 -1 *** Caught signal (Aborted) **
ceph-osd.7.log:2017-09-09 09:19:17.281950 7f77824cf700 -1 *** Caught signal (Aborted) **
ceph-osd.7.log:     0> 2017-09-09 09:19:17.281950 7f77824cf700 -1 *** Caught signal (Aborted) **
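To correlate these aborts across OSDs, the grep output above can be parsed mechanically. A hypothetical helper (my own sketch, not part of the report) that pulls the OSD id, timestamp, and thread id out of each line:

```python
import re

# Match grep lines like:
#   ceph-osd.10.log:2017-09-09 12:39:28.573034 7f2816f45700 -1 *** Caught signal (Aborted) **
# including the "0>" variant from the recent-events dump.
ABORT_RE = re.compile(
    r"ceph-osd\.(?P<osd>\d+)\.log:\s*(?:0>\s*)?"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>7f[0-9a-f]+)\s+-1\s+\*\*\* Caught signal \(Aborted\) \*\*"
)

def parse_abort(line):
    """Return {'osd', 'ts', 'thread'} for an abort line, or None."""
    m = ABORT_RE.search(line)
    return m.groupdict() if m else None

line = ("ceph-osd.10.log:2017-09-09 12:39:28.573034 7f2816f45700 "
        "-1 *** Caught signal (Aborted) **")
print(parse_abort(line))
# {'osd': '10', 'ts': '2017-09-09 12:39:28.573034', 'thread': '7f2816f45700'}
```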
All log files can be downloaded from: ftp://ftp.keepertech.com/outgoing/eric/ceph_logs/osd-logs-9-Sep-17.tar.gz
A couple of examples of the Aborts are:
2017-09-09 12:39:27.044074 7f27f5f20700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1504975167035909, "job": 86, "event": "flush_started", "num_memtables": 1, "num_entries": 1015543, "num_deletes": 345553, "memory_usage": 260049176}
2017-09-09 12:39:27.044088 7f27f5f20700  4 rocksdb: [/build/ceph-12.2.0/src/rocksdb/db/flush_job.cc:293] [default] [JOB 86] Level-0 flush table #1825: started
2017-09-09 12:39:28.234651 7f27fff34700 -1 osd.10 pg_epoch: 3521 pg[1.3c7( v 3521'372186 (3456'369135,3521'372186] local-lis/les=3488/3490 n=2842 ec=578/66 lis/c 3488/3488 les/c/f 3490/3500/0 3488/3488/3477) [10,8,16] r=0 lpr=3488 crt=3521'372186 lcod 3521'372184 mlcod 3521'372184 active+clean+snaptrim snaptrimq=[111~2,115~2,13a~1,13c~3]] removing snap head
2017-09-09 12:39:28.573034 7f2816f45700 -1 *** Caught signal (Aborted) **
 in thread 7f2816f45700 thread_name:msgr-worker-2

 ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
 1: (()+0xa562f4) [0x5634e14882f4]
 2: (()+0x11390) [0x7f281b2c5390]
 3: (gsignal()+0x38) [0x7f281a261428]
 4: (abort()+0x16a) [0x7f281a26302a]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x16d) [0x7f281aba384d]
 6: (()+0x8d6b6) [0x7f281aba16b6]
 7: (()+0x8d701) [0x7f281aba1701]
 8: (()+0xb8d38) [0x7f281abccd38]
 9: (()+0x76ba) [0x7f281b2bb6ba]
 10: (clone()+0x6d) [0x7f281a33282d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events ---
-10000> 2017-09-09 12:39:05.878006 7f2817746700  1 -- 172.16.2.133:6804/1327479 <== osd.2 172.16.2.131:6800/1710 37506 ==== osd_repop(mds.0.19:101159707 1.2f1 e3521/3477) v2 ==== 998+0+46 (52256346 0 1629833233) 0x56359eb29000 con 0x563510c02000
 -9999> 2017-09-09 12:39:05.878065 7f2816f45700  1 -- 10.15.2.133:6805/327479 <== mds.0 10.15.2.123:6800/2942775562 55580 ==== osd_op(mds.0.19:101159714 1.ec 1.ffad68ec (undecoded) ondisk+write+known_if_redirected+full_force e3521) v8 ==== 305+0+366 (2883828331 0 2609552142) 0x56355d9eb0c0 con 0x56355f455000
Second example:
2017-09-09 07:10:58.135527 7fa2d56a0700  4 rocksdb: [/build/ceph-12.2.0/src/rocksdb/db/flush_job.cc:264] [default] [JOB 10] Flushing memtable with next log file: 2773
2017-09-09 07:10:58.262058 7fa2d56a0700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1504955458135538, "job": 10, "event": "flush_started", "num_memtables": 1, "num_entries": 935059, "num_deletes": 175946, "memory_usage": 260049888}
2017-09-09 07:10:58.262077 7fa2d56a0700  4 rocksdb: [/build/ceph-12.2.0/src/rocksdb/db/flush_job.cc:293] [default] [JOB 10] Level-0 flush table #2774: started
2017-09-09 07:10:58.565465 7fa2e96c8700 -1 *** Caught signal (Aborted) **
 in thread 7fa2e96c8700 thread_name:bstore_kv_sync

 ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
 1: (()+0xa562f4) [0x5579585362f4]
 2: (()+0x11390) [0x7fa2faa45390]
 3: (gsignal()+0x38) [0x7fa2f99e1428]
 4: (abort()+0x16a) [0x7fa2f99e302a]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x16d) [0x7fa2fa32384d]
 6: (()+0x8d6b6) [0x7fa2fa3216b6]
 7: (()+0x8d701) [0x7fa2fa321701]
 8: (()+0x8d919) [0x7fa2fa321919]
 9: (()+0x1230f) [0x7fa2fb60b30f]
 10: (operator new[](unsigned long)+0x4e7) [0x7fa2fb62f4b7]
 11: (rocksdb::Arena::AllocateNewBlock(unsigned long)+0x70) [0x557958939150]
 12: (rocksdb::Arena::AllocateFallback(unsigned long, bool)+0x45) [0x5579589392d5]
 13: (rocksdb::Arena::AllocateAligned(unsigned long, unsigned long, rocksdb::Logger*)+0x100) [0x557958939460]
 14: (rocksdb::ConcurrentArena::AllocateAligned(unsigned long, unsigned long, rocksdb::Logger*)+0x175) [0x5579588a80d5]
 15: (()+0xe01d03) [0x5579588e1d03]
 16: (()+0xe024dd) [0x5579588e24dd]
 17: (rocksdb::MemTable::Add(unsigned long, rocksdb::ValueType, rocksdb::Slice const&, rocksdb::Slice const&, bool, rocksdb::MemTablePostProcessInfo*)+0x109) [0x5579588a3629]
 18: (rocksdb::MemTableInserter::PutCF(unsigned int, rocksdb::Slice const&, rocksdb::Slice const&)+0x39c) [0x5579588db6cc]
 19: (rocksdb::WriteBatch::Iterate(rocksdb::WriteBatch::Handler*) const+0x5b7) [0x5579588d4fa7]
 20: (rocksdb::WriteBatchInternal::InsertInto(rocksdb::autovector<rocksdb::WriteThread::Writer*, 8ul> const&, unsigned long, rocksdb::ColumnFamilyMemTables*, rocksdb::FlushScheduler*, bool, unsigned long, rocksdb::DB*, bool)+0x14b) [0x5579588d8dcb]
 21: (rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool)+0x14a7) [0x55795899e2c7]
 22: (rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x2a) [0x55795899ed2a]
 23: (RocksDBStore::submit_transaction(std::shared_ptr<KeyValueDB::TransactionImpl>)+0xaf) [0x55795847751f]
 24: (BlueStore::_kv_sync_thread()+0x23dc) [0x55795841150c]
 25: (BlueStore::KVSyncThread::entry()+0xd) [0x55795845453d]
 26: (()+0x76ba) [0x7fa2faa3b6ba]
 27: (clone()+0x6d) [0x7fa2f9ab282d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
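As the NOTE says, fully symbolizing the anonymous frames needs the binary. For frames of the form `(()+0xa562f4) [0x5579585362f4]`, the bracketed value is the absolute runtime address and `0xa562f4` is the offset inside the (position-independent) executable; subtracting one from the other recovers the load base, and the offset can then be fed to a tool like `addr2line -e ceph-osd`. A small sketch of that arithmetic (my own addition, not from the report):

```python
# Recover the ASLR load base from a backtrace frame such as
#   1: (()+0xa562f4) [0x5579585362f4]
# where 0xa562f4 is the offset inside the binary and the bracketed
# value is the absolute address the frame was executing at.
def load_base(offset, absolute):
    """Runtime load base of the binary for this process."""
    return absolute - offset

base = load_base(0xa562f4, 0x5579585362f4)
print(hex(base))  # 0x557957ae0000
```

All the `0x5579...` frames in this trace should yield the same base, which is a quick consistency check that the offsets were copied correctly.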
System info:
ceph -v
ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)

OS: Ubuntu 16.04 using a 4.13.0 kernel

uname -a
Linux ede-c2-osd04 4.13.0-041300-generic #201709031731 SMP Sun Sep 3 21:33:09 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

free
              total        used        free      shared  buff/cache   available
Mem:       16427036    15095176     1190364          44      141496      101164
Swap:        999420      737216      262204

lsblk
NAME             MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
cciss/c0d5       104:80   0 68.3G  0 disk
|-cciss/c0d5p2   104:82   0 68.2G  0 part
`-cciss/c0d5p1   104:81   0  100M  0 part /var/lib/ceph/osd/ceph-10
cciss/c0d3       104:48   0 68.3G  0 disk
|-cciss/c0d3p1   104:49   0  100M  0 part /var/lib/ceph/osd/ceph-7
`-cciss/c0d3p2   104:50   0 68.2G  0 part
cciss/c0d1       104:16   0 68.3G  0 disk
|-cciss/c0d1p1   104:17   0  100M  0 part /var/lib/ceph/osd/ceph-3
`-cciss/c0d1p2   104:18   0 68.2G  0 part
sr0               11:0    1 1024M  0 rom
cciss/c0d4       104:64   0 68.3G  0 disk
|-cciss/c0d4p1   104:65   0  100M  0 part /var/lib/ceph/osd/ceph-11
`-cciss/c0d4p2   104:66   0 68.2G  0 part
cciss/c0d2       104:32   0 68.3G  0 disk
|-cciss/c0d2p2   104:34   0 68.2G  0 part
`-cciss/c0d2p1   104:33   0  100M  0 part /var/lib/ceph/osd/ceph-9
cciss/c0d0       104:0    0 68.3G  0 disk
|-cciss/c0d0p5   104:5    0  976M  0 part [SWAP]
|-cciss/c0d0p1   104:1    0 19.1G  0 part /
|-cciss/c0d0p6   104:6    0 48.3G  0 part /var
`-cciss/c0d0p2   104:2    0    1K  0 part

root@ede-c2-osd04:~# ceph -s
  cluster:
    id:     6d8aae1e-1125-11e5-a708-001b78e265be
    health: HEALTH_WARN
            1 MDSs behind on trimming
  services:
    mon: 3 daemons, quorum ede-c2-mon01,ede-c2-mon02,ede-c2-mon03
    mgr: ede-c2-mon03(active), standbys: ede-c2-mon01, ede-c2-mon02
    mds: cephfs-1/1/1 up {0=ede-c2-mon03=up:active}, 2 up:standby
    osd: 19 osds: 19 up, 19 in
  data:
    pools:   4 pools, 1792 pgs
    objects: 1628k objects, 221 GB
    usage:   719 GB used, 577 GB / 1296 GB avail
    pgs:     1792 active+clean
  io:
    client:  54130 B/s rd, 4254 kB/s wr, 24 op/s rd, 718 op/s wr

ceph health detail
HEALTH_WARN 1 MDSs behind on trimming
MDS_TRIM 1 MDSs behind on trimming
    mdsede-c2-mon03(mds.0): Behind on trimming (49174/30) max_segments: 30, num_segments: 49174
Workload: 3 CephFS clients, all using the kernel CephFS mount (kernel 4.13.0), all creating, checking, and deleting files. A CephFS snapshot is taken hourly, with a cron-based process that deletes snapshots as needed to keep the file system less than 70% full.
NOTE: CephFS snapshots are enabled.
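The cron-driven cleanup described above might look roughly like the following sketch (entirely hypothetical; the paths, the 70% threshold as a fraction, and the per-snapshot space estimate are my assumptions, since the actual script was not posted):

```python
import shutil

THRESHOLD = 0.70  # keep the file system below 70% full

def usage_fraction(path):
    """Fraction of the filesystem at `path` currently in use."""
    total, used, _free = shutil.disk_usage(path)
    return used / total

def snapshots_to_delete(snapshots, frac, per_snapshot_frac):
    """Given snapshots ordered oldest-first and the current usage
    fraction, return the oldest snapshots to remove, assuming each
    deletion frees roughly `per_snapshot_frac` of total capacity."""
    doomed = []
    for snap in snapshots:
        if frac <= THRESHOLD:
            break
        doomed.append(snap)
        frac -= per_snapshot_frac
    return doomed

# At 80% usage, with each snapshot worth ~6% of capacity,
# the two oldest snapshots would be removed.
print(snapshots_to_delete(["hourly-01", "hourly-02", "hourly-03"], 0.80, 0.06))
# ['hourly-01', 'hourly-02']
```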
Updated by Vy Nguyen Tan over 6 years ago
Hello,
I'm running Ceph v12.2.1 and I get the same error when I run fio to benchmark a VM.
fio command: fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
The error message:
/var/log/ceph/ceph-osd.10.log:2017-10-12 12:20:20.636618 7fce53bff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.10.log:     0> 2017-10-12 12:20:20.636618 7fce53bff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.10.log:2017-10-12 12:23:42.006008 7fddc47ff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.10.log:     0> 2017-10-12 12:23:42.006008 7fddc47ff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.10.log:2017-10-12 12:31:20.239551 7f1f9afe2700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.10.log:     0> 2017-10-12 12:31:20.239551 7f1f9afe2700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.10.log:2017-10-12 13:02:59.459229 7f09fa9ea700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.10.log:     0> 2017-10-12 13:02:59.459229 7f09fa9ea700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.10.log:2017-10-12 13:22:15.041223 7fd59b7ff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.10.log:     0> 2017-10-12 13:22:15.041223 7fd59b7ff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.11.log:2017-10-12 12:20:09.965439 7fc321bff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.11.log:     0> 2017-10-12 12:20:09.965439 7fc321bff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.11.log:2017-10-12 19:45:11.961654 7fb420bff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.11.log:     0> 2017-10-12 19:45:11.961654 7fb420bff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.11.log:2017-10-12 21:13:22.619571 7f60e83ff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.11.log:     0> 2017-10-12 21:13:22.619571 7f60e83ff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.12.log:2017-10-12 12:18:54.097896 7f8925fff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.12.log:     0> 2017-10-12 12:18:54.097896 7f8925fff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.12.log:2017-10-12 13:28:49.748914 7fe073fff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.12.log:     0> 2017-10-12 13:28:49.748914 7fe073fff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.12.log:2017-10-12 20:09:04.841494 7f354f3ff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.12.log:     0> 2017-10-12 20:09:04.841494 7f354f3ff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.12.log:2017-10-12 21:14:34.905241 7fe9387ff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.12.log:     0> 2017-10-12 21:14:34.905241 7fe9387ff700 -1 *** Caught signal (Aborted) **
/var/log/ceph/ceph-osd.13.log:2017-10-12 12:23:30.327590 7ff64967b700 -1 *** Caught signal (Aborted) **
More error log:
==== 1034+0+4847 (2071825508 0 960246930) 0x564ba3f2d000 con 0x564b9ee55000
    -4> 2017-10-12 15:30:42.135371 7f633f3eb700  5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 2057'3359, trimmed: , trimmed_dups: , clear_divergent_priors: 0
    -3> 2017-10-12 15:30:42.138388 7f634b7fe700  1 -- 10.10.155.3:6809/3099 --> 10.10.155.1:6801/38411 -- osd_repop_reply(client.94249.1:1857445 1.41 e2057/1418 ondisk, result = 0) v2 -- 0x564b99afca00 con 0
-2> 2017-10-12 15:30:42.160969 7f6322bff700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1507797042160939, "cf_name": "default", "job": 15, "event": "table_file_creation", "file_number": 248, "file_size": 1690356, "table_properties": {"data_size": 1621284, "index_size": 24865, "filter_size": 43226, "raw_key_size": 817387, "raw_average_key_size": 51, "raw_value_size": 1349360, "raw_average_value_size": 85, "num_data_blocks": 400, "num_entries": 15793, "filter_policy_name": "rocksdb.BuiltinBloomFilter", "kDeletedKeys": "6718", "kMergeOperands": "67"}}
-1> 2017-10-12 15:30:42.160989 7f6322bff700 4 rocksdb: [/home/builder/source/ceph-12.2.1/src/rocksdb/db/flush_job.cc:319] [default] [JOB 15] Level-0 flush table #248: 1690356 bytes OK
     0> 2017-10-12 15:30:42.172458 7f6322bff700 -1 *** Caught signal (Aborted) **
in thread 7f6322bff700 thread_name:rocksdb:bg0
ceph version 12.2.1 (fc129ad90a65dc0b419412e77cb85ac230da42a6) luminous (stable)
1: (()+0xa0c554) [0x564b8d219554]
2: (()+0x110c0) [0x7f635f5690c0]
3: (gsignal()+0xcf) [0x7f635e530fcf]
4: (abort()+0x16a) [0x7f635e5323fa]
5: (()+0x2be37) [0x7f635e529e37]
6: (()+0x2bee2) [0x7f635e529ee2]
7: (()+0xed4654) [0x564b8d6e1654]
8: (rocksdb::MemTableListVersion::Unref(rocksdb::autovector<rocksdb::MemTable*, 8ul>*)+0x50) [0x564b8d6e16b0]
9: (rocksdb::SuperVersion::Cleanup()+0x3d) [0x564b8d66453d]
10: (rocksdb::ColumnFamilyData::InstallSuperVersion(rocksdb::SuperVersion*, rocksdb::InstrumentedMutex*, rocksdb::MutableCFOptions const&)+0x18e) [0x564b8d664bfe]
11: (rocksdb::DBImpl::InstallSuperVersionAndScheduleWork(rocksdb::ColumnFamilyData*, rocksdb::SuperVersion*, rocksdb::MutableCFOptions const&)+0x5c) [0x564b8d699dec]
12: (rocksdb::DBImpl::InstallSuperVersionAndScheduleWorkWrapper(rocksdb::ColumnFamilyData*, rocksdb::JobContext*, rocksdb::MutableCFOptions const&)+0x4b) [0x564b8d69d9fb]
13: (rocksdb::DBImpl::FlushMemTableToOutputFile(rocksdb::ColumnFamilyData*, rocksdb::MutableCFOptions const&, bool*, rocksdb::JobContext*, rocksdb::LogBuffer*)+0x906) [0x564b8d69e396]
My ceph.conf:
[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 10.10.155.0/24
fsid = 1a405464-18ab-4ccb-8983-97798cd313eb
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
public network = 10.10.155.0/24
cluster network = 10.10.155.0/24
bluefs_allocator = stupid
bluestore_cache_size_hdd = 1073741824
bluestore_cache_size_ssd = 1073741824
[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring
[mon.proxmox01]
host = proxmox01
mon addr = 10.10.155.1:6789
[mon.proxmox03]
host = proxmox03
mon addr = 10.10.155.3:6789
[mon.proxmox02]
host = proxmox02
mon addr = 10.10.155.2:6789