Bug #15659
Bluestore OSD crash after high amount of writes and object movement/deletion
Status:
Won't Fix
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Description
Hey,
It looks like BlueStore OSDs crash when they hold a large number of objects and we perform operations on them.
We build our own Ceph debs; I can provide them if needed.
Test setup, all Debian Jessie with 10G interfaces:
5 OSD servers, each with 12x 4TB SATA drives.
3 mons.
9 client servers running rados bench.
- BlueStore CRUSH weights are set to 3.63148.
- The 9 client servers start writing to the cluster during the night: rados --name client.benchmarks bench 61200 -p benchmarks write -t 100 --run-name $(hostname -s)-02 --no-cleanup | tee write-02-large.txt
- Writes stop in the morning due to full OSDs. I Ctrl-C the rados bench processes. The cluster is in a clean state (all OSDs are up; there are near-full and full alerts in health). There are about 14 million objects.
- I do some OSD reweighting and notice that load on the OSD servers increases. It looks like "ceph osd df" is not updated to show the usage changes from rebalancing (cannot confirm, I do not have the logs at hand).
- I decide to delete ~1 million objects to free some space, doing it all on mon01.
- "rados -p benchmarks ls | head -n 1000000 | awk '{print "rados -p benchmarks rm "$1}' | sh" is slow, so I decided to run "rados -p benchmarks ls | head -n 1000000 | xargs -n 1 -P 0 rados -p benchmarks rm" instead.
- From the output it looks "faster", but the server becomes very slow, so I restart it. From another mon I see that 14 OSDs are down.
- The OSDs do not start.
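For anyone reproducing this: xargs -P 0 places no limit on concurrency, so the second delete command can fork up to one rados process per object name, which by itself can bring a node to its knees. A dry run of the same pipeline with a bounded worker count (the -n/-P values here are arbitrary choices, and echo stands in for the real "rados -p benchmarks rm", so no cluster is needed):

```shell
# Generate 100 fake object names and feed them through the same
# head | xargs pipeline as in the report, but with a bounded -P.
# `echo` replaces the real `rados -p benchmarks rm`, so this only
# prints the commands that would run: 100 names / 10 per batch = 10 lines.
seq -f "obj-%g" 1000 \
  | head -n 100 \
  | xargs -n 10 -P 4 echo rados -p benchmarks rm \
  | wc -l
```

Batching several names per invocation (-n 10 here) also cuts process-spawn overhead, assuming your rados version accepts multiple object names to "rm"; check that against your build before using it for real deletes.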
Snippet from the osd.22 log:
2016-04-29 07:35:08.437215 7f7a0e475700 0 -- 10.27.6.32:0/53671 >> 10.27.5.9:6827/81830 pipe(0x7f7a5e0b5400 sd=133 :0 s=1 pgs=0 cs=0 l=1 c=0x7f7a76beeb80).fault
2016-04-29 07:35:08.438713 7f7a0e576700 0 -- 10.27.6.32:0/53671 >> 10.27.5.9:6826/81830 pipe(0x7f7a61028000 sd=136 :0 s=1 pgs=0 cs=0 l=1 c=0x7f7a6f0e8480).fault
2016-04-29 07:35:08.508250 7f7a1c353700 0 -- 10.27.6.32:0/53671 >> 10.27.29.21:6814/100587 pipe(0x7f7a5e740000 sd=137 :0 s=1 pgs=0 cs=0 l=1 c=0x7f7a5e254c00).fault
2016-04-29 07:35:08.508300 7f7a1da6a700 0 -- 10.27.6.32:0/53671 >> 10.27.29.21:6815/100587 pipe(0x7f7a620ec000 sd=140 :0 s=1 pgs=0 cs=0 l=1 c=0x7f7a6f221f80).fault
2016-04-29 07:35:10.583748 7f7a1c757700 0 -- 10.27.6.32:6817/53671 >> 10.27.29.21:6829/104065 pipe(0x7f7a66c5f400 sd=141 :6817 s=0 pgs=0 cs=0 l=0 c=0x7f7a6e75cc00).accept connect_seq 5 vs existing 5 state standby
2016-04-29 07:35:10.584021 7f7a1c757700 0 -- 10.27.6.32:6817/53671 >> 10.27.29.21:6829/104065 pipe(0x7f7a66c5f400 sd=141 :6817 s=0 pgs=0 cs=0 l=0 c=0x7f7a6e75cc00).accept connect_seq 6 vs existing 5 state standby
2016-04-29 07:35:10.588915 7f7a422fa700 -1 freelist release bad release 881234739200~4194304 overlaps with 881234739200~4194304
2016-04-29 07:35:10.591998 7f7a422fa700 -1 os/bluestore/FreelistManager.cc: In function 'int FreelistManager::release(uint64_t, uint64_t, KeyValueDB::Transaction)' thread 7f7a422fa700 time 2016-04-29 07:35:10.589011
os/bluestore/FreelistManager.cc: 245: FAILED assert(0 == "bad release overlap")
 ceph version 10.2.0-1 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x82) [0x7f7a528afaf2]
 2: (FreelistManager::release(unsigned long, unsigned long, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x526) [0x7f7a5250dfc6]
 3: (BlueStore::_txc_update_fm(BlueStore::TransContext*)+0x2f4) [0x7f7a524b3214]
 4: (BlueStore::_kv_sync_thread()+0x9b8) [0x7f7a524ebd18]
 5: (BlueStore::KVSyncThread::entry()+0xd) [0x7f7a5250a1dd]
 6: (()+0x80a4) [0x7f7a50ca20a4]
 7: (clone()+0x6d) [0x7f7a4eda787d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
I am attaching logs.