Bug #22467: osd boot has stuck for 10min because of clear_temp_object - bluestore - Ceph

Bug #22467

closed

osd boot has stuck for 10min because of clear_temp_object

Added by zhongshuai huang over 6 years ago. Updated over 6 years ago.

Status: Can't reproduce
Priority: Normal
Assignee:
Target version:
% Done:
Source:
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions: Ceph - v12.2.1
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After rebooting a host, each OSD runs through OSD::init => clear_temp_objects => collection_list => _collection_list => ( it = db->get_iterator(PREFIX_OBJ); it->lower_bound(temp_start_key) ) for each PG. Each it->lower_bound(temp_start_key) call takes about 7 s, so if an OSD has 100 PGs, clear_temp_objects costs about 700 s, i.e. more than 10 minutes.
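The estimate in the description is a plain multiplication, and it can be sketched as below. This is only a rough model: it assumes every PG pays one seek of the same cost and that the seeks run one after another. The function name is hypothetical and is not part of Ceph.

```python
def clear_temp_objects_delay(seconds_per_pg: float, pg_count: int) -> float:
    """Total seconds spent if each PG pays one seek of equal cost, serially.

    Hypothetical helper for illustration; not a Ceph API.
    """
    if seconds_per_pg < 0 or pg_count < 0:
        raise ValueError("seconds_per_pg and pg_count must be non-negative")
    return seconds_per_pg * pg_count


# Figures reported in the description: about 7 s per lower_bound, 100 PGs.
delay = clear_temp_objects_delay(7.0, 100)
print(f"{delay:.0f} s = {delay / 60:.1f} min")  # 700 s = 11.7 min
```

With the reported numbers this gives roughly 700 s, which matches the "10min" order of magnitude in the title.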

Files

新建文本文档.txt (12.7 KB)

zhongshuai huang, 12/18/2017 03:43 AM

Updated by John Spray over 6 years ago

Project changed from Ceph to RADOS

Updated by jinxiang cheng over 6 years ago

I always hit this problem. When I restart the Ceph cluster, an OSD that has 15000 PGs takes almost 25 minutes to reach the up state. Tracing the OSD log, I got this: OSD::init => clear_temp_objects => collection_list => _collection_list, just like yours. Have you found a solution to this problem?

Updated by Josh Durgin over 6 years ago

Project changed from RADOS to bluestore

This is bluestore, right? It sounds like you've got too large/slow a rocksdb - you want that metadata on an ssd.

Updated by jinxiang cheng over 6 years ago

Josh Durgin wrote:

This is bluestore, right? It sounds like you've got too large/slow a rocksdb - you want that metadata on an ssd.

Filestore. Why are these temp objects cleared during the OSD initialization process? This should be optimized for the case where a large number of PGs sit on one OSD.

Updated by Sage Weil over 6 years ago

Status changed from New to Can't reproduce

25,000 is way too many PGs for one OSD. I suspect the problem is that the cache for leveldb or rocksdb is way too small to accommodate that many PGs. Try increasing leveldb_cache_size or rocksdb_cache_size to be 10x bigger (the defaults are only 128MB or 256MB or similar).
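As a rough illustration of the "10x bigger" suggestion, the sketch below scales a cache size and prints it as a config line. It makes several assumptions that are not stated in this thread: the starting value is 128 MiB (check what your cluster actually uses), the value is given in bytes, and the line belongs under an [osd] section of ceph.conf. The helper name is hypothetical.

```python
MIB = 1024 * 1024

# The two option names come from the comment above; nothing else is assumed
# about Ceph's configuration schema.
KNOWN_OPTIONS = ("leveldb_cache_size", "rocksdb_cache_size")


def scaled_cache_conf(option: str, current_bytes: int, factor: int = 10) -> str:
    """Render 'option = value' with the current size multiplied by factor.

    Hypothetical helper for illustration; not a Ceph API.
    """
    if option not in KNOWN_OPTIONS:
        raise ValueError(f"unexpected option: {option}")
    if current_bytes <= 0 or factor <= 0:
        raise ValueError("current_bytes and factor must be positive")
    return f"{option} = {current_bytes * factor}"


# Assumed starting point: 128 MiB. Replace with the value from your cluster.
print("[osd]")  # assumed section placement
print(scaled_cache_conf("rocksdb_cache_size", 128 * MIB))
# rocksdb_cache_size = 1342177280
```

A larger cache uses more memory on every OSD on the host, so the factor should be chosen with the host's RAM in mind.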
