Bug #21180

closed

Bluestore throttler causes down OSD

Added by Henrik Korkuc over 6 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Writing a large amount of data to an EC RBD pool via NBD causes down OSDs and PGs, and a drop in traffic due to the unhealthy cluster. The OSDs themselves are running, and the disks seem to be idle. The logs show "heartbeat_map reset_timeout 'OSD::osd_op_tp thread 0x7f05d4c90700' had timed out after 60" (I increased the timeout from 15 during testing), and slow requests can be observed.

Setting bluestore_throttle_bytes to 0 resolves the issue.

Attaching a gdb thread backtrace of one of the OSDs.


Files

gdb.txt.gz (6.59 KB), Henrik Korkuc, 08/30/2017 03:23 PM

Related issues 1 (0 open, 1 closed)

Related to RADOS - Bug #21171: bluestore: aio submission deadlock (Resolved, Sage Weil, 08/29/2017)

Actions #1

Updated by Henrik Korkuc over 6 years ago

Just an update: sometimes, even with bluestore_throttle_bytes set to 0, I get down OSDs, but this is much rarer and they usually recover.

Actions #2

Updated by Sage Weil over 6 years ago

Can you try setting bluestore_deferred_throttle_bytes = 0 along with bluestore_throttle_bytes = 0 and see if that resolves it? Thanks!
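
For reference, a minimal sketch of how the two settings named above might be written in ceph.conf. The placement under the [osd] section and the need to restart the OSDs for the values to take effect are assumptions, not something stated in this report:

```ini
# Assumed placement: the [osd] section of ceph.conf on each OSD host.
[osd]
# Value reported above to resolve the down OSDs in most cases.
bluestore_throttle_bytes = 0
# Second setting suggested for testing alongside the first.
bluestore_deferred_throttle_bytes = 0
```

This is meant as a diagnostic configuration for narrowing down the problem, not as a recommended permanent setting.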

Actions #3

Updated by Sage Weil over 6 years ago

  • Status changed from New to Need More Info
Actions #4

Updated by Sage Weil over 6 years ago

  • Related to Bug #21171: bluestore: aio submission deadlock added
Actions #5

Updated by Henrik Korkuc over 6 years ago

The pool used for this workload is blocked by a down PG (#21287), but I'll try to reproduce this on the same cluster with a newly created pool.

Actions #6

Updated by Sage Weil over 6 years ago

  • Status changed from Need More Info to Resolved

Pretty sure this was #21171; the fix is merged to master and luminous and will be in 12.2.1.
