Bug #9895

closed

Master/giant branch: OSD deadlock during recovery

Added by Andrey Korolyov over 9 years ago. Updated over 9 years ago.

Status:
Duplicate
Priority:
High
Assignee:
-
Category:
OSD
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Given an eight-OSD, two-node cluster (node01 and node04) with three mons (node01, node04, twin2). The OSDs placed on node04 act as data senders, backfilling the kv-based OSDs on node01. Sometimes one of those senders deadlocks (strace and thread dump attached), which effectively stalls the cluster and makes it stale, because some objects are left with zero online copies.

ceph.conf.gz can be found here: http://lists.ceph.com/pipermail/ceph-users-ceph.com/attachments/20141022/33c6c485/attachment-0003.bin

ceph version 0.86-104-gb05efdd (b05efddb77290b86eb5c150776c761ab84f66f37)


Files

ceph-osd-9-stale.threaddump.txt (57.2 KB): backtrace. Andrey Korolyov, 10/26/2014 11:24 AM
ceph-state-map-osd9-stale.txt (6.95 KB): ceph -s output. Andrey Korolyov, 10/26/2014 11:24 AM
osd9-out.strace.txt (62 Bytes): strace output. Andrey Korolyov, 10/26/2014 11:24 AM

Related issues 1 (0 open, 1 closed)

Is duplicate of Messengers - Bug #9898: osd: fast dispatch deadlock in mark_down (giant). Resolved, 10/26/2014

Actions #1

Updated by Sage Weil over 9 years ago

  • Status changed from New to Duplicate