Bug #759: osd: pgs spend a long time peering when marking osds out - Ceph - Ceph

closed

Added by Sage Weil about 13 years ago. Updated about 13 years ago.

Status:

Resolved

Priority:

High

Assignee:

Samuel Just

Target version:

v0.24.3

Spent time:

1:00 h

Description

On the playground (with lots of data), I see that some PGs spend a long time in peering state after marking an OSD as out. This isn't supposed to happen...

Related issues 1 (0 open — 1 closed)

Updated by Sage Weil about 13 years ago

Status changed from New to In Progress

Updated by Sage Weil about 13 years ago

This appears to be scrubbing-related:

- We get a new osdmap, and handle_osd_map tries to pause the op threadpool.
- A long-running scrub op is still in flight, so the pause has to wait for it to complete.
- Only then does handle_osd_map continue.

During that whole time the main dispatch thread is blocked, so peering gets backed up.
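
The sequence above can be reproduced with a toy model. This is only a sketch, not the OSD code: `PausablePool`, `long_scrub`, and `handle_map` are hypothetical names, and the assumption is that pausing a threadpool means waiting for every in-flight item to finish.

```python
import queue
import threading


class PausablePool:
    """Toy single-worker pool (hypothetical): pause() waits for in-flight work."""

    def __init__(self):
        self._q = queue.Queue()
        self._cond = threading.Condition()
        self._paused = False
        self._busy = 0
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, fn):
        self._q.put(fn)

    def _worker(self):
        while True:
            fn = self._q.get()
            with self._cond:
                # Do not start new work while the pool is paused.
                while self._paused:
                    self._cond.wait()
                self._busy += 1
            try:
                fn()
            finally:
                with self._cond:
                    self._busy -= 1
                    self._cond.notify_all()

    def pause(self):
        with self._cond:
            self._paused = True
            # The caller is stuck here until every in-flight item finishes.
            while self._busy:
                self._cond.wait()

    def unpause(self):
        with self._cond:
            self._paused = False
            self._cond.notify_all()


def demo():
    pool = PausablePool()
    scrub_started = threading.Event()
    scrub_may_finish = threading.Event()
    map_handled = threading.Event()

    def long_scrub():
        # Stands in for a long-running scrub op on the op pool.
        scrub_started.set()
        scrub_may_finish.wait()

    def handle_map():
        # Stands in for the dispatch thread handling a new map.
        pool.pause()
        map_handled.set()
        pool.unpause()

    pool.submit(long_scrub)
    scrub_started.wait()
    dispatcher = threading.Thread(target=handle_map)
    dispatcher.start()
    # The map handler cannot make progress while the scrub is in flight.
    blocked = not map_handled.wait(timeout=0.2)
    scrub_may_finish.set()
    dispatcher.join()
    return blocked, map_handled.is_set()
```

In this model the dispatcher thread makes no progress until the scrub item returns, which mirrors the stall described in the comment.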

Updated by Sage Weil about 13 years ago

Assignee changed from Sage Weil to Samuel Just

The replica scrub needs to go in a different work queue (not op_wq): either scrub_wq or something else that's assigned to the disk threadpool, disk_tp.
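
A rough sketch of why a separate queue helps, under the same assumption that pausing means waiting for in-flight items. `WorkQueue` is a hypothetical toy class, not Ceph's work queue API; only the queue names come from this ticket. For brevity it runs each item in its own thread and does not model holding back new work while paused.

```python
import threading


class WorkQueue:
    """Toy queue (hypothetical): each item runs in its own thread."""

    def __init__(self, name):
        self.name = name
        self._cond = threading.Condition()
        self._busy = 0

    def submit(self, fn):
        with self._cond:
            self._busy += 1

        def run():
            try:
                fn()
            finally:
                with self._cond:
                    self._busy -= 1
                    self._cond.notify_all()

        threading.Thread(target=run, daemon=True).start()

    def pause(self, timeout=None):
        """Return True once nothing is in flight, False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(lambda: self._busy == 0, timeout)


def demo():
    op_wq = WorkQueue("op_wq")
    rep_scrub_wq = WorkQueue("rep_scrub_wq")
    release = threading.Event()

    # The long-running replica scrub is queued on its own work queue.
    rep_scrub_wq.submit(release.wait)

    # Pausing the op queue no longer waits behind the scrub.
    op_paused = op_wq.pause(timeout=1.0)
    # The scrub queue itself is still busy at this point.
    scrub_idle = rep_scrub_wq.pause(timeout=0.1)

    release.set()
    drained = rep_scrub_wq.pause(timeout=5.0)
    return op_paused, scrub_idle, drained
```

Here the pause on `op_wq` succeeds while the scrub is still running on `rep_scrub_wq`, so a map handler that only pauses the op queue would not be held up by the scrub.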

Updated by Samuel Just about 13 years ago

1a01e5ee1b88a217547873296e0371858be13f37 merged in a branch that moves replica scrubbing to rep_scrub_wq, with a new non-osdop message for initiating a replica scrub. Scrub still blocks in the disk_tp while waiting for replicas to scrub, though; working on that now.

Updated by Sage Weil about 13 years ago

Status changed from In Progress to Resolved
