Bug #17859: filestore: can get stuck in an unbounded loop during scrub - Ceph - Ceph

closed

Added by Sage Weil over 7 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Urgent
Category: OSD
Source: Community (user)
Backport: jewel
Severity: 3 - minor

Description

In list_by_hash_{bitwise,nibblewise} there is a condition:

    if (cmp_bitwise(j->second, end) >= 0) {
      if (next)
        *next = ghobject_t::get_max();
      return 0;
    }

If we set *next to max, the caller doesn't break out of its loop and continues iterating over every subsequent subdir in the collection. This can be very slow, or it can make the OSD suicide.

This was added during a refactor in 921c4586f165ce39c17ef8b579c548dc8f6f4500. I'm pretty sure it should just set *next = j->second instead.
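To make the failure mode concrete, here is a minimal sketch of the pattern, not the actual FileStore code. It assumes a toy model where objects are plain ints, kMax stands in for ghobject_t::get_max(), and the caller treats a max cursor as "this subdir is exhausted, keep going". All names here (kMax, Subdir, list_subdir, list_collection, make_collection) are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical stand-in for ghobject_t::get_max().
constexpr int kMax = INT_MAX;

// One leaf subdirectory holding sorted object keys (ints in this toy model).
using Subdir = std::vector<int>;

// List objects in one subdir up to (not including) `end`.
// *next reports where the scan should resume; kMax means
// "nothing more in this subdir, the caller may move on".
void list_subdir(const Subdir& dir, int end, bool buggy,
                 int* next, std::vector<int>* out) {
  for (int obj : dir) {
    if (obj >= end) {
      // Buggy variant: report max, which hides that `end` was reached.
      // Fixed variant: report the object we stopped at.
      *next = buggy ? kMax : obj;
      return;
    }
    out->push_back(obj);
  }
  *next = kMax;  // subdir fully consumed
}

// Walk the subdirs in order and stop as soon as one reports a
// non-max cursor. Returns how many subdirs were visited.
int list_collection(const std::vector<Subdir>& subdirs, int end, bool buggy,
                    std::vector<int>* out) {
  int visited = 0;
  int next = kMax;
  for (const Subdir& dir : subdirs) {
    ++visited;
    list_subdir(dir, end, buggy, &next, out);
    if (next != kMax)
      break;  // only reachable when the callee reports a real cursor
  }
  return visited;
}

// Build n subdirs of 10 consecutive keys each: {0..9}, {10..19}, ...
std::vector<Subdir> make_collection(int n) {
  assert(n >= 0);
  std::vector<Subdir> collection;
  for (int d = 0; d < n; ++d) {
    Subdir s;
    for (int k = 0; k < 10; ++k)
      s.push_back(d * 10 + k);
    collection.push_back(s);
  }
  return collection;
}
```

In this model, calling list_collection(make_collection(1000), 15, ...) returns the same 15 objects either way, but the buggy variant visits all 1000 subdirs while the fixed variant stops after 2. The listing result is correct in both cases; only the amount of work differs, which is consistent with the symptom being slowness rather than wrong output.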

Big thanks to mistur on IRC for helping narrow this down.

Related issues 2 (0 open — 2 closed)

Updated by Sage Weil over 7 years ago

Updated by Loïc Dachary over 7 years ago