Bug #15548

open

scrub ops block forever after pool deletion

Added by Dan van der Ster about 8 years ago. Updated almost 7 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Scrub/Repair
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We deleted a pool and now have 122 ops blocked (on that pool), mostly related to scrubs that were ongoing when the pool was deleted.
The ops seem to be blocked forever, but we have not yet tried restarting the daemons.

Everything is running v10.1.2:

# ceph daemon osd.416 version
{"version":"10.1.2"}

Here are some examples -- mostly scrub related, a few pg_scan:

        {
            "description": "replica scrub(pg: 3.29cs8,from:0'0,to:9502'22028,epoch:11989,start:3:394324fd::::0,end:3:394332b5::::0,chunky:1,deep:1,seed:4294967295,version:6)",
            "initiated_at": "2016-04-20 12:05:37.081236",
            "age": 3778.346564,
            "duration": 3778.346589,
            "type_data": [
                "no flag points reached",
                [
                    {
                        "time": "2016-04-20 12:05:37.081236",
                        "event": "initiated" 
                    }
                ]
            ]
        }

Note that this PG has been deleted from the OSD:

# ls /var/lib/ceph/osd/ceph-416/current/ | grep 3.29
#
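
The `grep 3.29` check above is a loose pattern (it would also match other PG IDs containing that text), so an empty result is conclusive, but a non-empty one would need a closer look. A minimal sketch of an exact check follows. It assumes the FileStore layout in which PG directories under `current/` are named `<pgid>_head` or `<pgid>_TEMP`; the helper name `pg_dirs` is hypothetical and not part of Ceph.

```python
def pg_dirs(entries, pgid):
    """Return the directory entries that belong exactly to the given PG.

    Assumption: entries are named "<pgid>_<suffix>" (for example
    "3.29cs8_head"), so the text before the first underscore is the pgid.
    """
    return [name for name in entries if name.split("_", 1)[0] == pgid]


# Usage on an OSD host (path taken from the report):
#   import os
#   pg_dirs(os.listdir("/var/lib/ceph/osd/ceph-416/current"), "3.29cs8")
```

An empty list means no directory for that exact PG remains on the OSD.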

Here are more examples:

        {
            "description": "osd_sub_op(unknown.0.0:0 3.11cs1 MIN [scrub-reserve] v 0'0 snapset=0=[]:[])",
            "initiated_at": "2016-04-20 12:05:39.159059",
            "age": 4056.589068,
            "duration": 4056.589100,
            "type_data": [
                "no flag points reached",
                [
                    {
                        "time": "2016-04-20 12:05:39.159059",
                        "event": "initiated" 
                    }
                ]
            ]
        }

        {
            "description": "pg_scan(get_digest 3.f14s4 MIN-MIN e 11989\/11989)",
            "initiated_at": "2016-04-20 12:05:41.972286",
            "age": 4224.109025,
            "duration": 4224.109046,
            "type_data": [
                "no flag points reached",
                [
                    {
                        "time": "2016-04-20 12:05:41.972286",
                        "event": "initiated" 
                    }
                ]
            ]
        }

        {
            "description": "MOSDECSubOpReadReply(3.5a9s0 11989 ECSubReadReply(tid=49006, attrs_read=0))",
            "initiated_at": "2016-04-20 12:05:36.610882",
            "age": 4229.379564,
            "duration": 4229.379585,
            "type_data": [
                "no flag points reached",
                [
                    {
                        "time": "2016-04-20 12:05:36.610882",
                        "event": "initiated" 
                    }
                ]
            ]
        }

        {
            "description": "MOSDECSubOpRead(3.851s5 11989 ECSubRead(tid=6309, to_read={3:8a13e8a6:::1103272512@lxc2cert5.33704982462.0000000000000006:head=1048576,1048576,0}, attrs_to_read=))",
            "initiated_at": "2016-04-20 12:05:36.481340",
            "age": 4231.033958,
            "duration": 4231.033986,
            "type_data": [
                "no flag points reached",
                [
                    {
                        "time": "2016-04-20 12:05:36.481340",
                        "event": "initiated" 
                    }
                ]
            ]
        }
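
To see how the stuck ops split by type for the deleted pool, entries like the ones above can be tallied with a short script. This is a minimal sketch, not part of the original report. It assumes the dump has been loaded from JSON into a list of dicts, each with a `description` field as shown above, and that PG IDs take the erasure-coded form `<pool>.<hex seed>s<shard>` seen in these examples. The names `pool_of_op` and `summarize_ops` are hypothetical.

```python
import re
from collections import Counter

# Assumption: PG IDs look like "<pool>.<hex seed>s<shard>", e.g. "3.29cs8".
# The lookarounds avoid matching inside object names or client IDs.
PGID_RE = re.compile(r"(?<![\w.])(\d+)\.([0-9a-f]+)s(\d+)(?![\w.])")


def pool_of_op(description):
    """Return the pool ID of the first PG ID found in an op description, or None."""
    match = PGID_RE.search(description)
    return int(match.group(1)) if match else None


def summarize_ops(ops, pool_id):
    """Count ops that reference pool_id, keyed by op type.

    The op type is taken to be the text before the first "(" in the
    description, e.g. "replica scrub" or "pg_scan".
    """
    counts = Counter()
    for op in ops:
        description = op.get("description", "")
        if pool_of_op(description) == pool_id:
            counts[description.split("(", 1)[0]] += 1
    return counts
```

Applied to the five entries shown in this report with `pool_id=3`, this counts one op each for `replica scrub`, `osd_sub_op`, `pg_scan`, `MOSDECSubOpReadReply`, and `MOSDECSubOpRead`.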
#1

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category changed from OSD to Scrub/Repair
  • Component(RADOS) OSD added