Bug #38431

closed

osd: leaked pg refs on shutdown

Added by Sage Weil about 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/sage-2019-02-21_21:52:17-rados-wip-sage3-testing-2019-02-21-1359-distro-basic-smithi/3622562

w/ pg ref logs

Actions #1

Updated by Sage Weil about 5 years ago

This appears to be as simple as a queued write in progress when shutdown happens:

2019-02-21 23:20:26.232 3b891700  1 PG::get 0x3d2f5f70 1 -> 2
2019-02-21 23:20:26.232 3b891700 20 osd.5 op_wq(6) _process 27.6 to_process <OpQueueItem(27.6 PGOpItem(op=osd_repop(client.5593.0:534 27.6 e172/158) v2) prio 127 cost 839 e172)> waiting <> waiting_peering {}
2019-02-21 23:20:26.232 3b891700 20 osd.5 op_wq(6) _process OpQueueItem(27.6 PGOpItem(op=osd_repop(client.5593.0:534 27.6 e172/158) v2) prio 127 cost 839 e172) pg 0x3d2f5f70
2019-02-21 23:20:26.232 3b891700  1 PG::get 0x3d2f5f70 2 -> 3
...
2019-02-21 23:20:26.258 3b891700  1 PG::get 0x3d2f5f70 3 -> 4
...
2019-02-21 23:20:26.261 3b891700 20 bluestore(/var/lib/ceph/osd/ceph-5) _txc_create osr 0x221b1430 = 0x219161a0 seq 157
...
2019-02-21 23:20:26.269 3b891700 10 osd.5 172 dequeue_op 0x173df170 finish
2019-02-21 23:20:26.269 3b891700  1 RefCountedObject::put 0x26e14100 2 -> 1
2019-02-21 23:20:26.269 3b891700  1 PG::put 0x3d2f5f70 4 -> 3
2019-02-21 23:20:26.269 3b891700  1 PG::put 0x3d2f5f70 3 -> 2
...
2019-02-21 23:20:26.282 15fbf700  0 osd.5 172 shutdown
...
2019-02-21 23:20:26.316 27869700 20 bluestore(/var/lib/ceph/osd/ceph-5) _kv_finalize_thread kv_committed <0x219161a0>
2019-02-21 23:20:26.316 27869700 20 bluestore(/var/lib/ceph/osd/ceph-5) _kv_finalize_thread deferred_stable <>
2019-02-21 23:20:26.316 27869700 10 bluestore(/var/lib/ceph/osd/ceph-5) _txc_state_proc txc 0x219161a0 kv_submitted
2019-02-21 23:20:26.316 27869700 20 bluestore(/var/lib/ceph/osd/ceph-5) _txc_committed_kv txc 0x219161a0
2019-02-21 23:20:26.316 27869700 10 bluestore(/var/lib/ceph/osd/ceph-5) _txc_state_proc txc 0x219161a0 finishing
2019-02-21 23:20:26.316 27869700 20 bluestore(/var/lib/ceph/osd/ceph-5) _txc_finish 0x219161a0 onodes 0x2228ba30
2019-02-21 23:20:26.316 27869700  1 RefCountedObject::get 0x221b1430 2 -> 3
2019-02-21 23:20:26.316 27869700 20 bluestore(/var/lib/ceph/osd/ceph-5) _txc_finish  txc 0x219161a0 done
2019-02-21 23:20:26.316 27869700 20 bluestore(/var/lib/ceph/osd/ceph-5) _txc_finish osr 0x221b1430 q now empty
2019-02-21 23:20:26.316 27869700 10 bluestore(/var/lib/ceph/osd/ceph-5) _txc_release_alloc(queued) 0x219161a0 []

The net is +1 ref on the PG, because the op's bottom half gets queued back in op_wq; but we're shutting down, so it never gets processed and that ref is never dropped.
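
To double-check the +1 figure, the PG::get / PG::put lines quoted above can be tallied per PG pointer. The following is a minimal sketch, not part of Ceph: the helper name net_pg_refs and the regular expression are assumptions based only on the line format visible in this log, and the tally covers only the quoted lines (anything elided with "..." is not counted).

```python
import re
from collections import defaultdict

# Matches lines such as "... 1 PG::get 0x3d2f5f70 1 -> 2".
# The pattern is inferred from the log excerpt in this ticket (an assumption).
REF_LINE = re.compile(r"PG::(?:get|put) (0x[0-9a-f]+) (\d+) -> (\d+)")


def net_pg_refs(lines):
    """Return {pg_pointer: net ref-count change} over the given log lines."""
    net = defaultdict(int)
    for line in lines:
        m = REF_LINE.search(line)
        if m:
            ptr, before, after = m.groups()
            net[ptr] += int(after) - int(before)
    return dict(net)


# The ref-count lines copied from the excerpt above.
excerpt = [
    "2019-02-21 23:20:26.232 3b891700  1 PG::get 0x3d2f5f70 1 -> 2",
    "2019-02-21 23:20:26.232 3b891700  1 PG::get 0x3d2f5f70 2 -> 3",
    "2019-02-21 23:20:26.258 3b891700  1 PG::get 0x3d2f5f70 3 -> 4",
    "2019-02-21 23:20:26.269 3b891700  1 RefCountedObject::put 0x26e14100 2 -> 1",
    "2019-02-21 23:20:26.269 3b891700  1 PG::put 0x3d2f5f70 4 -> 3",
    "2019-02-21 23:20:26.269 3b891700  1 PG::put 0x3d2f5f70 3 -> 2",
]

print(net_pg_refs(excerpt))  # {'0x3d2f5f70': 1}
```

Three gets against two puts leave pg 0x3d2f5f70 at 2 instead of 1, which matches the leaked ref. The RefCountedObject::put line is deliberately ignored, since it refers to a different object.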

Actions #2

Updated by Sage Weil about 5 years ago

  • Status changed from 12 to Fix Under Review
Actions #3

Updated by Sage Weil about 5 years ago

  • Status changed from Fix Under Review to Resolved