Bug #37808

open

osd: osdmap cache weak_refs assert during shutdown

Added by Patrick Donnelly over 5 years ago. Updated over 1 year ago.

Status: New
Priority: Normal
Assignee: -
Category: Correctness/Safety
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport: pacific,quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2019-01-07T06:03:50.666 INFO:tasks.ceph.osd.2.smithi103.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.0.1-2336-g6e9ba79/rpm/el7/BUILD/ceph-14.0.1-2336-g6e9ba79/src/common/shared_cache.hpp: In function 'SharedLRU<K, V>::~SharedLRU() [with K = unsigned int; V = const OSDMap]' thread 9c7f180 time 2019-01-07 06:03:50.661100
2019-01-07T06:03:50.666 INFO:tasks.ceph.osd.2.smithi103.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.0.1-2336-g6e9ba79/rpm/el7/BUILD/ceph-14.0.1-2336-g6e9ba79/src/common/shared_cache.hpp: 121: FAILED ceph_assert(weak_refs.empty())
2019-01-07T06:03:50.696 INFO:tasks.ceph.osd.2.smithi103.stderr: ceph version 14.0.1-2336-g6e9ba79 (6e9ba79dc0e0f1cd25e3405a13182db3bf489ce9) nautilus (dev)
2019-01-07T06:03:50.696 INFO:tasks.ceph.osd.2.smithi103.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x55338c]
2019-01-07T06:03:50.697 INFO:tasks.ceph.osd.2.smithi103.stderr: 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x55355a]
2019-01-07T06:03:50.697 INFO:tasks.ceph.osd.2.smithi103.stderr: 3: (()+0x5fe03e) [0x70603e]
2019-01-07T06:03:50.697 INFO:tasks.ceph.osd.2.smithi103.stderr: 4: (OSDService::~OSDService()+0x163) [0x6b8443]
2019-01-07T06:03:50.697 INFO:tasks.ceph.osd.2.smithi103.stderr: 5: (OSD::~OSD()+0x2de) [0x6bd59e]
2019-01-07T06:03:50.697 INFO:tasks.ceph.osd.2.smithi103.stderr: 6: (OSD::~OSD()+0x9) [0x6bdb89]
2019-01-07T06:03:50.697 INFO:tasks.ceph.osd.2.smithi103.stderr: 7: (main()+0x1675) [0x556ee5]
2019-01-07T06:03:50.697 INFO:tasks.ceph.osd.2.smithi103.stderr: 8: (__libc_start_main()+0xf5) [0xe6dc445]
2019-01-07T06:03:50.698 INFO:tasks.ceph.osd.2.smithi103.stderr: 9: (()+0x53f445) [0x647445]
2019-01-07T06:03:50.698 INFO:tasks.ceph.osd.2.smithi103.stderr:2019-01-07 06:03:50.656 9c7f180 -1 leaked refs:
2019-01-07T06:03:50.699 INFO:tasks.ceph.osd.2.smithi103.stderr:dump_weak_refs 0x138b58f0 weak_refs: 36 = 0x417c2b90 with 1 refs

From: /ceph/teuthology-archive/pdonnell-2019-01-05_03:44:59-multimds-wip-pdonnell-testing-20190105.003832-distro-basic-smithi/3426727/teuthology.log


Related issues 1 (0 open, 1 closed)

Has duplicate Ceph - Bug #38405: FAILED ceph_assert(weak_refs.empty()) on osd shutdown (Duplicate, 02/20/2019)

Actions #1

Updated by Greg Farnum over 5 years ago

  • Subject changed from osd: crash under valgrind during shutdown to osd: osdmap cache weak_refs assert during shutdown

Apparently these are popping up again so we should try and track them down, but note that since https://github.com/ceph/ceph/pull/18201 this won't be visible to users, only testers.

Actions #2

Updated by Josh Durgin about 5 years ago

  • Priority changed from High to Normal
Actions #3

Updated by Sage Weil about 5 years ago

  • Priority changed from Normal to High

/a/sage-2019-02-06_15:56:42-rados-wip-msgr2-peer-addr-distro-basic-smithi/3557216

rados/singleton-flat/valgrind-leaks.yaml

Actions #4

Updated by Sage Weil about 5 years ago

/a/sage-2019-02-13_00:42:53-rados-wip-sage2-testing-2019-02-12-1700-distro-basic-smithi/3581795

Actions #5

Updated by Sage Weil about 5 years ago

  • Status changed from New to 12
Actions #6

Updated by Sage Weil about 5 years ago

/a/sage-2019-02-22_15:54:54-rados-wip-sage2-testing-2019-02-22-0711-distro-basic-smithi/3626248

Actions #7

Updated by Patrick Donnelly about 5 years ago

/ceph/teuthology-archive/pdonnell-2019-02-26_07:49:50-multimds-wip-pdonnell-testing-20190226.051327-distro-basic-smithi/3641128/teuthology.log

Actions #8

Updated by Casey Bodley about 5 years ago

  • Has duplicate Bug #38405: FAILED ceph_assert(weak_refs.empty()) on osd shutdown added
Actions #9

Updated by Casey Bodley about 5 years ago

The rgw/multisite suite has been reproducing this reliably, probably because it runs with 'wait-for-scrub: false'.

Actions #10

Updated by Greg Farnum about 5 years ago

Argh, I can't find it now but I'm pretty sure I saw a PR go by that purported to fix this. The claimed issue is that we're holding refs in our scrub Contexts that don't get cleaned up on shutdown.
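To illustrate the claimed mechanism, here is a minimal sketch of how a queued callback that captures a shared pointer can keep a cache entry alive past shutdown. It is not Ceph code: `TinySharedCache`, `live_refs`, and `leaked_after_shutdown` are hypothetical names, the `std::function` queue stands in for pending scrub Contexts, and the real `SharedLRU` in `shared_cache.hpp` is more involved. The sketch only counts surviving references rather than asserting on them the way the destructor in the trace does.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a shared cache (NOT Ceph's SharedLRU).
// It tracks handed-out values through weak pointers only.
template <typename K, typename V>
class TinySharedCache {
  std::map<K, std::weak_ptr<V>> weak_refs;

public:
  // Store a value and hand back the owning reference.
  std::shared_ptr<V> add(const K& key, V value) {
    auto ptr = std::make_shared<V>(std::move(value));
    weak_refs[key] = ptr;
    return ptr;
  }

  // Drop expired entries and count those that still have an outside owner.
  // A non-zero result at teardown corresponds to the "leaked refs" case.
  std::size_t live_refs() {
    std::size_t n = 0;
    for (auto it = weak_refs.begin(); it != weak_refs.end();) {
      if (it->second.expired()) {
        it = weak_refs.erase(it);
      } else {
        ++n;
        ++it;
      }
    }
    return n;
  }
};

// Simulate shutdown. `pending` plays the role of queued callbacks that were
// never run; each one captured a reference to a cached value.
std::size_t leaked_after_shutdown(bool drain_callbacks) {
  TinySharedCache<unsigned, int> cache;
  std::vector<std::function<void()>> pending;
  {
    auto map36 = cache.add(36, 1234);              // placeholder payload
    pending.push_back([map36] { (void)*map36; });  // callback holds a ref
  }  // the local reference is gone; only the callback keeps the value alive
  if (drain_callbacks) {
    pending.clear();  // destroying the callbacks releases their references
  }
  return cache.live_refs();
}
```

In this sketch, `leaked_after_shutdown(false)` returns 1 because the undrained callback still owns the entry, while `leaked_after_shutdown(true)` returns 0. If the claimed cause is right, the analogous fix would be to discard or complete the pending callbacks before the cache is destroyed.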

Actions #11

Updated by Greg Farnum over 4 years ago

  • Status changed from 12 to Can't reproduce
Actions #12

Updated by Patrick Donnelly about 3 years ago

  • Status changed from Can't reproduce to New
  • Target version deleted (v14.0.0)

/ceph/teuthology-archive/pdonnell-2021-03-02_17:29:53-fs:verify-wip-pdonnell-testing-20210301.234318-distro-basic-smithi/5927727/teuthology.log

Actions #13

Updated by Patrick Donnelly about 3 years ago

/ceph/teuthology-archive/pdonnell-2021-04-15_01:35:57-fs-wip-pdonnell-testing-20210414.230315-distro-basic-smithi/6047728/teuthology.log

Actions #14

Updated by Neha Ojha about 2 years ago

  • Priority changed from High to Normal
Actions #15

Updated by Laura Flores over 1 year ago

  • Backport set to pacific

/a/yuriw-2022-07-27_22:35:53-rados-wip-yuri8-testing-2022-07-27-1303-pacific-distro-default-smithi/6950918

Actions #16

Updated by Radoslaw Zarzynski over 1 year ago

  • Backport changed from pacific to pacific,quincy