Bug #20419

closed

OSD aborts when shutting down

Added by Kefu Chai almost 7 years ago. Updated almost 7 years ago.

Status: Duplicate
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/kchai-2017-06-25_17:19:05-rados-wip-kefu-testing---basic-smithi/1324712/remote/smithi006/log/ceph-osd.3.log.gz

    -2> 2017-06-25 18:02:34.348034 3ade1700 30 osd.3 pg_epoch: 782 pg[0.50( empty local-lis/les=779/781 n=0 ec=671/2 lis/c 779/755 les/c/f 781/757/0 779/779/779) [3] r=0 lpr=779 pi=[755,779)/1 crt=0'0 mlcod 0'0 active+undersized+degraded] lock
    -1> 2017-06-25 18:02:34.349151 3ade1700 -1 osd.3 782 pgid 0.50 has ref count of 2
     0> 2017-06-25 18:02:34.398655 3ade1700 -1 *** Caught signal (Aborted) **
 in thread 3ade1700 thread_name:signal_handler

 ceph version 12.0.3-2085-g2e6b413 (2e6b413379e08b5b838b609705fc07d2a938beaa) luminous (dev)
 1: (()+0x9e903f) [0xaf103f]
 2: (()+0xf370) [0xc7e1370]
 3: (gsignal()+0x37) [0xd65e1d7]
 4: (abort()+0x148) [0xd65f8c8]
 5: (OSD::shutdown()+0x18da) [0x5d27aa]
 6: (OSD::handle_signal(int)+0x11f) [0x5d2bcf]
 7: (SignalHandler::entry()+0x1d7) [0xaf23a7]
 8: (()+0x7dc5) [0xc7d9dc5]
 9: (clone()+0x6d) [0xd72073d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


Related issues 1 (0 open, 1 closed)

Is duplicate of RADOS - Bug #20432: pgid 0.7 has ref count of 2 (Resolved, Kefu Chai, 06/27/2017)

Actions #1

Updated by Kefu Chai almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (OSD)
  • Component(RADOS) OSD added
Actions #3

Updated by Kefu Chai almost 7 years ago

  • Subject changed from leak in osd: in FileStore::mount() to OSD aborts when shutting down
  • Description updated (diff)
Actions #4

Updated by Kefu Chai almost 7 years ago

So somebody was still holding a reference to pg 0.50 when the OSD was trying to kick it during shutdown.
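To illustrate the kind of check behind the "has ref count of 2" message, here is a minimal sketch. It is not Ceph's code: `ToyPG`, `find_leaked_refs` and `demo_shutdown_check` are hypothetical names, and `std::shared_ptr` stands in for whatever reference counting the real PG objects use. The idea is only that, at shutdown, the registry should hold the last reference to each PG, so any count other than 1 means another holder is still around.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a placement group; not Ceph's PG class.
struct ToyPG {
    std::string pgid;
    explicit ToyPG(std::string id) : pgid(std::move(id)) {}
};

// At shutdown the registry is expected to own the only remaining reference
// to each PG. Report every PG whose count differs from 1.
std::vector<std::string> find_leaked_refs(
    const std::map<std::string, std::shared_ptr<ToyPG>>& registry) {
    std::vector<std::string> leaked;
    for (const auto& entry : registry) {
        assert(entry.second != nullptr);  // registry entries must be valid
        const long refs = entry.second.use_count();
        if (refs != 1) {
            leaked.push_back("pgid " + entry.first + " has ref count of " +
                             std::to_string(refs));
        }
    }
    return leaked;
}

// Builds a two-PG registry in which one PG is still referenced elsewhere
// (for example by a queued work item) when the shutdown check runs.
std::vector<std::string> demo_shutdown_check() {
    std::map<std::string, std::shared_ptr<ToyPG>> registry;
    registry["0.4f"] = std::make_shared<ToyPG>("0.4f");
    registry["0.50"] = std::make_shared<ToyPG>("0.50");

    // The extra holder: keeps pg 0.50 alive beyond the registry's reference.
    std::shared_ptr<ToyPG> straggler = registry["0.50"];

    return find_leaked_refs(registry);
}
```

In this sketch the check only reports the offending PG. The log above shows the OSD aborting right after printing the message, which is why a stray reference at shutdown surfaces as a crash rather than a warning.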

Actions #5

Updated by Kefu Chai almost 7 years ago

Sage suspects that this could be a regression: we recently switched the order of shutting down.

Actions #6

Updated by Kefu Chai almost 7 years ago

  • Is duplicate of Bug #20432: pgid 0.7 has ref count of 2 added
Actions #7

Updated by Kefu Chai almost 7 years ago

  • Status changed from New to Duplicate