Bug #37915

osd: Segmentation fault in OpRequest::_unregistered

Added by Patrick Donnelly about 5 years ago. Updated over 4 years ago.

Status: Can't reproduce
Priority: Urgent
Assignee: -
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2019-01-12T15:47:59.044 INFO:tasks.ceph.osd.1.smithi023.stderr:*** Caught signal (Segmentation fault) **
2019-01-12T15:47:59.044 INFO:tasks.ceph.osd.1.smithi023.stderr: in thread 7f91a0a34700 thread_name:tp_osd_tp
2019-01-12T15:47:59.055 INFO:tasks.ceph.osd.1.smithi023.stderr: ceph version 14.0.1-2523-g4d17f2a (4d17f2ac07b3addb1635a75c59c81f5834d9a9a1) nautilus (dev)
2019-01-12T15:47:59.056 INFO:tasks.ceph.osd.1.smithi023.stderr: 1: (()+0x11390) [0x7f91c00f3390]
2019-01-12T15:47:59.056 INFO:tasks.ceph.osd.1.smithi023.stderr: 2: (OpRequest::_unregistered()+0x14b) [0x120c39b]
2019-01-12T15:47:59.056 INFO:tasks.ceph.osd.1.smithi023.stderr: 3: (TrackedOp::put()+0x211) [0x9d6e01]
2019-01-12T15:47:59.056 INFO:tasks.ceph.osd.1.smithi023.stderr: 4: (PGOpItem::~PGOpItem()+0x20) [0xbfda90]
2019-01-12T15:47:59.056 INFO:tasks.ceph.osd.1.smithi023.stderr: 5: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xca9) [0x9791d9]
2019-01-12T15:47:59.056 INFO:tasks.ceph.osd.1.smithi023.stderr: 6: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0xf9a83c]
2019-01-12T15:47:59.057 INFO:tasks.ceph.osd.1.smithi023.stderr: 7: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xf9d9f0]
2019-01-12T15:47:59.057 INFO:tasks.ceph.osd.1.smithi023.stderr: 8: (()+0x76ba) [0x7f91c00e96ba]
2019-01-12T15:47:59.057 INFO:tasks.ceph.osd.1.smithi023.stderr: 9: (clone()+0x6d) [0x7f91bf6f041d]

From: /ceph/teuthology-archive/pdonnell-2019-01-12_05:29:15-fs-wip-pdonnell-testing-20190112.024010-distro-basic-smithi/3452098/teuthology.log

I don't think the PRs in this test run were related, but one did touch TrackedOp.[h,cc]: https://github.com/ceph/ceph/pull/25921

Poking around this code, one thing that sticks out to me is that, even where writes are being done, the RWLock is only ever taken for reading. I'm going to submit a PR to fix that, but the problem may be unrelated.
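
To illustrate the locking concern above, here is a minimal sketch of the general pattern. It is not the Ceph code: `Tracker`, `register_op`, `count`, and `run_demo` are hypothetical names, and `std::shared_mutex` (C++17) stands in for Ceph's RWLock. The point is only that a path which mutates shared state should hold the lock exclusively. With a shared (read) lock, two writers could run at once and race on the container.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a structure guarded by a reader/writer lock.
// Not Ceph's tracker; names are illustrative only.
class Tracker {
public:
  // Mutates shared state, so the lock is taken exclusively.
  // A std::shared_lock here would allow concurrent writers (a data race).
  void register_op(int id) {
    std::unique_lock<std::shared_mutex> l(lock_);
    ops_.push_back(id);
  }

  // Read-only access can share the lock with other readers.
  std::size_t count() const {
    std::shared_lock<std::shared_mutex> l(lock_);
    return ops_.size();
  }

private:
  mutable std::shared_mutex lock_;
  std::vector<int> ops_;
};

// Starts `threads` writers, each registering `per_thread` ops,
// and returns the number of ops recorded once all writers have joined.
std::size_t run_demo(int threads, int per_thread) {
  assert(threads >= 0 && per_thread >= 0);
  Tracker t;
  std::vector<std::thread> workers;
  for (int i = 0; i < threads; ++i) {
    workers.emplace_back([&t, per_thread] {
      for (int j = 0; j < per_thread; ++j) {
        t.register_op(j);
      }
    });
  }
  for (auto& w : workers) {
    w.join();
  }
  return t.count();
}
```

Because every write is serialized by the exclusive lock, `run_demo(4, 1000)` records all 4000 ops. Whether a missing exclusive lock explains the segfault in this report is a separate question, and the sketch does not answer it.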

History

#1 Updated by Greg Farnum over 4 years ago

  • Status changed from New to Can't reproduce

There have been changes to TrackedOps since then.
