Bug #37915

osd: Segmentation fault in OpRequest::_unregistered

Added by Patrick Donnelly 3 months ago.

Status: New
Priority: Urgent
Assignee: -
Category: Correctness/Safety
Target version:
Start date:
Due date:
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:

Description

2019-01-12T15:47:59.044 INFO:tasks.ceph.osd.1.smithi023.stderr:*** Caught signal (Segmentation fault) **
2019-01-12T15:47:59.044 INFO:tasks.ceph.osd.1.smithi023.stderr: in thread 7f91a0a34700 thread_name:tp_osd_tp
2019-01-12T15:47:59.055 INFO:tasks.ceph.osd.1.smithi023.stderr: ceph version 14.0.1-2523-g4d17f2a (4d17f2ac07b3addb1635a75c59c81f5834d9a9a1) nautilus (dev)
2019-01-12T15:47:59.056 INFO:tasks.ceph.osd.1.smithi023.stderr: 1: (()+0x11390) [0x7f91c00f3390]
2019-01-12T15:47:59.056 INFO:tasks.ceph.osd.1.smithi023.stderr: 2: (OpRequest::_unregistered()+0x14b) [0x120c39b]
2019-01-12T15:47:59.056 INFO:tasks.ceph.osd.1.smithi023.stderr: 3: (TrackedOp::put()+0x211) [0x9d6e01]
2019-01-12T15:47:59.056 INFO:tasks.ceph.osd.1.smithi023.stderr: 4: (PGOpItem::~PGOpItem()+0x20) [0xbfda90]
2019-01-12T15:47:59.056 INFO:tasks.ceph.osd.1.smithi023.stderr: 5: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xca9) [0x9791d9]
2019-01-12T15:47:59.056 INFO:tasks.ceph.osd.1.smithi023.stderr: 6: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0xf9a83c]
2019-01-12T15:47:59.057 INFO:tasks.ceph.osd.1.smithi023.stderr: 7: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xf9d9f0]
2019-01-12T15:47:59.057 INFO:tasks.ceph.osd.1.smithi023.stderr: 8: (()+0x76ba) [0x7f91c00e96ba]
2019-01-12T15:47:59.057 INFO:tasks.ceph.osd.1.smithi023.stderr: 9: (clone()+0x6d) [0x7f91bf6f041d]

From: /ceph/teuthology-archive/pdonnell-2019-01-12_05:29:15-fs-wip-pdonnell-testing-20190112.024010-distro-basic-smithi/3452098/teuthology.log

I don't think the PRs in this test run were related, but one did touch TrackedOp.[h,cc]: https://github.com/ceph/ceph/pull/25921

Poking around this code, one thing that stands out to me is that the RWLock is only ever taken for reading, even where writes are being done. I'm going to submit a PR to fix that, but the problem may be unrelated.
