Bug #22329

closed

mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)

Added by Patrick Donnelly over 6 years ago. Updated over 5 years ago.

Status: Closed
Priority: Urgent
Assignee: -
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: fs, multimds
Component(RADOS): Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

See: /ceph/teuthology-archive/pdonnell-2017-12-05_06:48:09-fs-wip-pdonnell-testing-20171205.044504-testing-basic-smithi/1932109/remote/smithi192/log/valgrind/mon.a.log.gz
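
For anyone triaging a log like this, here is a rough sketch of how the error kinds could be tallied. It assumes the log is valgrind XML output (as produced with --xml=yes), where each error carries a <kind> element such as Leak_DefinitelyLost. The function names are hypothetical and not part of teuthology or Ceph. A regex is used instead of an XML parser on the assumption that a log from a killed daemon may be truncated and not well-formed.

```python
import gzip
import io
import re
from collections import Counter

# Matches one valgrind XML error kind, e.g. <kind>Leak_DefinitelyLost</kind>
KIND_RE = re.compile(r"<kind>\s*([A-Za-z_]+)\s*</kind>")


def count_error_kinds(stream):
    """Tally <kind> values from an iterable of text lines (hypothetical helper)."""
    counts = Counter()
    for line in stream:
        counts.update(KIND_RE.findall(line))
    return counts


def count_error_kinds_in_file(path):
    """Open a plain or gzip-compressed log and tally its error kinds."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as fh:
        return count_error_kinds(fh)


if __name__ == "__main__":
    # Made-up sample input for illustration only, not taken from the real log.
    sample = io.StringIO(
        "<error>\n  <kind>Leak_DefinitelyLost</kind>\n</error>\n"
        "<error>\n  <kind>Leak_IndirectlyLost</kind>\n</error>\n"
    )
    for kind, n in sorted(count_error_kinds(sample).items()):
        print(kind, n)
```

Running count_error_kinds_in_file on a downloaded mon.a.log.gz would give a per-kind count, which makes it easier to tell whether two runs hit the same class of error.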

Actions #1

Updated by Greg Farnum over 6 years ago

We'll keep this here in case we see it elsewhere, but the leaks I see are of messages and the AuthSessions associated with them.

So given that you're the first reporter, these are probably leaks coming out of the MDSMonitor portion of the code...

Actions #2

Updated by Patrick Donnelly over 6 years ago

I took a look and didn't see anything going through MDSMonitor.* or FSCommand.*.

It looks like a session leaked during shutdown, but I don't see how that happened, and I don't see a recent commit that is likely to have caused it.

Actions #3

Updated by Patrick Donnelly over 6 years ago

New one:

/ceph/teuthology-archive/yuriw-2018-01-23_20:26:59-multimds-wip-yuri-testing-2018-01-22-1653-luminous-testing-basic-smithi/2103508/remote/smithi202/log/valgrind/mon.a.log.gz

Actions #4

Updated by Patrick Donnelly over 5 years ago

  • Category set to Correctness/Safety
  • Status changed from New to 12
  • Priority changed from Normal to Urgent
  • Target version set to v14.0.0
  • Start date deleted (12/06/2017)
  • ceph-qa-suite multimds added

/ceph/teuthology-archive/pdonnell-2018-09-13_04:59:57-multimds-wip-pdonnell-testing-20180913.024004-distro-basic-smithi/3014920/remote/smithi010/log/valgrind/mon.b.log.gz

Edit: different issue.

Actions #5

Updated by Patrick Donnelly over 5 years ago

  • Subject changed from mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost) to mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost, InvalidFree, InvalidWrite, InvalidRead)

Actions #6

Updated by Greg Farnum over 5 years ago

Still not seeing anything in RADOS runs AFAIK, but I did notice there might be some disparity in coverage....

[13:33:31] <gregsfortytwo> batrick: also it still looks to me like we run the same valgrind config on fs and rados suites and those errors aren't popping up in rados runs
[13:34:08] <gregsfortytwo> ...although, hrm, the fs tests may stress the cluster more than our rados verify suite does
[13:34:10] <gregsfortytwo> interesting
[13:34:51] <gregsfortytwo> joshd: sage: so we only run valgrind against a mon_recovery, rados_api_tests, and rados_cls_all workloads
[13:35:24] <gregsfortytwo> there's also the singleton-flat runs but those are tagged expect_valgrind_errors so I'm not sure if they'd flag anything at us
[13:36:32] <gregsfortytwo> we thrash it a bit but nothing that would cause eg the client to reconnect
[13:37:28] <joshd> no ms error injection?
[13:38:12] <gregsfortytwo> oh, hrm, yeah msgr-failures/few.yaml is present

Actions #7

Updated by Patrick Donnelly over 5 years ago

  • Subject changed from mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost, InvalidFree, InvalidWrite, InvalidRead) to mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)

See also #36040

Actions #8

Updated by Neha Ojha over 5 years ago

Patrick, which set of logs have the (Leak_DefinitelyLost, Leak_IndirectlyLost) errors?

Actions #9

Updated by Patrick Donnelly over 5 years ago

  • Status changed from 12 to Need More Info

Neha Ojha wrote:

Patrick, which set of logs have the (Leak_DefinitelyLost, Leak_IndirectlyLost) errors?

Old logs are deleted and the most recent one is actually a different issue: #36040.

I guess we'll see if this comes back. I'll change this to Need More Info for now.

Actions #10

Updated by Patrick Donnelly over 5 years ago

  • Status changed from Need More Info to 12

/ceph/teuthology-archive/pdonnell-2018-09-23_19:17:54-fs-wip-pdonnell-testing-20180923.160923-distro-basic-smithi/3061717/remote/smithi200/log/valgrind/mon.a.log.gz

Actions #11

Updated by Neha Ojha over 5 years ago

  • Status changed from 12 to Closed

Please feel free to reopen this if it appears again.
