Bug #22329
closed
mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)
Added by Patrick Donnelly over 6 years ago.
Updated over 5 years ago.
Category: Correctness/Safety
ceph-qa-suite: fs, multimds
Description
See: /ceph/teuthology-archive/pdonnell-2017-12-05_06:48:09-fs-wip-pdonnell-testing-20171205.044504-testing-basic-smithi/1932109/remote/smithi192/log/valgrind/mon.a.log.gz
We'll keep this here in case we see it elsewhere, but the leaks I see are of messages and the AuthSessions associated with them.
So given that you're the first reporter, these are probably leaks coming out of the MDSMonitor portion of the code...
I took a look and didn't see anything going through MDSMonitor.* or FSCommand.*.
It looks like a session leaked during shutdown, but I don't see how that happened, nor a recent commit likely to have caused it.
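For anyone triaging logs like the one referenced above, here is a minimal sketch of how the error kinds in a Valgrind log could be tallied. It assumes the log is Valgrind's XML output (which kind names such as Leak_DefinitelyLost suggest), optionally gzip-compressed. The function name `count_error_kinds` is hypothetical and not part of teuthology or Ceph.

```python
import gzip
import xml.etree.ElementTree as ET
from collections import Counter


def count_error_kinds(path):
    """Return a Counter of <kind> values found in <error> elements.

    Assumes `path` is a Valgrind XML log (plain or .gz). If the log is
    truncated, the counts gathered before the parse error are returned.
    """
    opener = gzip.open if path.endswith(".gz") else open
    kinds = Counter()
    with opener(path, "rb") as fh:
        try:
            # iterparse streams the file, so large logs need not fit in memory.
            for _event, elem in ET.iterparse(fh, events=("end",)):
                if elem.tag == "error":
                    kinds[elem.findtext("kind", default="unknown")] += 1
                    elem.clear()  # release the subtree we just counted
        except ET.ParseError:
            # A daemon killed mid-write can leave an unterminated document.
            pass
    return kinds
```

Calling `count_error_kinds("mon.a.log.gz")` on a local copy of the log would return a mapping from kind (for example Leak_DefinitelyLost) to count, which makes it easier to see whether two runs are hitting the same class of error.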
New one:
/ceph/teuthology-archive/yuriw-2018-01-23_20:26:59-multimds-wip-yuri-testing-2018-01-22-1653-luminous-testing-basic-smithi/2103508/remote/smithi202/log/valgrind/mon.a.log.gz
- Category set to Correctness/Safety
- Status changed from New to 12
- Priority changed from Normal to Urgent
- Target version set to v14.0.0
- Start date deleted (12/06/2017)
- ceph-qa-suite multimds added
/ceph/teuthology-archive/pdonnell-2018-09-13_04:59:57-multimds-wip-pdonnell-testing-20180913.024004-distro-basic-smithi/3014920/remote/smithi010/log/valgrind/mon.b.log.gz
Edit: different issue.
- Subject changed from mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost) to mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost, InvalidFree, InvalidWrite, InvalidRead)
Still not seeing anything in RADOS runs AFAIK, but I did notice there might be some disparity in coverage....
[13:33:31] <gregsfortytwo> batrick: also it still looks to me like we run the same valgrind config on fs and rados suites and those errors aren't popping up in rados runs
[13:34:08] <gregsfortytwo> ...although, hrm, the fs tests may stress the cluster more than our rados verify suite does
[13:34:10] <gregsfortytwo> interesting
[13:34:51] <gregsfortytwo> joshd: sage: so we only run valgrind against a mon_recovery, rados_api_tests, and rados_cls_all workloads
[13:35:24] <gregsfortytwo> there's also the singleton-flat runs but those are tagged expect_valgrind_errors so I'm not sure if they'd flag anything at us
[13:36:32] <gregsfortytwo> we thrash it a bit but nothing that would cause eg the client to reconnect
[13:37:28] <joshd> no ms error injection?
[13:38:12] <gregsfortytwo> oh, hrm, yeah msgr-failures/few.yaml is present
- Subject changed from mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost, InvalidFree, InvalidWrite, InvalidRead) to mon: Valgrind: mon (Leak_DefinitelyLost, Leak_IndirectlyLost)
Patrick, which set of logs have the (Leak_DefinitelyLost, Leak_IndirectlyLost) errors?
- Status changed from 12 to Need More Info
Neha Ojha wrote:
Patrick, which set of logs have the (Leak_DefinitelyLost, Leak_IndirectlyLost) errors?
Old logs are deleted and the most recent one is actually a different issue: #36040.
I guess we'll see if this comes back. I'll change this to Need More Info for now.
- Status changed from Need More Info to 12
/ceph/teuthology-archive/pdonnell-2018-09-23_19:17:54-fs-wip-pdonnell-testing-20180923.160923-distro-basic-smithi/3061717/remote/smithi200/log/valgrind/mon.a.log.gz
- Status changed from 12 to Closed
Please feel free to reopen this if it appears again.