Bug #41346
mds: MDSIOContextBase instance leak
Status:
Resolved
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
% Done:
0%
Source:
Community (dev)
Tags:
Backport:
nautilus,mimic
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
From time to time, we see the MDS crash when shutting down:
#0 0x00002adc071141a0 in std::ostream::operator<<(unsigned int) () from /lib64/libstdc++.so.6
#1 0x00002adbfc818169 in operator<< (out=..., e=...) at /home/xuxuehan/ceph-orig/src/common/escape.cc:278
#2 0x00002adbfc6796c3 in ceph::JSONFormatter::print_quoted_string (this=0x7ffd366c74b0, s=...) at /home/xuxuehan/ceph-orig/src/common/Formatter.cc:167
#3 0x00002adbfc679e4f in ceph::JSONFormatter::add_value (this=0x7ffd366c74b0, name=0x55a09b79d518 "entity_name", val=..., quoted=true) at /home/xuxuehan/ceph-orig/src/common/Formatter.cc:280
#4 0x00002adbfc679f19 in ceph::JSONFormatter::dump_string (this=0x7ffd366c74b0, name=0x55a09b79d518 "entity_name", s=...) at /home/xuxuehan/ceph-orig/src/common/Formatter.cc:301
#5 0x000055a09b6913b7 in handle_fatal_signal (signum=11) at /home/xuxuehan/ceph-orig/src/global/signal_handler.cc:200
#6 <signal handler called>
#7 0x00002adc06610c30 in pthread_mutex_lock () from /lib64/libpthread.so.0
#8 0x00002adbfc6eae4f in __gthread_mutex_lock (__mutex=0x38) at /opt/rh/devtoolset-8/root/usr/include/c++/8/x86_64-redhat-linux/bits/gthr-default.h:748
#9 0x00002adbfc6eaecc in std::mutex::lock (this=0x38) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_mutex.h:103
#10 0x00002adbfc6ec765 in std::unique_lock<std::mutex>::lock (this=0x7ffd366cd5a0) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_mutex.h:267
#11 0x00002adbfc6ebed6 in std::unique_lock<std::mutex>::unique_lock (this=0x7ffd366cd5a0, __m=...) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/std_mutex.h:197
#12 0x00002adbfcaef351 in ceph::logging::Log::submit_entry(ceph::logging::Entry&&) (this=0x0, e=<unknown type in /home/xuxuehan/ceph-orig/build/lib/libceph-common.so.0, CU 0x3c1c626, DIE 0x3c6a97c>) at /home/xuxuehan/ceph-orig/src/log/Log.cc:180
#13 0x000055a09b5a31ac in elist<MDSIOContextBase*>::~elist (this=0x55a09bbfe040 <MDSIOContextBase::ctx_list>, __in_chrg=<optimized out>) at /home/xuxuehan/ceph-orig/src/include/elist.h:95
#14 0x00002adc078d5c29 in __run_exit_handlers () from /lib64/libc.so.6
#15 0x00002adc078d5c77 in exit () from /lib64/libc.so.6
#16 0x00002adc078be49c in __libc_start_main () from /lib64/libc.so.6
#17 0x000055a09b0ecb19 in _start ()
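After debugging, we believe this is because MDSIOContextBase::complete does not delete the context during the shutdown process, so instances are still on MDSIOContextBase::ctx_list when that list's destructor runs at exit (frame #13).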
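The suspected pattern can be illustrated with a minimal, self-contained sketch. This is not Ceph's code: LiveList, IOContext, NoopContext, complete_leaky, complete_fixed and leaked_after_shutdown are hypothetical names chosen for illustration, and the registry is a plain std::unordered_set rather than Ceph's elist. It only shows how a self-deleting context that returns early during shutdown stays registered, and how freeing it on every path avoids that.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical stand-in for a list of live contexts (not Ceph's elist).
class LiveList {
public:
  void add(const void* p) { items.insert(p); }
  void remove(const void* p) { items.erase(p); }
  std::size_t size() const { return items.size(); }
private:
  std::unordered_set<const void*> items;
};

// Hypothetical self-deleting I/O completion context.
class IOContext {
public:
  explicit IOContext(LiveList& l) : list(l) { list.add(this); }
  virtual ~IOContext() { list.remove(this); }

  // Leaky variant: returns early during shutdown and never frees itself,
  // so the instance stays registered on the list.
  void complete_leaky(int r, bool shutting_down) {
    if (shutting_down) return;
    finish(r);
    delete this;
  }

  // Fixed variant: skips the callback during shutdown but still frees
  // itself, which also unregisters it from the list.
  void complete_fixed(int r, bool shutting_down) {
    if (!shutting_down) finish(r);
    delete this;
  }

protected:
  virtual void finish(int r) = 0;

private:
  LiveList& list;
};

class NoopContext : public IOContext {
public:
  using IOContext::IOContext;
protected:
  void finish(int) override {}
};

// Completes one context while "shutting down" and reports how many
// instances are still registered afterwards.
std::size_t leaked_after_shutdown(bool use_fix) {
  LiveList list;
  IOContext* c = new NoopContext(list);
  if (use_fix)
    c->complete_fixed(0, true);
  else
    c->complete_leaky(0, true);
  std::size_t leaked = list.size();
  if (leaked != 0) delete c;  // clean up so the demo itself does not leak
  return leaked;
}
```

Calling leaked_after_shutdown(false) leaves one instance registered, while leaked_after_shutdown(true) leaves none. In the real crash the list is a static object, so a non-empty list is only noticed in its destructor during exit(), after other process-wide state such as the log (this=0x0 in frame #12) is no longer usable.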
Updated by Patrick Donnelly over 4 years ago
- Status changed from New to Fix Under Review
- Assignee set to Zheng Yan
- Start date deleted (08/20/2019)
- Backport set to nautilus,mimic
Updated by Patrick Donnelly over 4 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41851: nautilus: mds: MDSIOContextBase instance leak added
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41852: mimic: mds: MDSIOContextBase instance leak added
Updated by Nathan Cutler over 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
Updated by Patrick Donnelly about 4 years ago
- Related to Bug #44295: mds: MDCache.cc: 6400: FAILED ceph_assert(r == 0 || r == -2) added