Bug #50743
*: crash in pthread_getname_np
0%
7a4dc07de98f9fe5951207ab4b4b599270f9729af0b338a4212d3bf9335cf310
8032fa5f1f2107af12b68e6fe2586cc5fe2c1ad99a09d4b1bbfa47a813852039
25c187a52a0bd6185eea9df828445b7bd639a28947da47ae5869697eb9e1ec89
280c1c6704dab0acc19e650eadfb3ccff4f9a44af5cd64b22521731c6c34b09c
ae2a86a8e0ea3bdd90d99e99346e48bfc31a9b6d07ce9185847944499a5c5e86
c557eb5113a5ee72a6fd4463538af6975ade7e57697b33b27f877f3595275368
Description
Sanitized backtrace:
/lib64/libpthread.so.0(
pthread_getname_np()
ceph::logging::Log::dump_recent()
MDSDaemon::respawn()
Context::complete(int)
MDSRank::respawn()
MDSRank::handle_write_error(int)
/usr/bin/ceph-mds(
Context::complete(int)
Finisher::finisher_thread_entry()
/lib64/libpthread.so.0(
clone()
Crash dump sample:
{
    "backtrace": [
        "/lib64/libpthread.so.0(+0x12b20) [0x7fca48244b20]",
        "pthread_getname_np()",
        "(ceph::logging::Log::dump_recent()+0x4b3) [0x7fca4980c643]",
        "(MDSDaemon::respawn()+0x15b) [0x55ca473547fb]",
        "(Context::complete(int)+0xd) [0x55ca4735c9ed]",
        "(MDSRank::respawn()+0x1c) [0x55ca4736289c]",
        "(MDSRank::handle_write_error(int)+0x1a6) [0x55ca47366a26]",
        "/usr/bin/ceph-mds(+0x1bef20) [0x55ca47366f20]",
        "(Context::complete(int)+0xd) [0x55ca4735c9ed]",
        "(Finisher::finisher_thread_entry()+0x1a5) [0x7fca495380b5]",
        "/lib64/libpthread.so.0(+0x814a) [0x7fca4823a14a]",
        "clone()"
    ],
    "ceph_version": "16.2.1",
    "crash_id": "2021-05-08T02:06:04.479049Z_c595a994-0782-4309-9f1b-e2f7dedaf821",
    "entity_name": "mds.20297e33b547c2f168d55f24c5d7328709e9b647",
    "os_id": "centos",
    "os_name": "CentOS Linux",
    "os_version": "8",
    "os_version_id": "8",
    "process_name": "ceph-mds",
    "stack_sig": "7a4dc07de98f9fe5951207ab4b4b599270f9729af0b338a4212d3bf9335cf310",
    "timestamp": "2021-05-08T02:06:04.479049Z",
    "utsname_machine": "x86_64",
    "utsname_release": "5.4.0-70-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#78-Ubuntu SMP Fri Mar 19 13:29:52 UTC 2021"
}
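Comparing the sanitized backtrace with the raw "backtrace" frames in the crash dump suggests that sanitizing removes the load-dependent parts of each frame: the trailing address in square brackets, the "+0x..." offset, and the opening parenthesis around symbol names. The following is only a sketch inferred from these two samples. It is not the telemetry code itself, and the names sanitize_frame and sanitize_backtrace are hypothetical.

```python
import re

# Frames copied from the "backtrace" field of the crash dump sample (abridged).
RAW_FRAMES = [
    "/lib64/libpthread.so.0(+0x12b20) [0x7fca48244b20]",
    "pthread_getname_np()",
    "(ceph::logging::Log::dump_recent()+0x4b3) [0x7fca4980c643]",
    "/usr/bin/ceph-mds(+0x1bef20) [0x55ca47366f20]",
    "clone()",
]


def sanitize_frame(frame: str) -> str:
    """Strip address and offset from one frame (assumed rule, inferred from the samples)."""
    # Drop the trailing " [0xADDRESS]" if present.
    frame = re.sub(r"\s*\[0x[0-9a-fA-F]+\]$", "", frame)
    # Drop the "+0xOFFSET)" suffix if present.
    frame = re.sub(r"\+0x[0-9a-fA-F]+\)$", "", frame)
    # Symbol frames are wrapped in "(...)"; drop the opening parenthesis.
    if frame.startswith("("):
        frame = frame[1:]
    return frame


def sanitize_backtrace(frames: list[str]) -> list[str]:
    """Apply sanitize_frame to every frame, preserving order."""
    return [sanitize_frame(f) for f in frames]


if __name__ == "__main__":
    for line in sanitize_backtrace(RAW_FRAMES):
        print(line)
```

Applied to the frames above, this reproduces the corresponding lines of the sanitized backtrace, which is why crashes from hosts with different load addresses can share one entry. How the stack_sig hash is then derived is not shown in this report and is not assumed here.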
History
#1 Updated by Yaarit Hatuka almost 3 years ago
#2 Updated by Neha Ojha almost 3 years ago
- Project changed from RADOS to CephFS
#3 Updated by Patrick Donnelly almost 3 years ago
- Subject changed from crash: /lib64/libpthread.so.0( to mds: crash in pthread_getname_np
- Status changed from New to Need More Info
How do we know what the signal number was? Not clear to me what to do with this. I don't see anything obviously wrong with the call to pthread_getname_np in Log::dump_recent.
#4 Updated by Yaarit Hatuka almost 3 years ago
Hi Patrick,
We don't have the signal number yet in the telemetry crash reports.
You can see other crash events with the same signature here:
http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=8032fa5f1f2107af12b68e6fe2586cc5fe2c1ad99a09d4b1bbfa47a813852039
in the Crashes table:
http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?viewPanel=26&orgId=1&var-sig_v2=8032fa5f1f2107af12b68e6fe2586cc5fe2c1ad99a09d4b1bbfa47a813852039
If it seems like this is not a bug after all, you can change the status to Rejected or Won't Fix.
#5 Updated by Patrick Donnelly almost 3 years ago
Yaarit Hatuka wrote:
Hi Patrick,
We don't have the signal number yet in the telemetry crash reports.
You can see other crash events with the same signature here:
http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=8032fa5f1f2107af12b68e6fe2586cc5fe2c1ad99a09d4b1bbfa47a813852039
in the Crashes table:
http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?viewPanel=26&orgId=1&var-sig_v2=8032fa5f1f2107af12b68e6fe2586cc5fe2c1ad99a09d4b1bbfa47a813852039
If it seems like this is not a bug after all, you can change the status to Rejected or Won't Fix.
Sorry, how do you conclude it's not a bug after all?
#6 Updated by Yaarit Hatuka almost 3 years ago
Oh, I mean in general, not necessarily in this case.
This issue was opened automatically by a telemetry-to-redmine bot that I'm working on, which will import the telemetry crash reports into Redmine. It was opened under my user, but future issues will be opened by the telemetry bot user.
#7 Updated by Patrick Donnelly almost 3 years ago
- Project changed from CephFS to Ceph
- Subject changed from mds: crash in pthread_getname_np to *: crash in pthread_getname_np
- Target version set to v17.0.0
- Source changed from Telemetry to Q/A
- Backport set to pacific
2021-06-01T05:31:28.917 INFO:tasks.ceph.mon.a.smithi053.stderr:*** Caught signal (Segmentation fault) **
2021-06-01T05:31:28.917 INFO:tasks.ceph.mon.a.smithi053.stderr: in thread 7f819c376700 thread_name:ceph-mon
2021-06-01T05:31:28.917 INFO:tasks.ceph.mon.a.smithi053.stderr: ceph version 16.2.4-225-gf9084200 (f908420004cc81a30edb2b252b4d92f50c526280) pacific (stable)
2021-06-01T05:31:28.917 INFO:tasks.ceph.mon.a.smithi053.stderr: 1: /lib64/libpthread.so.0(+0x12dc0) [0x7f8191010dc0]
2021-06-01T05:31:28.918 INFO:tasks.ceph.mon.a.smithi053.stderr: 2: pthread_getname_np()
2021-06-01T05:31:28.918 INFO:tasks.ceph.mon.a.smithi053.stderr: 3: (ceph::logging::Log::dump_recent()+0x4b3) [0x7f81938845a3]
2021-06-01T05:31:28.918 INFO:tasks.ceph.mon.a.smithi053.stderr: 4: ceph-mon(+0x53110b) [0x5594d213110b]
2021-06-01T05:31:28.919 INFO:tasks.ceph.mon.a.smithi053.stderr: 5: /lib64/libpthread.so.0(+0x12dc0) [0x7f8191010dc0]
2021-06-01T05:31:28.919 INFO:tasks.ceph.mon.a.smithi053.stderr: 6: gsignal()
2021-06-01T05:31:28.919 INFO:tasks.ceph.mon.a.smithi053.stderr: 7: abort()
2021-06-01T05:31:28.920 INFO:tasks.ceph.mon.a.smithi053.stderr: 8: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f819351059d]
2021-06-01T05:31:28.920 INFO:tasks.ceph.mon.a.smithi053.stderr: 9: /usr/lib64/ceph/libceph-common.so.2(+0x276766) [0x7f8193510766]
2021-06-01T05:31:28.920 INFO:tasks.ceph.mon.a.smithi053.stderr: 10: (Monitor::~Monitor()+0xb35) [0x5594d1ee0995]
2021-06-01T05:31:28.921 INFO:tasks.ceph.mon.a.smithi053.stderr: 11: (Monitor::~Monitor()+0xd) [0x5594d1ee09ed]
2021-06-01T05:31:28.921 INFO:tasks.ceph.mon.a.smithi053.stderr: 12: main()
2021-06-01T05:31:28.921 INFO:tasks.ceph.mon.a.smithi053.stderr: 13: __libc_start_main()
2021-06-01T05:31:28.922 INFO:tasks.ceph.mon.a.smithi053.stderr: 14: _start()
2021-06-01T05:31:28.980 INFO:tasks.ceph.mgr.z.smithi179.stderr:daemon-helper: command crashed with signal 15
From: /ceph/teuthology-archive/teuthology-2021-06-01_04:17:03-fs-pacific-distro-basic-smithi/6144511/teuthology.log
Test failed for other reasons but we finally have this showing up in QA.
#8 Updated by Neha Ojha almost 3 years ago
- Project changed from Ceph to RADOS
#9 Updated by Telemetry Bot about 2 years ago
#10 Updated by Telemetry Bot about 2 years ago
- Crash signature (v1) updated (diff)
- Affected Versions v15.2.10, v15.2.11, v15.2.12, v15.2.13, v15.2.15, v15.2.4, v15.2.5, v15.2.7, v15.2.8, v16.2.4, v16.2.5, v16.2.6, v16.2.7 added
#11 Updated by Radoslaw Zarzynski almost 2 years ago
- Crash signature (v1) updated (diff)
Still Need More Info, as the logs are no longer available after all these months.