Bug #23600

assert(0 == "BUG!") attached in EventCenter::create_file_event

Added by Jason liu almost 6 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version:
% Done: 0%
Source: Development
Tags: core,msg/async,posix
Backport: luminous,mimic
Regression: No
Severity: 1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite: fs
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When I stop and start the OSD, I occasionally trigger this error. The environment is as follows: 1 MON, 2 MDSs, 30 OSDs. I got the following backtrace:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/ceph-osd -f --cluster ceph --id 7 --setuser ceph --setgroup ceph'.
Program terminated with signal 6, Aborted.
#0 0x00007fe143cddfcb in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
37 return INLINE_SYSCALL (tgkill, 3, pid, THREAD_GETMEM (THREAD_SELF, tid),
Missing separate debuginfos, use: debuginfo-install boost-iostreams-1.53.0-25.el7.x86_64 boost-random-1.53.0-25.el7.x86_64 boost-system-1.53.0-25.el7.x86_64 boost-thread-1.53.0-25.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 cryptopp-5.6.2-9.el7.x86_64 fuse-libs-2.9.2-6.el7.x86_64 leveldb-1.12.0-5.el7.x86_64 libaio-0.3.109-13.el7.x86_64 libblkid-2.23.2-26.el7.x86_64 libgcc-4.8.5-4.el7.x86_64 libibverbs-1.1.8mlnx1-OFED.3.2.1.5.0.32200.x86_64 libnl-1.1.4-3.el7.x86_64 librdmacm-1.0.21mlnx-OFED.3.1.1.5.5.32200.x86_64 libstdc++-4.8.5-4.el7.x86_64 libuuid-2.23.2-26.el7.x86_64 lttng-ust-2.4.1-1.el7.x86_64 snappy-1.1.0-3.el7.x86_64 userspace-rcu-0.7.16-1.el7.x86_64 zlib-1.2.7-15.el7.x86_64
(gdb) bt
#0 0x00007fe143cddfcb in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1 0x00007fe1455af383 in reraise_fatal (signum=signum@entry=6) at global/signal_handler.cc:170
#2 0x00007fe1455afbeb in handle_fatal_signal (signum=6) at global/signal_handler.cc:250
#3 <signal handler called>
#4 0x00007fe1417115f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#5 0x00007fe141712ce8 in __GI_abort () at abort.c:90
#6 0x00007fe1456cfa65 in ceph::__ceph_assert_fail (assertion=assertion@entry=0x7fe1459b7de4 "0 == \"BUG!\"", file=file@entry=0x7fe1459b7d5a "msg/async/Event.cc", line=line@entry=236,
func=func@entry=0x7fe1459b8280 <EventCenter::create_file_event(int, int, EventCallback*)::__PRETTY_FUNCTION__> "int EventCenter::create_file_event(int, int, EventCallbackRef)") at common/assert.cc:80
#7 0x00007fe1457cd6b4 in EventCenter::create_file_event (this=0x7fe14f7e01b0, fd=9, mask=mask@entry=1, ctxt=0x7fe14f840c50) at msg/async/Event.cc:236
#8 0x00007fe1457aa6db in operator() (__closure=0x7ffd1d049ce8) at msg/async/AsyncMessenger.cc:154
#9 EventCenter::C_submit_event<Processor::start()::__lambda3>::do_request(uint64_t) (this=0x7ffd1d049c80, id=<optimized out>) at msg/async/Event.h:229
#10 0x00007fe1457cc645 in EventCenter::process_events (this=this@entry=0x7fe14f7e01b0, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000) at msg/async/Event.cc:445
#11 0x00007fe145814dfa in NetworkStack::__lambda0::operator() (__closure=0x7fe14f82a140) at msg/async/Stack.cc:76
#12 0x00007fe14206a220 in ?? () from /usr/lib64/libstdc++.so.6
#13 0x00007fe143cd6dc5 in start_thread (arg=0x7fe13f633700) at pthread_create.c:308
#14 0x00007fe1417d221d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb)
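
A note on frames #7 and #8: the abort is raised from create_file_event while it registers fd=9 with mask=1 on the event-loop thread. The report does not show the code at Event.cc:236, so the following is only a minimal sketch, assuming a Linux epoll-style driver, of how a registration call can fail and how a wrapper that treats any failure as fatal would then abort. The names try_register and register_or_abort are hypothetical and are not Ceph functions.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper (not Ceph code): try to register fd for read events
// on the epoll instance epfd. Returns 0 on success or -errno on failure.
int try_register(int epfd, int fd) {
  epoll_event ev{};
  ev.events = EPOLLIN;
  ev.data.fd = fd;
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
    return -errno;
  }
  return 0;
}

// Hypothetical wrapper (an assumption about the general pattern, not the
// actual code at Event.cc:236): the caller cannot handle an error, so any
// failed registration is treated as a fatal bug.
void register_or_abort(int epfd, int fd) {
  int r = try_register(epfd, fd);
  assert(r == 0 && "registration failed");
  (void)r;  // silence unused-variable warnings when NDEBUG is defined
}
```

With this sketch, registering the same descriptor twice yields -EEXIST, and registering a descriptor that has already been closed yields -EBADF. Whether either case applies to this crash is not established by the report.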


Related issues

Copied to Messengers - Backport #40383: luminous: assert(0 == "BUG!") attached in EventCenter::create_file_event Rejected
Copied to Messengers - Backport #40384: mimic: assert(0 == "BUG!") attached in EventCenter::create_file_event Rejected

History

#1 Updated by Kefu Chai almost 6 years ago

  • Status changed from New to Fix Under Review
  • Backport set to luminous

#2 Updated by Sage Weil about 5 years ago

  • Backport changed from luminous to luminous,mimic

#3 Updated by Greg Farnum about 5 years ago

  • Project changed from Ceph to Messengers
  • Category deleted (msgr)

#5 Updated by xie xingguo almost 5 years ago

  • Status changed from Fix Under Review to Pending Backport

#6 Updated by Nathan Cutler almost 5 years ago

  • Copied to Backport #40383: luminous: assert(0 == "BUG!") attached in EventCenter::create_file_event added

#7 Updated by Nathan Cutler almost 5 years ago

  • Copied to Backport #40384: mimic: assert(0 == "BUG!") attached in EventCenter::create_file_event added

#8 Updated by Nathan Cutler about 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".

#9 Updated by Peter Lieven over 2 years ago

Looking at the source it seems that the fix was never merged into luminous, mimic or nautilus. It was merged into 15.1.0 first.
Given the severity it might be a reason to queue it at least for Nautilus if there will ever be a 14.2.23.

#10 Updated by Radoslaw Zarzynski over 2 years ago

Peter Lieven wrote:

Looking at the source it seems that the fix was never merged into luminous, mimic or nautilus. It was merged into 15.1.0 first.
Given the severity it might be a reason to queue it at least for Nautilus if there will ever be a 14.2.23.

Yeah, it's strange. The backport tickets were created and linked, but the backports themselves were rejected.

Nathan, do you remember what the reason was?

#11 Updated by Peter Lieven over 2 years ago

In the meantime, I have opened a backport PR for Nautilus:

https://github.com/ceph/ceph/pull/44043
