Bug #21612

open

mgr: thread create assert failure when racing with respawn

Added by Yuri Weinstein over 6 years ago. Updated over 4 years ago.

Status: New
Priority: Normal
Assignee: -
Category: ceph-mgr
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rados
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/teuthology-2017-09-26_02:30:02-rados-luminous-distro-basic-smithi/
Job: 1673575
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2017-09-26_02:30:02-rados-luminous-distro-basic-smithi/1673575/teuthology.log

2017-09-30T21:49:00.167 INFO:tasks.workunit.client.0.smithi066.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephtool/test.sh:744: test_mon_misc:  ceph mgr module disable foodne
2017-09-30T21:49:00.175 INFO:tasks.ceph.mgr.x.smithi066.stderr:2017-09-30 21:49:00.165413 7f7b51fd7700 -1 mgr load Module not found: 'foodne'
2017-09-30T21:49:00.183 INFO:tasks.ceph.mgr.x.smithi066.stderr:2017-09-30 21:49:00.165418 7f7b51fd7700 -1 mgr load ImportError: No module named foodne
2017-09-30T21:49:00.215 INFO:tasks.ceph.mgr.x.smithi066.stderr:
2017-09-30T21:49:00.235 INFO:tasks.ceph.mgr.x.smithi066.stderr:2017-09-30 21:49:00.165463 7f7b51fd7700 -1 mgr init Error loading module 'foodne': (2) No such file or directory
2017-09-30T21:49:00.292 INFO:tasks.ceph.mgr.x.smithi066.stderr:2017-09-30 21:49:00.287052 7f7b51fd7700 -1 log_channel(cluster) log [ERR] : Failed to load ceph-mgr modules: foodne
2017-09-30T21:49:01.298 INFO:tasks.ceph.mgr.x.smithi066.stderr:Thread::try_create(): pthread_create failed with error 11/build/ceph-12.2.0-296-g63ce514/src/common/Thread.cc: In function 'void Thread::create(const char*, size_t)' thread 7f7b4ffd3700 time 2017-09-30 21:49:01.293398
2017-09-30T21:49:01.308 INFO:tasks.ceph.mgr.x.smithi066.stderr:/build/ceph-12.2.0-296-g63ce514/src/common/Thread.cc: 152: FAILED assert(ret == 0)
2017-09-30T21:49:01.315 INFO:tasks.ceph.mgr.x.smithi066.stderr:2017-09-30 21:49:01.293198 7f7b55fdf700 -1 mgr got_mgr_map mgrmap module list changed to (restful,status), respawn2017-09-30 21:49:01.310774 7f34abcc7440 -1 WARNING: all dangerous and experimental features are enabled.

#1

Updated by Sage Weil over 6 years ago

This looks like a race between thread creation and respawn():

2017-09-30 21:49:01.293182 7f7b55fdf700  4 mgr handle_mgr_map received map epoch 8
2017-09-30 21:49:01.293185 7f7b55fdf700  4 mgr handle_mgr_map active in map: 1 active is 4764
2017-09-30 21:49:01.293187 7f7b55fdf700 10 mgr handle_mgr_map I was already active
2017-09-30 21:49:01.293190 7f7b55fdf700 10 mgr got_mgr_map x(active)
2017-09-30 21:49:01.310973 7f34abcc7440  0 ceph version 12.2.0-296-g63ce514 (63ce514631ab0106ebd242f0ece43409dc83f479) luminous (stable), process (unknown), pid 17875
2017-09-30 21:49:01.312338 7f34abcc7440 -1 WARNING: all dangerous and experimental features are enabled.
2017-09-30 21:49:01.312677 7f34abcc7440  1 -- - messenger.start
2017-09-30 21:49:01.312739 7f34abcc7440  5 adding auth protocol: cephx
2017-09-30 21:49:01.312847 7f34abcc7440  2 auth: KeyRing::load: loaded key file /var/lib/ceph/mgr/ceph-x/keyring
2017-09-30 21:49:01.312971 7f34abcc7440  1 -- - --> 172.21.15.66:6789/0 -- auth(proto 0 26 bytes epoch 0) v1 -- ?+0 0x7f34b4ec4c80 con 0x7f34b4ef2800

seems.. harmless? :/
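
For context on the failure mode: the logs show `Thread::try_create()` reporting that `pthread_create` failed with error 11 (`EAGAIN` on Linux), after which `Thread::create()` hits `assert(ret == 0)`. The sketch below is not Ceph's code and is not a fix proposed in this ticket. It only illustrates, under that reading of the logs, how a caller could receive the error code and retry a bounded number of times instead of asserting. The names `try_create_thread` and `create_thread_with_retry`, the retry count, and the backoff values are all hypothetical.

```cpp
#include <pthread.h>

#include <cerrno>
#include <chrono>
#include <thread>

// Hypothetical helper (not Ceph's Thread class): start a thread and return
// 0 on success or the pthread_create error code, e.g. EAGAIN (11 on Linux).
int try_create_thread(pthread_t *tid, void *(*entry)(void *), void *arg) {
  return pthread_create(tid, nullptr, entry, arg);
}

// Hypothetical wrapper: retry only on EAGAIN, up to max_attempts tries,
// then hand the last error code back to the caller instead of asserting.
// Returns EAGAIN without trying if max_attempts <= 0.
int create_thread_with_retry(pthread_t *tid, void *(*entry)(void *),
                             void *arg, int max_attempts) {
  int ret = EAGAIN;
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    ret = try_create_thread(tid, entry, arg);
    if (ret != EAGAIN) {
      break;  // success, or an error that retrying will not change
    }
    if (attempt + 1 < max_attempts) {
      // Assumed backoff of 10 ms, 20 ms, 40 ms, ... between attempts.
      std::this_thread::sleep_for(std::chrono::milliseconds(10 << attempt));
    }
  }
  return ret;
}
```

A caller would check the return value and decide what to do with a non-zero result. Whether retrying, logging, or skipping the thread is appropriate while the process is about to respawn is not something this ticket settles.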

#2

Updated by Sage Weil over 6 years ago

  • Project changed from Ceph to mgr
  • Subject changed from "FAILED assert(ret == 0)" in rados run to mgr: thread create assert failure when racing with respawn
  • Status changed from New to 12

#3

Updated by John Spray over 6 years ago

  • Category set to ceph-mgr

#4

Updated by Neha Ojha almost 6 years ago

2018-05-29T20:37:12.702 INFO:tasks.ceph.mgr.x.smithi198.stderr:Thread::try_create(): pthread_create failed with error 11/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.0.0-1-g1706350/rpm/el7/BUILD/ceph-14.0.0-1-g1706350/src/common/Thread.cc: In function 'void Thread::create(const char*, size_t)' thread 7fe26d198700 time 2018-05-29 20:37:12.705385
2018-05-29T20:37:12.703 INFO:tasks.ceph.mgr.x.smithi198.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.0.0-1-g1706350/rpm/el7/BUILD/ceph-14.0.0-1-g1706350/src/common/Thread.cc: 153: FAILED assert(ret == 0)

http://pulpito.ceph.com/nojha-2018-05-29_18:24:22-rados-wip-async-up-distro-basic-smithi/2605132/

#5

Updated by Neha Ojha almost 6 years ago

/a/yuriw-2018-07-10_19:30:47-rados-wip-yuri2-testing-2018-07-10-1606-mimic-distro-basic-smithi/2763330/

#6

Updated by Neha Ojha almost 5 years ago

/a/yuriw-2019-04-30_20:30:11-rados-wip-yuri4-testing-2019-04-30-1634-mimic-distro-basic-smithi/3912676/

#7

Updated by Patrick Donnelly over 4 years ago

  • Status changed from 12 to New