Bug #24047

closed

MDCache.cc: 5317: FAILED assert(mds->is_rejoin())

Added by Patrick Donnelly almost 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: luminous
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): crash, multimds
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Assertion: /build/ceph-13.0.2-2151-g180ce3f/src/mds/MDCache.cc: 5317: FAILED assert(mds->is_rejoin())
ceph version 13.0.2-2151-g180ce3f (180ce3fb9ca95b195a595828062c76237435e6de) mimic (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7f7d04b34172]
 2: (()+0x2e4337) [0x7f7d04b34337]
 3: (MDCache::rejoin_gather_finish()+0x197) [0x56429a9b9307]
 4: (MDSIOContextBase::complete(int)+0x119) [0x56429ab2bd89]
 5: (MDSLogContextBase::complete(int)+0x40) [0x56429ab2bf10]
 6: (Finisher::finisher_thread_entry()+0x12e) [0x7f7d04b3270e]
 7: (()+0x76ba) [0x7f7d043c96ba]
 8: (clone()+0x6d) [0x7f7d03bf241d]
1 jobs: ['2490999']
suites: ['conf.yaml', 'frag_enable.yaml', 'kcephfs/thrash/{clusters/1-mds-1-client.yaml', 'log-config.yaml', 'objectstore-ec/filestore-xfs.yaml', 'osd-asserts.yaml', 'overrides/{debug.yaml', 'thrashers/mds.yaml', 'thrashosds-health.yaml', 'whitelist_health.yaml', 'whitelist_wrongly_marked_down.yaml}', 'workloads/kclient_workunit_suites_iozone.yaml}']

and

2018-05-07T19:21:42.890 INFO:tasks.ceph.mds.a-s.smithi080.stderr:/build/ceph-13.0.2-2151-g180ce3f/src/mds/MDCache.cc: 5331: FAILED assert(rejoin_ack_gather.count(mds->get_nodeid()))
2018-05-07T19:21:42.890 INFO:tasks.ceph.mds.a-s.smithi080.stderr:
2018-05-07T19:21:42.890 INFO:tasks.ceph.mds.a-s.smithi080.stderr: ceph version 13.0.2-2151-g180ce3f (180ce3fb9ca95b195a595828062c76237435e6de) mimic (dev)
2018-05-07T19:21:42.890 INFO:tasks.ceph.mds.a-s.smithi080.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7fc4e98f3172]
2018-05-07T19:21:42.890 INFO:tasks.ceph.mds.a-s.smithi080.stderr: 2: (()+0x2e4337) [0x7fc4e98f3337]
2018-05-07T19:21:42.890 INFO:tasks.ceph.mds.a-s.smithi080.stderr: 3: (MDCache::rejoin_gather_finish()+0xf0) [0x556bde757260]
2018-05-07T19:21:42.890 INFO:tasks.ceph.mds.a-s.smithi080.stderr: 4: (MDSIOContextBase::complete(int)+0x119) [0x556bde8c9d89]
2018-05-07T19:21:42.890 INFO:tasks.ceph.mds.a-s.smithi080.stderr: 5: (MDSLogContextBase::complete(int)+0x40) [0x556bde8c9f10]
2018-05-07T19:21:42.891 INFO:tasks.ceph.mds.a-s.smithi080.stderr: 6: (Finisher::finisher_thread_entry()+0x12e) [0x7fc4e98f170e]
2018-05-07T19:21:42.891 INFO:tasks.ceph.mds.a-s.smithi080.stderr: 7: (()+0x76ba) [0x7fc4e91886ba]
2018-05-07T19:21:42.891 INFO:tasks.ceph.mds.a-s.smithi080.stderr: 8: (clone()+0x6d) [0x7fc4e89b141d]
2018-05-07T19:21:42.891 INFO:tasks.ceph.mds.a-s.smithi080.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

From: /ceph/teuthology-archive/pdonnell-2018-05-07_17:33:46-kcephfs-wip-pdonnell-testing-20180504.211521-testing-basic-smithi/2490999/teuthology.log

Also these suites: 12 jobs: ['2490972', '2491004', '2490984', '2490981', '2491012', '2491007', '2490991', '2490974', '2490988', '2491006', '2490997', '2490986']


Related issues 2 (0 open, 2 closed)

Has duplicate CephFS - Bug #23826: mds: assert after daemon restart (Duplicate, Patrick Donnelly, 04/23/2018)

Copied to CephFS - Backport #24108: luminous: MDCache.cc: 5317: FAILED assert(mds->is_rejoin()) (Resolved, Zheng Yan)
#1

Updated by Patrick Donnelly almost 6 years ago

  • Status changed from New to Fix Under Review
#2

Updated by Patrick Donnelly almost 6 years ago

  • Status changed from Fix Under Review to Pending Backport
#3

Updated by Nathan Cutler almost 6 years ago

  • Copied to Backport #24108: luminous: MDCache.cc: 5317: FAILED assert(mds->is_rejoin()) added
#4

Updated by Patrick Donnelly almost 6 years ago

  • Has duplicate Bug #23826: mds: assert after daemon restart added
#5

Updated by Nathan Cutler almost 6 years ago

  • Status changed from Pending Backport to Resolved