Bug #41147

mds: crash loop - Server.cc 6835: FAILED ceph_assert(in->first <= straydn->first)

Added by super xor 3 months ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assignee: -
Category: Correctness/Safety
Target version:
Start date:
Due date:
% Done: 0%
Source: Community (user)
Tags:
Backport: nautilus,mimic
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): crash
Pull request ID:
Crash signature:

Description

After creating a new FS and running it for 2 days, my MDS is in a crash loop. I haven't tried anything yet, such as purging the session table or something similar.
Long log attached.

2019-08-07 06:27:26.258 7fc235bfb700  1 mds.0.81085 cluster recovered.
2019-08-07 06:27:42.870 7fc2313f2700 -1 /build/ceph-14.2.2/src/mds/Server.cc: In function 'void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*)' thread 7fc2313f2700 time 2019-08-07 06:27:42.871973
/build/ceph-14.2.2/src/mds/Server.cc: 6835: FAILED ceph_assert(in->first <= straydn->first)

 ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7fc23e465bb2]
 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7fc23e465d8d]
 3: (Server::_unlink_local(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*)+0x17ab) [0x55ee0b]
 4: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0xb66) [0x564076]
 5: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xc55) [0x579c95]
 6: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x50c) [0x57a3bc]
 7: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0xf2) [0x587822]
 8: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message const> const&)+0x73c) [0x4f123c]
 9: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x6fb) [0x4f3b6b]
 10: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x12) [0x4f4452]
 11: (MDSContext::complete(int)+0x73) [0x7955c3]
 12: (MDSRank::_advance_queues()+0xb7) [0x4f2be7]
 13: (MDSRank::ProgressThread::entry()+0x43) [0x4f3333]
 14: (()+0x76ba) [0x7fc23dcfd6ba]
 15: (clone()+0x6d) [0x7fc23d52641d]
 ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)
 1: (()+0x11390) [0x7fc23dd07390]
 2: (gsignal()+0x38) [0x7fc23d454428]
 3: (abort()+0x16a) [0x7fc23d45602a]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a3) [0x7fc23e465c03]
 5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7fc23e465d8d]
 6: (Server::_unlink_local(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*)+0x17ab) [0x55ee0b]
 7: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0xb66) [0x564076]
 8: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xc55) [0x579c95]
 9: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x50c) [0x57a3bc]
 10: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0xf2) [0x587822]
 11: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message const> const&)+0x73c) [0x4f123c]
 12: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x6fb) [0x4f3b6b]
 13: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x12) [0x4f4452]
 14: (MDSContext::complete(int)+0x73) [0x7955c3]
 15: (MDSRank::_advance_queues()+0xb7) [0x4f2be7]
 16: (MDSRank::ProgressThread::entry()+0x43) [0x4f3333]
 17: (()+0x76ba) [0x7fc23dcfd6ba]
 18: (clone()+0x6d) [0x7fc23d52641d]

mdslog (58.3 KB) super xor, 08/07/2019 04:33 AM

History

#1 Updated by super xor 3 months ago

This is temporarily fixed by wiping the session table.

#2 Updated by Patrick Donnelly 3 months ago

  • Subject changed from MDS crash loop - Server.cc 6835: FAILED ceph_assert(in->first <= straydn->first) to mds: crash loop - Server.cc 6835: FAILED ceph_assert(in->first <= straydn->first)
  • Category set to Correctness/Safety
  • Start date deleted (08/07/2019)
  • Source set to Community (user)
  • Backport set to nautilus,mimic

#3 Updated by Patrick Donnelly 3 months ago

  • Target version changed from v14.2.2 to v15.0.0
  • Affected Versions v14.2.2 added

#4 Updated by super xor about 2 months ago

Happened again.
