Project

General

Profile

Actions

Bug #18119

closed

mds: check and get latest current logsegment to avoid trimming logsegment crashing

Added by Zhi Zhang over 7 years ago. Updated over 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (dev)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2016-10-25 15:30:28.091007 7f5dea451700  0 log_channel(cluster) log [WRN] : 2 slow requests, 2 included below; oldest blocked for > 33.587701 secs
2016-10-25 15:30:28.091014 7f5dea451700  0 log_channel(cluster) log [WRN] : slow request 33.587701 seconds old, received at 2016-10-25 15:29:54.503267: client_request(client.814183:8220804 getattr Fs #1000344f17c) currently failed to rdlock, waiting
2016-10-25 15:30:28.091019 7f5dea451700  0 log_channel(cluster) log [WRN] : slow request 32.195154 seconds old, received at 2016-10-25 15:29:55.895815: client_request(client.814183:8220828 create #1000344edd9/global.tmp) currently submit entry: journal_and_reply
2016-10-25 15:30:33.367536 7f5dea451700  0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 31.659875 secs
2016-10-25 15:30:33.367542 7f5dea451700  0 log_channel(cluster) log [WRN] : slow request 31.659875 seconds old, received at 2016-10-25 15:30:01.707626: client_request(client.217268:21826542 rename #1000248527a/tmp.127.0.0.1 #1000000005f/tmp.127.0.0.1 2016-10-25 15:30:01.639594) currently submit entry: journal_and_reply
2016-10-25 15:30:38.836972 7f5dea451700 -1 ./include/elist.h: In function 'elist<T>::~elist() [with T = CDentry*]' thread 7f5dea451700 time 2016-10-25 15:30:38.834810
./include/elist.h: 92: FAILED assert(_head.empty())

 ceph version 0.94.9-166-gbe32c81 (be32c819f9fa44b1abe168595a7b57b5833fe6d3)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x970ac5]
 2: (LogSegment::~LogSegment()+0x29a) [0x7f0a1a]
 3: (MDLog::_trim_expired_segments()+0x158) [0x7e8558]
 4: (MDLog::trim(int)+0x1e8) [0x7e8b18]
 5: (MDS::tick()+0x63c) [0x5a434c]
 6: (MDSInternalContextBase::complete(int)+0x16b) [0x7dd30b]
 7: (SafeTimer::timer_thread()+0x104) [0x961584]
 8: (SafeTimerThread::entry()+0xd) [0x96253d]
 9: (()+0x7df3) [0x7f5df122ddf3]
 10: (clone()+0x6d) [0x7f5deff141bd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Although this issue happened very rare, we met this few times when MDS/OSD had very heavy load and MDS reported "Behind on trimming" warning and slow requests, like "journal_and_reply".

After deep-diving, the root cause should be like this:

When MDS was processing a request, mdr->ls would be set to current log segment (let's say, ls_1) before creating a callback and doing journal_and_reply. But when the load was very heavy on OSD side, journal request to OSD would become very slow.

At this time, MDS would start to trim mdlog in tick and this log segment ls_1 could become an expiring one and do try_expire. Since try_expire would asynchronously commit dirty stuff in this log segment, this log segment ls_1 would become an expired one in C_MaybeExpiredSegment callback.

If the journal request came back right after ls_1 became expired but before next tick started, the request's callback would still use this log segment ls_1, so ls_1 would be inserted some dirty stuff.

Then next tick started and tried to do trim_expired_segments on ls_1. So crash happened.

Actions #1

Updated by Zhi Zhang over 7 years ago

https://github.com/ceph/ceph/pull/12277

The fix here is to check whether mdr->ls/mut->ls is expired or not in callback. If expired, obtain the current one. We have tested this change for about 3 weeks and not met this issue again.

Actions #2

Updated by Greg Farnum over 7 years ago

  • Status changed from New to Closed

This does not seem to be an issue in either master nor Jewel.

Actions

Also available in: Atom PDF