Actions
Bug #19033
closedcephfs: mds is crushed, after I set about 400 64KB xattr kv pairs to a file
Status:
Resolved
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Development
Tags:
Backport:
jewel, kraken
Regression:
No
Severity:
1 - critical
Reviewed:
ceph-qa-suite:
Component(FS):
Client, MDS, osdc
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
1. Problem¶
After I have set about 400 64KB xattr kv pair to a file,
mds is crashed. Every time I try to start mds, it will crash again.
bt:
Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7f9850be8700 (LWP 24280)]
0x00007f98591155d7 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007f98591155d7 in raise () from /lib64/libc.so.6
#1 0x00007f9859116cc8 in abort () from /lib64/libc.so.6
#2 0x00007f985c926094 in ceph::__ceph_assert_fail (assertion=0x7f985cbca728 "write_buf.length() == write_pos - flush_pos", file=0x7f985cbca0b3 "osdc/Journaler.cc", line=674,
func=0x7f985cbcc1c0 <Journaler::_do_flush(unsigned int)::__PRETTY_FUNCTION__> "void Journaler::_do_flush(unsigned int)") at common/assert.cc:78
#3 0x00007f985c6c012d in Journaler::_do_flush (this=0x7f986801cd00, amount=0) at osdc/Journaler.cc:674
#4 0x00007f985c6c1f17 in Journaler::_finish_prezero (this=0x7f986801cd00, r=-2, start=54525952, len=4194304) at osdc/Journaler.cc:846
#5 0x00007f985c6cf148 in C_Journaler_Prezero::finish (this=0x7f98684a8040, r=-2) at osdc/Journaler.cc:771
#6 0x00007f985c2a44c7 in Context::complete (this=0x7f98684a8040, r=-2) at include/Context.h:64
#7 0x00007f985c7bcf74 in Finisher::finisher_thread_entry (this=0x7f9867fdf760) at common/Finisher.cc:76
#8 0x00007f985c2ec120 in Finisher::FinisherThread::entry (this=0x7f9867fdf880) at common/Finisher.h:66
#9 0x00007f985c90ec1d in Thread::entry_wrapper (this=0x7f9867fdf880) at common/Thread.cc:87
#10 0x00007f985c90eb52 in Thread::_entry_func (arg=0x7f9867fdf880) at common/Thread.cc:67
#11 0x00007f985a708df5 in start_thread () from /lib64/libpthread.so.0
#12 0x00007f98591d61ad in clone () from /lib64/libc.so.6
(gdb) l osdc/Journaler.cc:674
669 flush_pos, len, write_bl, ceph::real_clock::now(cct),
670 0,
671 NULL, wrap_finisher(onsafe), write_iohint);
672
673 flush_pos += len;
674 assert(write_buf.length() == write_pos - flush_pos);
675
676 ldout(cct, 10)
677 << "_do_flush (prezeroing/prezero)/write/flush/safe pointers now at "
678 << "(" << prezeroing_pos << "/" << prezero_pos << ")/" << write_pos
(gdb) up 3
#3 0x00007f985c6c012d in Journaler::_do_flush (this=0x7f986801cd00, amount=0) at osdc/Journaler.cc:674
674 assert(write_buf.length() == write_pos - flush_pos);
(gdb) p write_pos
$57 = 5989821898
(gdb) p flush_pos
$58 = 83886080
(gdb) p write_buf.length()
$59 = 1610968522 // 0x 60056dca
(gdb) p (write_pos - flush_pos) |||||||||
$60 = 5905935818 // 0x160056dca
2. Reason¶
The root reason is write_buf._len overflowed in the following call path.
MDLog::_submit_thread() -> Journaler::append_entry() -> JournalStream::write() -> buffer::list::claim_append()
Because there is no overflow checing in buffer::list::claim_append, this overflow is hidden.
Until Journaler::_do_flush() is called, this error is captured.
- first overflow checking
::
// flush
unsigned len = write_pos - flush_pos; <---------------<<<< overflow can not be detected because len is 32bit
assert(len == write_buf.length());
- len is asigned to newlen
::
if (newlen < len) {
ldout(cct, 10) << "_do_flush wanted to do " << flush_pos << "~" << len
<< " but hit prezero_pos " << prezero_pos
<< ", will do " << flush_pos << "~" << newlen << dendl;
len = newlen;
- write_buf.splice is called
::
// adjust pointers
if (len == write_buf.length()) {
write_bl.swap(write_buf);
} else {
write_buf.splice(0, len, &write_bl);
}
- overflow is detected
::
filer.write(ino, &layout, snapc,
flush_pos, len, write_bl, ceph::real_clock::now(),
0,
wrap_finisher(onsafe), write_iohint);
flush_pos += len;
assert(write_buf.length() == write_pos - flush_pos); <-------------<<<< assert failed
3. Fix proposal¶
- add bufferlist's _len overflow checking
- stop appending entry to a nearly overflow write_buf
4. Bug reproduce¶
#include <sys/types.h>
#include <sys/xattr.h>
#include <stdio.h>
#define LEN (64*1024)
int main(int argc, const char* argv)
{
char val[LEN];
int i = 0;
for (; i < LEN; i++) {
val[i] = 'X';
}
int j = 0;
for (; j < 1000; j ++) {
char tmp[50];
snprintf(tmp, 50, "%s%d", "trusted.verybigkey", j);
printf("set xattr: %s\n", tmp);
int ret = setxattr("/mnt/mycephfs/test6/mmm", tmp, val, sizeof(val), 0);
if (ret < 0) {
perror("setxattr failed\n");
} else {
printf("setxattr success\n");
}
}
return 0;
}
Updated by Honggang Yang about 7 years ago
Updated by John Spray about 7 years ago
- Status changed from New to Fix Under Review
Updated by John Spray about 7 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport set to jewel, kraken
Updated by Nathan Cutler about 7 years ago
- Copied to Backport #19673: jewel: cephfs: mds is crushed, after I set about 400 64KB xattr kv pairs to a file added
Updated by Nathan Cutler about 7 years ago
- Copied to Backport #19674: kraken: cephfs: mds is crushed, after I set about 400 64KB xattr kv pairs to a file added
Updated by Nathan Cutler almost 7 years ago
- Status changed from Pending Backport to Resolved
Actions