Bug #12088
Status: Closed
cephfs client crash after enabling the readahead mechanism via the conf option 'client_readahead_max_bytes'
% Done:
0%
Source:
other
Backport:
hammer
Regression:
No
Severity:
3 - minor
Description
I ran the fio tool to test the randread performance of cephfs. The ceph-fuse client crashes when readahead is enabled. The bug occurred when running randread tests with 4 threads.
Our ceph.conf:
client_readahead_max_bytes = 1048576
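For completeness, the option belongs in the client section of ceph.conf (the section placement is an assumption from standard ceph.conf layout; the report only gives the bare key=value line):

```ini
[client]
    # cap client-side readahead at 1 MiB (value from the report)
    client_readahead_max_bytes = 1048576
```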
The core dump:

(gdb) bt
#0  0x00007fc65540236b in raise () from /lib64/libpthread.so.0
#1  0x0000000000632606 in reraise_fatal (signum=6) at global/signal_handler.cc:59
#2  handle_fatal_signal (signum=6) at global/signal_handler.cc:109
#3  <signal handler called>
#4  0x00007fc6544e9885 in raise () from /lib64/libc.so.6
#5  0x00007fc6544eb065 in abort () from /lib64/libc.so.6
#6  0x00007fc654d9fa7d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6
#7  0x00007fc654d9dc06 in ?? () from /usr/lib64/libstdc++.so.6
#8  0x00007fc654d9dc33 in std::terminate() () from /usr/lib64/libstdc++.so.6
#9  0x00007fc654d9dd2e in __cxa_throw () from /usr/lib64/libstdc++.so.6
#10 0x000000000064bcca in ceph::__ceph_assert_fail (assertion=0x7fc5d40a30b0 "&\254\203U>\314\211#", file=<value optimized out>, line=68661056, func=0x89a5d0 "void Mutex::Lock(bool)") at common/assert.cc:77
#11 0x0000000000635d33 in Mutex::Lock (this=0x7fc57098bab0, no_lockdep=<value optimized out>) at common/Mutex.cc:95
#12 0x0000000000687f2e in Readahead::dec_pending (this=0x7fc57098b9f0, count=1) at common/Readahead.cc:122
#13 0x00000000005723c8 in Client::C_Readahead::finish (this=0x7fc570977d40, r=<value optimized out>) at client/Client.cc:7044
#14 0x00000000005afc89 in Context::complete (this=0x7fc570977d40, r=<value optimized out>) at include/Context.h:65
#15 0x00000000005afc89 in Context::complete (this=0x7fc570008e60, r=<value optimized out>) at include/Context.h:65
#16 0x00000000006291bd in finish_contexts<Context> (cct=0x417af40, finished=<value optimized out>, result=0) at include/Context.h:121
#17 0x00000000006187df in ObjectCacher::bh_read_finish (this=0x419eb40, poolid=<value optimized out>, oid=<value optimized out>, tid=9290, start=0, length=131072, bl=..., r=0, trust_enoent=true) at osdc/ObjectCacher.cc:824
#18 0x0000000000629427 in ObjectCacher::C_ReadFinish::finish (this=0x7fc570173370, r=0) at osdc/ObjectCacher.h:512
#19 0x00000000005afc89 in Context::complete (this=0x7fc570173370, r=<value optimized out>) at include/Context.h:65
#20 0x00000000005b12f1 in C_Lock::finish (this=0x7fc570014620, r=0) at include/Context.h:162
#21 0x00000000005afc89 in Context::complete (this=0x7fc570014620, r=<value optimized out>) at include/Context.h:65
#22 0x000000000064a648 in Finisher::finisher_thread_entry (this=0x419be40) at common/Finisher.cc:59
#23 0x00007fc6553fa7f1 in start_thread () from /lib64/libpthread.so.0
#24 0x00007fc65459cccd in clone () from /lib64/libc.so.6

(gdb) frame 11
#11 0x0000000000635d33 in Mutex::Lock (this=0x7fc57098bab0, no_lockdep=<value optimized out>) at common/Mutex.cc:95
95        assert(r == 0);

(gdb) l
90        start = ceph_clock_now(cct);
91        r = pthread_mutex_lock(&_m);
92        if (logger && cct && cct->_conf->mutex_perf_counter)
93          logger->tinc(l_mutex_wait,
94                       ceph_clock_now(cct) - start);
95        assert(r == 0);
96        if (lockdep && g_lockdep) _locked();
97        _post_lock();
98
99      out:

(gdb) print _m
$1 = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = -1, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>"\377, \377\377\377", '\000' <repeats 19 times>, __align = 0}
The coredump shows that the mutex is corrupted: __kind is -1 and __size contains '\377' bytes, i.e. the underlying pthread mutex has been destroyed or its memory freed.
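This corruption pattern is consistent with locking a destroyed glibc mutex: pthread_mutex_destroy poisons __kind, and a later pthread_mutex_lock then fails with an error, which is exactly what trips assert(r == 0) in Mutex::Lock. A minimal sketch (strictly speaking this is undefined behavior; it relies on glibc's consistency checks, as observed in the coredump):

```cpp
#include <pthread.h>

// Destroy a mutex and then try to lock it, mimicking the state seen in
// the coredump (__kind == -1). On glibc the lock attempt returns a
// nonzero error code instead of succeeding.
int lock_destroyed_mutex() {
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_destroy(&m);     // glibc marks the mutex invalid (__kind = -1)
    return pthread_mutex_lock(&m); // nonzero on glibc; r != 0 fires the assert
}
```

Under a memory-reuse race (as in this bug) the bytes can additionally be overwritten by the allocator, hence the '\377' fill seen in __size.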
The Ceph log
2015-06-19 12:36:10.360312 7fca117fa700 20 client.97564 readahead 19922944~131072 (caller wants 19906560~16384)
2015-06-19 12:36:10.360318 7fca117fa700 10 striper file_to_extents 19922944~131072 format 30000009691.%08llx
2015-06-19 12:36:10.360320 7fca117fa700 20 striper su 4194304 sc 1 os 4194304 stripes_per_object 1
2015-06-19 12:36:10.360324 7fca117fa700 20 striper off 19922944 blockno 4 stripeno 4 stripepos 0 objectsetno 4 objectno 4 block_start 0 block_off 3145728 3145728~131072
2015-06-19 12:36:10.360330 7fca117fa700 20 striper object_truncate_size 4 0->0
2015-06-19 12:36:10.360332 7fca117fa700 20 striper added new extent(30000009691.00000004 (4) in @3 3145728~131072 -> [])
2015-06-19 12:36:10.360335 7fca117fa700 15 striper file_to_extents extent(30000009691.00000004 (4) in @3 3145728~131072 -> [0,131072]) in @3
2015-06-19 12:36:10.360340 7fca117fa700 10 objectcacher readx extent(30000009691.00000004 (4) in @3 3145728~131072 -> [0,131072])
2015-06-19 12:36:10.360345 7fca117fa700 10 objectcacher.object(30000009691.00000004/head) map_read 30000009691.00000004 3145728~131072
2015-06-19 12:36:10.360349 7fca117fa700 20 objectcacher.object(30000009691.00000004/head) map_read gap bh[ 0x7fc9d8002900 3145728~131072 0x7fc9d966bf10 (0) v 0 missing] waiters = {}
2015-06-19 12:36:10.360407 7fca117fa700 1 -- 10.****:0/65371 --> 10.****.95:6800/31915 -- osd_op(client.97564.0:79410 30000009691.00000004 [read 3145728~131072] 3.c9946208 ack+read+known_if_redirected e12226) v5 -- ?+0 0x7fc9d810d790 con 0x7fccc008a280
2015-06-19 12:36:10.360421 7fca117fa700 20 -- 10.****:0/65371 submit_message osd_op(client.97564.0:79410 30000009691.00000004 [read 3145728~131072] 3.c9946208 ack+read+known_if_redirected e12226) v5 remote, 10.****.95:6800/31915, have pipe.
2015-06-19 12:36:10.361266 7fc9cdfeb700 3 client.97564 ll_release (fh)0x7fc9d9655660 30000009691
2015-06-19 12:36:10.361270 7fc9cdfeb700 5 client.97564 *_release_fh* 0x7fc9d9655660 mode 1 on 30000009691.head(ref=4 ll_ref=7 cap_refs={1024=1,2048=1,4096=0,8192=0} open={1=1,2=0,3=0} mode=100644 size=20971520/0 mtime=2015-06-19 12:35:06.668295 caps=pAsLsXsFscr(2=pAsLsXsFscr) objectset[30000009691 ts 0/0 objects 5 dirty_or_tx 0] parents=0x7fca0cc56b80 0x7fccd002c6b0)
2015-06-19 12:36:10.369798 7fcabb8f8700 1 -- 10.****:0/65371 <== osd.44 10.****.95:6800/31915 1941 ==== osd_op_reply(79410 30000009691.00000004 [read 3145728~131072] v0'0 uv18386 ondisk = 0) v6 ==== 187+0+131072 (2385944524 0 3796294508) 0x7fca90030a70 con 0x7fccc008a280
2015-06-19 12:36:10.398019 7fccf0ce3700 7 objectcacher bh_read_finish 30000009691.00000004/head tid 38326 3145728~131072 (bl is 131072) returned 0 *outstanding reads 31*
2015-06-19 12:36:10.398058 7fccf0ce3700 10 objectcacher readx extent(30000009691.00000004 (4) in @3 3145728~131072 -> [0,131072])
2015-06-19 12:36:10.398063 7fccf0ce3700 10 objectcacher.object(30000009691.00000004/head) map_read 30000009691.00000004 3145728~131072
2015-06-19 12:36:10.398066 7fccf0ce3700 20 objectcacher.object(30000009691.00000004/head) map_read hit bh[ 0x7fc9d957ac90 3112960~163840 0x7fc9d966bf10 (163840) v 0 clean firstbyte=0] waiters = {}
2015-06-19 12:36:10.398206 7fccf0ce3700 20 client.97564 C_Readahead on 0x159ffa4d55839c3a
As shown in the Ceph log, I think the cause is that the file handle is released (ll_release / _release_fh) while readahead reads are still outstanding. When an outstanding read later finishes, its C_Readahead callback accesses the already-freed Fh structure, i.e. a destroyed Mutex.
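The race described above is a classic callback-outlives-owner bug. A minimal single-threaded sketch (names hypothetical, not Ceph's actual code or fix) showing how having the completion hold its own reference to the readahead state keeps it alive past the file-handle release:

```cpp
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for the per-Fh readahead state touched by
// Client::C_Readahead::finish (which calls Readahead::dec_pending).
struct ReadaheadState {
    int pending = 0;
    void dec_pending() { --pending; }
};

// Queue a completion that decrements pending, then drop the owner's
// reference (the "ll_release"), then run the completion. Because the
// lambda captured a shared_ptr, the state is still alive when the
// callback fires; with a raw pointer this would be a use-after-free.
int demo() {
    std::vector<std::function<void()>> finisher;  // queued completions
    auto state = std::make_shared<ReadaheadState>();
    state->pending = 1;
    finisher.push_back([ref = state] { ref->dec_pending(); });
    state.reset();        // owner releases its handle first...
    finisher.front()();   // ...completion fires later, state still valid
    return 0;
}
```

The same effect can be achieved by making release wait for pending readahead completions before freeing the Fh; either way the invariant is that the Readahead (and its mutex) must outlive every in-flight C_Readahead.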
We used ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff) in our I/O tests.