Bug #12088
Status: Closed
cephfs client crash after enabling the readahead mechanism via the conf option 'client_readahead_max_bytes'
% Done:
0%
Source:
other
Backport:
hammer
Regression:
No
Severity:
3 - minor
Description
I ran the fio tool to test the randread performance of cephfs. The ceph-fuse client crashes when readahead is enabled. The bug occurred when running randread tests with 4 threads.
Our ceph.conf:
client_readahead_max_bytes = 1048576
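For completeness, the option belongs in the client section of ceph.conf (the section placement is an assumption from standard ceph.conf layout; the report only gives the bare key=value line):

```ini
[client]
    # cap client-side readahead at 1 MiB (value from the report)
    client_readahead_max_bytes = 1048576
```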
The core dump:

(gdb) bt
#0  0x00007fc65540236b in raise () from /lib64/libpthread.so.0
#1  0x0000000000632606 in reraise_fatal (signum=6) at global/signal_handler.cc:59
#2  handle_fatal_signal (signum=6) at global/signal_handler.cc:109
#3  <signal handler called>
#4  0x00007fc6544e9885 in raise () from /lib64/libc.so.6
#5  0x00007fc6544eb065 in abort () from /lib64/libc.so.6
#6  0x00007fc654d9fa7d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib64/libstdc++.so.6
#7  0x00007fc654d9dc06 in ?? () from /usr/lib64/libstdc++.so.6
#8  0x00007fc654d9dc33 in std::terminate() () from /usr/lib64/libstdc++.so.6
#9  0x00007fc654d9dd2e in __cxa_throw () from /usr/lib64/libstdc++.so.6
#10 0x000000000064bcca in ceph::__ceph_assert_fail (assertion=0x7fc5d40a30b0 "&\254\203U>\314\211#", file=<value optimized out>, line=68661056, func=0x89a5d0 "void Mutex::Lock(bool)") at common/assert.cc:77
#11 0x0000000000635d33 in Mutex::Lock (this=0x7fc57098bab0, no_lockdep=<value optimized out>) at common/Mutex.cc:95
#12 0x0000000000687f2e in Readahead::dec_pending (this=0x7fc57098b9f0, count=1) at common/Readahead.cc:122
#13 0x00000000005723c8 in Client::C_Readahead::finish (this=0x7fc570977d40, r=<value optimized out>) at client/Client.cc:7044
#14 0x00000000005afc89 in Context::complete (this=0x7fc570977d40, r=<value optimized out>) at include/Context.h:65
#15 0x00000000005afc89 in Context::complete (this=0x7fc570008e60, r=<value optimized out>) at include/Context.h:65
#16 0x00000000006291bd in finish_contexts<Context> (cct=0x417af40, finished=<value optimized out>, result=0) at include/Context.h:121
#17 0x00000000006187df in ObjectCacher::bh_read_finish (this=0x419eb40, poolid=<value optimized out>, oid=<value optimized out>, tid=9290, start=0, length=131072, bl=..., r=0, trust_enoent=true) at osdc/ObjectCacher.cc:824
#18 0x0000000000629427 in ObjectCacher::C_ReadFinish::finish (this=0x7fc570173370, r=0) at osdc/ObjectCacher.h:512
#19 0x00000000005afc89 in Context::complete (this=0x7fc570173370, r=<value optimized out>) at include/Context.h:65
#20 0x00000000005b12f1 in C_Lock::finish (this=0x7fc570014620, r=0) at include/Context.h:162
#21 0x00000000005afc89 in Context::complete (this=0x7fc570014620, r=<value optimized out>) at include/Context.h:65
#22 0x000000000064a648 in Finisher::finisher_thread_entry (this=0x419be40) at common/Finisher.cc:59
#23 0x00007fc6553fa7f1 in start_thread () from /lib64/libpthread.so.0
#24 0x00007fc65459cccd in clone () from /lib64/libc.so.6

(gdb) frame 11
#11 0x0000000000635d33 in Mutex::Lock (this=0x7fc57098bab0, no_lockdep=<value optimized out>) at common/Mutex.cc:95
95        assert(r == 0);

(gdb) l
90        start = ceph_clock_now(cct);
91        r = pthread_mutex_lock(&_m);
92        if (logger && cct && cct->_conf->mutex_perf_counter)
93          logger->tinc(l_mutex_wait,
94                       ceph_clock_now(cct) - start);
95        assert(r == 0);
96        if (lockdep && g_lockdep) _locked();
97        _post_lock();
98
99      out:

(gdb) print _m
$1 = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = -1, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>"\377, \377\377\377", '\000' <repeats 19 times>, __align = 0}
The coredump shows that the mutex is corrupted: __kind is -1 and __size contains '\377' bytes, i.e. the underlying pthread mutex has been destroyed or its memory freed.
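This corruption pattern is consistent with locking a destroyed glibc mutex: pthread_mutex_destroy poisons __kind, and a later pthread_mutex_lock then fails with an error, which is exactly what trips assert(r == 0) in Mutex::Lock. A minimal sketch (strictly speaking this is undefined behavior; it relies on glibc's consistency checks, as observed in the coredump):

```cpp
#include <pthread.h>

// Destroy a mutex and then try to lock it, mimicking the state seen in
// the coredump (__kind == -1). On glibc the lock attempt returns a
// nonzero error code instead of succeeding.
int lock_destroyed_mutex() {
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_destroy(&m);     // glibc marks the mutex invalid (__kind = -1)
    return pthread_mutex_lock(&m); // nonzero on glibc; r != 0 fires the assert
}
```

Under a memory-reuse race (as in this bug) the bytes can additionally be overwritten by the allocator, hence the '\377' fill seen in __size.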
The Ceph log
2015-06-19 12:36:10.360312 7fca117fa700 20 client.97564 readahead 19922944~131072 (caller wants 19906560~16384)
2015-06-19 12:36:10.360318 7fca117fa700 10 striper file_to_extents 19922944~131072 format 30000009691.%08llx
2015-06-19 12:36:10.360320 7fca117fa700 20 striper su 4194304 sc 1 os 4194304 stripes_per_object 1
2015-06-19 12:36:10.360324 7fca117fa700 20 striper off 19922944 blockno 4 stripeno 4 stripepos 0 objectsetno 4 objectno 4 block_start 0 block_off 3145728 3145728~131072
2015-06-19 12:36:10.360330 7fca117fa700 20 striper object_truncate_size 4 0->0
2015-06-19 12:36:10.360332 7fca117fa700 20 striper added new extent(30000009691.00000004 (4) in @3 3145728~131072 -> [])
2015-06-19 12:36:10.360335 7fca117fa700 15 striper file_to_extents extent(30000009691.00000004 (4) in @3 3145728~131072 -> [0,131072]) in @3
2015-06-19 12:36:10.360340 7fca117fa700 10 objectcacher readx extent(30000009691.00000004 (4) in @3 3145728~131072 -> [0,131072])
2015-06-19 12:36:10.360345 7fca117fa700 10 objectcacher.object(30000009691.00000004/head) map_read 30000009691.00000004 3145728~131072
2015-06-19 12:36:10.360349 7fca117fa700 20 objectcacher.object(30000009691.00000004/head) map_read gap bh[ 0x7fc9d8002900 3145728~131072 0x7fc9d966bf10 (0) v 0 missing] waiters = {}
2015-06-19 12:36:10.360407 7fca117fa700 1 -- 10.****:0/65371 --> 10.****.95:6800/31915 -- osd_op(client.97564.0:79410 30000009691.00000004 [read 3145728~131072] 3.c9946208 ack+read+known_if_redirected e12226) v5 -- ?+0 0x7fc9d810d790 con 0x7fccc008a280
2015-06-19 12:36:10.360421 7fca117fa700 20 -- 10.****:0/65371 submit_message osd_op(client.97564.0:79410 30000009691.00000004 [read 3145728~131072] 3.c9946208 ack+read+known_if_redirected e12226) v5 remote, 10.****.95:6800/31915, have pipe.
2015-06-19 12:36:10.361266 7fc9cdfeb700 3 client.97564 ll_release (fh)0x7fc9d9655660 30000009691
2015-06-19 12:36:10.361270 7fc9cdfeb700 5 client.97564 *_release_fh* 0x7fc9d9655660 mode 1 on 30000009691.head(ref=4 ll_ref=7 cap_refs={1024=1,2048=1,4096=0,8192=0} open={1=1,2=0,3=0} mode=100644 size=20971520/0 mtime=2015-06-19 12:35:06.668295 caps=pAsLsXsFscr(2=pAsLsXsFscr) objectset[30000009691 ts 0/0 objects 5 dirty_or_tx 0] parents=0x7fca0cc56b80 0x7fccd002c6b0)
2015-06-19 12:36:10.369798 7fcabb8f8700 1 -- 10.****:0/65371 <== osd.44 10.****.95:6800/31915 1941 ==== osd_op_reply(79410 30000009691.00000004 [read 3145728~131072] v0'0 uv18386 ondisk = 0) v6 ==== 187+0+131072 (2385944524 0 3796294508) 0x7fca90030a70 con 0x7fccc008a280
2015-06-19 12:36:10.398019 7fccf0ce3700 7 objectcacher bh_read_finish 30000009691.00000004/head tid 38326 3145728~131072 (bl is 131072) returned 0 *outstanding reads 31*
2015-06-19 12:36:10.398058 7fccf0ce3700 10 objectcacher readx extent(30000009691.00000004 (4) in @3 3145728~131072 -> [0,131072])
2015-06-19 12:36:10.398063 7fccf0ce3700 10 objectcacher.object(30000009691.00000004/head) map_read 30000009691.00000004 3145728~131072
2015-06-19 12:36:10.398066 7fccf0ce3700 20 objectcacher.object(30000009691.00000004/head) map_read hit bh[ 0x7fc9d957ac90 3112960~163840 0x7fc9d966bf10 (163840) v 0 clean firstbyte=0] waiters = {}
2015-06-19 12:36:10.398206 7fccf0ce3700 20 client.97564 C_Readahead on 0x159ffa4d55839c3a
As shown in the Ceph log, I think the cause is that the file handle is released (ll_release / _release_fh) while readahead reads are still outstanding. When an outstanding read later finishes, its C_Readahead callback accesses the already-freed Fh structure, i.e. a destroyed Mutex.
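The race described above is a classic callback-outlives-owner bug. A minimal single-threaded sketch (names hypothetical, not Ceph's actual code or fix) showing how having the completion hold its own reference to the readahead state keeps it alive past the file-handle release:

```cpp
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for the per-Fh readahead state touched by
// Client::C_Readahead::finish (which calls Readahead::dec_pending).
struct ReadaheadState {
    int pending = 0;
    void dec_pending() { --pending; }
};

// Queue a completion that decrements pending, then drop the owner's
// reference (the "ll_release"), then run the completion. Because the
// lambda captured a shared_ptr, the state is still alive when the
// callback fires; with a raw pointer this would be a use-after-free.
int demo() {
    std::vector<std::function<void()>> finisher;  // queued completions
    auto state = std::make_shared<ReadaheadState>();
    state->pending = 1;
    finisher.push_back([ref = state] { ref->dec_pending(); });
    state.reset();        // owner releases its handle first...
    finisher.front()();   // ...completion fires later, state still valid
    return 0;
}
```

The same effect can be achieved by making release wait for pending readahead completions before freeing the Fh; either way the invariant is that the Readahead (and its mutex) must outlive every in-flight C_Readahead.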
We used ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff) in our I/O tests.