Bug #9362
librados, rados_read corrupts memory on timeout
0%
Description
If you configure librados with rados_osd_op_timeout, timeouts on rados_read will result in memory corruptions and segmentation faults.
If rados_read returns with a timeout (-110) the output buffer provided to rados_read is accessed after rados_read returned. This either results in the given buffer changing content after the call or, if the buffer is freed immediately, in a memory corruption.
During testing the timeout feature in a multithreading environment we experienced many crashes of the processes using librados and even worse had corrupted objects being written to ceph. These corrupted objects contained content of an object where rados_read resulted in a timeout at exactly the same time where the corrupted object was written to ceph.
Attached you can find a small program to reproduce the problem.
The problem was reproduced with librados 0.67.10 and 0.80.5
Related issues
Associated revisions
osdc/Objecter: revoke rx_buffer on op_cancel
If we cancel a read, revoke the rx buffers to avoid a use-after-free and/or
other undefined badness by using user buffers that may no longer be
present.
Fixes: #9362
Backport: firefly, dumpling
Reported-by: Matthias Kiefer <matthias.kiefer@1und1.de>
Signed-off-by: Sage Weil <sage@redhat.com>
osdc/Objecter: revoke rx_buffer on op_cancel
If we cancel a read, revoke the rx buffers to avoid a use-after-free and/or
other undefined badness by using user buffers that may no longer be
present.
Fixes: #9362
Backport: firefly, dumpling
Reported-by: Matthias Kiefer <matthias.kiefer@1und1.de>
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 2305b2897acba38384358c33ca3bbfcae6f1c74e)
(adjusted for op->con instead of s->con)
osdc/Objecter: revoke rx_buffer on op_cancel
If we cancel a read, revoke the rx buffers to avoid a use-after-free and/or
other undefined badness by using user buffers that may no longer be
present.
Fixes: #9362
Backport: firefly, dumpling
Reported-by: Matthias Kiefer <matthias.kiefer@1und1.de>
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 2305b2897acba38384358c33ca3bbfcae6f1c74e)
(adjusted for op->con instead of s->con)
History
#1 Updated by Matthias Kiefer about 9 years ago
I have also tried to reproduce this problem with a firewall dropping incoming packages from the primary osd the object is located on. In this case rados_read also returns the timeout error, but nothing is written into the buffer and no crash happens.
#2 Updated by Sage Weil about 9 years ago
- Priority changed from High to Urgent
which version of librados is this? Thanks!
#3 Updated by Greg Farnum about 9 years ago
Sage pointed out elsewhere (and I'm with him) that it looks like the actual response is coming in and then the messenger is writing into the given data buffers. We'll need to clear them out.
I pushed the minimal patch to wip-dumpling-9362. (Obviously, it's based on the dumpling branch, which will eventually be .67.11.) It has absolutely no testing, but if you can test it (just on the client; this won't require any changes to the OSDs) and let us know...
#4 Updated by Sage Weil about 9 years ago
- Status changed from New to Fix Under Review
- Source changed from other to Community (dev)
#5 Updated by Sage Weil about 9 years ago
- Assignee set to Sage Weil
#6 Updated by Matthias Kiefer about 9 years ago
- File core.segfault.gz added
- Failed assertion:
osdc/Objecter.cc: In function 'void Objecter::handle_osd_op_reply(MOSDOpReply*)' thread 7f6d487f4700 time 2014-09-09 08:53:30.384945 osdc/Objecter.cc: 2453: FAILED assert(initialized.read()) ceph version 0.85-684-g74f22fd (74f22fdf4f98f18c2632f213e2a3baeba6576d27) 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f6d509e56ab] 2: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x1a2c) [0x7f6d5096a1fc] 3: (Objecter::ms_dispatch(Message*)+0x283) [0x7f6d50973d43] 4: (DispatchQueue::fast_dispatch(Message*)+0x56) [0x7f6d50af1196] 5: (Pipe::reader()+0x1a54) [0x7f6d50b56f84] 6: (Pipe::Reader::entry()+0xd) [0x7f6d50b60b9d] 7: (()+0x8182) [0x7f6d4fa85182] 8: (clone()+0x6d) [0x7f6d501affbd] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. terminate called after throwing an instance of 'ceph::FailedAssertion'
- Segmentation Fault: (stack trace from core dump (attached))
Program terminated with signal SIGSEGV, Segmentation fault. #0 pthread_rwlock_wrlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S:41 41 ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S: No such file or directory. (gdb) where #0 pthread_rwlock_wrlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S:41 #1 0x00007fd6aa5b02cf in RWLock::get_write (this=0x18, lockdep=<optimized out>) at ./common/RWLock.h:88 #2 0x00007fd6aa5964f0 in Objecter::op_cancel (this=0xa1dd10, s=0x0, tid=20535, r=-110) at osdc/Objecter.cc:1851 #3 0x00007fd6aa55ff29 in Context::complete (this=0xa1b820, r=<optimized out>) at ./include/Context.h:64 #4 0x00007fd6aa61c65f in RWTimer::timer_thread (this=0xa1de30) at common/Timer.cc:268 #5 0x00007fd6aa61db2d in RWTimerThread::entry (this=<optimized out>) at common/Timer.cc:200 #6 0x00007fd6a96bf182 in start_thread (arg=0x7fd6a4e35700) at pthread_create.c:312 #7 0x00007fd6a9de9fbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
#7 Updated by Sage Weil about 9 years ago
Thanks, Matthias. Those look like unrelated problems with the master branch.
#8 Updated by Greg Farnum about 9 years ago
- Status changed from Fix Under Review to Pending Backport
Merged to master in 0d66922cfbe5ea6c0c8dc57e2ecf9af58e6926de.
#9 Updated by Sage Weil about 9 years ago
- Status changed from Pending Backport to Resolved
#10 Updated by Matthias Kiefer about 9 years ago
I did another test today using the build from http://gitbuilder.ceph.com/ceph-deb-wheezy-x86_64-basic/ref/wip-dumpling-9362/pool/main/c/ceph/ and reliably get a crash (if the timeout is set too a low enough value so that timeouts happen) with the following assertion:
/usr/include/boost/smart_ptr/intrusive_ptr.hpp:161: T* boost::intrusive_ptr<T>::operator->() const [with T = Connection]: Assertion `px != 0' failed.
Can it be that the patch in the dumpling branch (https://github.com/ceph/ceph/commit/919cc180b57dbca1e2917f69d02d198f04086d8d)
Op *op = p->second; + + op->con->revoke_rx_buffer(op->tid); + if (op->onack) {
is missing the pointer check done in the master branch (https://github.com/ceph/ceph/commit/2305b2897acba38384358c33ca3bbfcae6f1c74e)?:
+ if (s->con) { + ldout(cct, 20) << " revoking rx buffer for " << tid + << " on " << s->con << dendl; + s->con->revoke_rx_buffer(tid); + } +
#11 Updated by Matthias Kiefer about 9 years ago
Update: The patch branch I used did not contain the complete code that has been merged to the dumpling branch. Using version http://gitbuilder.ceph.com/ceph-deb-wheezy-x86_64-basic/ref/dumpling/pool/main/c/ceph/librados2_0.67.10-13-g3f02044-1wheezy_amd64.deb, I can confirm that the crash is indeed fixed in the dumpling branch.
However, I found another crash in librados for which I created a new issue: http://tracker.ceph.com/issues/9582
Thanks for fixing this problem!