Bug #51419

bufferlist::splice() may cause stack corruption in bufferlist::rebuild_aligned_size_and_memory()

Added by CONGMIN YIN over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: octopus,pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

*** stack smashing detected ***: terminated

Thread 138 "tp_pwl" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffe0d8d700 (LWP 5815)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff6021859 in __GI_abort () at abort.c:79
#2 0x00007ffff608c3ee in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff61b607c "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007ffff612eb4a in __GI___fortify_fail (msg=msg@entry=0x7ffff61b6064 "stack smashing detected") at fortify_fail.c:26
#4 0x00007ffff612eb16 in __stack_chk_fail () at stack_chk_fail.c:24
#5 0x00007ffff7826116 in ceph::buffer::v15_2_0::list::rebuild_aligned_size_and_memory (this=0x7fffe0d890e0, align_size=4096, align_memory=4096, max_buffers=1024) at ../src/common/buffer.cc:1266
#6 0x00007fffe3c3be65 in KernelDevice::aio_write (this=0x555556025900, off=8192, bl=..., ioc=0x55555b3f9c08, buffered=false, write_hint=0) at ../src/blk/kernel/KernelDevice.cc:938
#7 0x00007fffe3c01122 in librbd::cache::pwl::ssd::WriteLog<librbd::ImageCtx>::write_log_entries (this=0x5555560e6a00, log_entries=std::vector of length 32, capacity 32 = {...}, aio=0x55555b3f9c00,
pos=0x5555560515a0) at ../src/librbd/cache/pwl/ssd/WriteLog.cc:841
#8 0x00007fffe3c00620 in librbd::cache::pwl::ssd::WriteLog<librbd::ImageCtx>::append_ops (this=0x5555560e6a00, Python Exception <class 'AttributeError'> 'NoneType' object has no attribute 'pointer':
ops=std::__cxx11::list, ctx=0x55555df9d6c0, new_first_free_entry=0x5555560515a0)
at ../src/librbd/cache/pwl/ssd/WriteLog.cc:759
#9 0x00007fffe3bffcd5 in librbd::cache::pwl::ssd::WriteLog<librbd::ImageCtx>::append_op_log_entries (this=0x5555560e6a00, Python Exception <class 'AttributeError'> 'NoneType' object has no attribute 'pointer':
ops=std::__cxx11::list) at ../src/librbd/cache/pwl/ssd/WriteLog.cc:455
#10 0x00007fffe3bfcea3 in librbd::cache::pwl::ssd::WriteLog<librbd::ImageCtx>::append_scheduled_ops (this=0x5555560e6a00) at ../src/librbd/cache/pwl/ssd/WriteLog.cc:388
#11 0x00007fffe3c0453c in librbd::cache::pwl::ssd::WriteLog<librbd::ImageCtx>::enlist_op_appender()::{lambda(int)#1}::operator()(int) const (this=0x5555560e6a00, r=0)
at ../src/librbd/cache/pwl/ssd/WriteLog.cc:326
#12 0x00007fffe3c13598 in LambdaContext<librbd::cache::pwl::ssd::WriteLog<librbd::ImageCtx>::enlist_op_appender()::{lambda(int)#1}>::finish(int) (this=0x55555b323970, r=0)
at ../src/include/Context.h:166
#13 0x00007ffff4c24da7 in Context::complete (this=0x55555b323970, r=0) at ../src/include/Context.h:99
#14 0x00007fffe3b8ae4b in ContextWQ::process (this=0x5555560e7248, ctx=0x55555b323970) at ../src/common/WorkQueue.h:562
#15 0x00007fffe3bc5b9c in ThreadPool::PointerWQ<Context>::_void_process (this=0x5555560e7248, item=0x55555b323970, handle=...) at ../src/common/WorkQueue.h:347
#16 0x00007ffff4c50e75 in ThreadPool::worker (this=0x5555560e6c90, wt=0x55555b062240) at ../src/common/WorkQueue.cc:117
#17 0x00007ffff4c55122 in ThreadPool::WorkThread::entry (this=0x55555b062240) at ../src/common/WorkQueue.h:401
#18 0x00007ffff4c2c055 in Thread::entry_wrapper (this=0x55555b062240) at ../src/common/Thread.cc:87
#19 0x00007ffff4c2bfc4 in Thread::_entry_func (arg=0x55555b062240) at ../src/common/Thread.cc:74
#20 0x00007ffff7d84609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#21 0x00007ffff611e293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95


Related issues

Related to RADOS - Bug #53969: BufferList.rebuild_aligned_size_and_memory failure (Resolved)
Copied to RADOS - Backport #51604: octopus: bufferlist::splice() may cause stack corruption in bufferlist::rebuild_aligned_size_and_memory() (Resolved)
Copied to RADOS - Backport #51605: pacific: bufferlist::splice() may cause stack corruption in bufferlist::rebuild_aligned_size_and_memory() (Resolved)

History

#1 Updated by Kefu Chai over 2 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 42112

#2 Updated by Kefu Chai over 2 years ago

  • Project changed from rbd to RADOS

#3 Updated by Ilya Dryomov over 2 years ago

  • Subject changed from [pwl ssd]sig abort to bufferlist::splice() may cause stack corruption in bufferlist::rebuild_aligned_size_and_memory()
  • Assignee set to CONGMIN YIN

#4 Updated by Ilya Dryomov over 2 years ago

Initially triggered with fio while testing the RBD persistent write-back cache in SSD mode:

[global]
ioengine=rbd
clientname=admin
rw=randwrite
bs=16k
time_based=1
runtime=20s
iodepth=16
pool=rbd
group_reporting

[volumes]
rbdname=fio-test
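
The job file above only drives I/O through librbd; the persistent write-back cache itself is enabled on the client side. A minimal sketch of such a client configuration follows, assuming the option names given in Ceph's RBD persistent write log cache documentation. The cache path and size are placeholders, not values taken from this report.

```ini
[client]
# Load the persistent write log cache plugin in librbd.
rbd_plugins = pwl_cache
# "ssd" selects the SSD-backed cache mode used in this report.
rbd_persistent_cache_mode = ssd
# Placeholder: directory on the local SSD that holds the cache file.
rbd_persistent_cache_path = /mnt/pwl-cache
# Placeholder: cache size in bytes (1 GiB here).
rbd_persistent_cache_size = 1073741824
```

According to the same documentation, the cache is only used when the image has the exclusive-lock feature enabled, so the fio-test image would need that feature for the job to exercise the code path seen in the backtrace.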

#5 Updated by Josh Durgin over 2 years ago

  • Backport set to octopus, pacific

#6 Updated by Ilya Dryomov over 2 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport changed from octopus, pacific to octopus,pacific

#7 Updated by Backport Bot over 2 years ago

  • Copied to Backport #51604: octopus: bufferlist::splice() may cause stack corruption in bufferlist::rebuild_aligned_size_and_memory() added

#8 Updated by Backport Bot over 2 years ago

  • Copied to Backport #51605: pacific: bufferlist::splice() may cause stack corruption in bufferlist::rebuild_aligned_size_and_memory() added

#9 Updated by Loïc Dachary over 2 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".

#10 Updated by Neha Ojha about 2 years ago

  • Related to Bug #53969: BufferList.rebuild_aligned_size_and_memory failure added
