Bug #42828
Status: closed
rbd journal err assert(ictx->journal != __null) when release exclusive_lock
% Done: 0%
Regression: No
Severity: 3 - minor
ceph-qa-suite: rbd
Description
Ceph version: 12.2.12
Reproduction:
1. Create an RBD image with the journaling feature enabled.
2. Run fio against the image from two clients, using the following command on each:
fio --name test --rw=randwrite --bs=4k --runtime=3600 --ioengine=rbd --clientname=admin --pool=poolclz --rbdname=rbdclz --iodepth=128 --numjobs=1 --direct=1 --group_reporting --time_based=1 --eta-newline 1 --fsync=1
3. Wait a while; the process aborts and dumps core with the following assertion failure:
/root/rpmbuild/BUILD/ceph-12.2.12-1/src/librbd/io/AioCompletion.cc: 86: FAILED assert(ictx->journal != __null)
Backtrace from the core dump:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm -name guest=instance-00000082,debug-threads=on -S -object'.
Program terminated with signal 6, Aborted.
#0 0x00007f42a6f261f7 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install qemu-kvm-2.9.0-16.el7.centos.es_4.5.1.2.x86_64
(gdb) bt
#0 0x00007f42a6f261f7 in raise () from /lib64/libc.so.6
#1 0x00007f42a6f278e8 in abort () from /lib64/libc.so.6
#2 0x00007f429c02b744 in ceph::__ceph_assert_fail(char const*, char const*, int, char const*) () from /usr/lib64/ceph/libceph-common.so.0
#3 0x00007f42ac35c4d3 in librbd::io::AioCompletion::complete() () from /lib64/librbd.so.1
#4 0x00007f42ac35d14f in librbd::io::AioCompletion::complete_request(long) () from /lib64/librbd.so.1
#5 0x00007f42ac35e511 in librbd::io::(anonymous namespace)::C_FlushJournalCommit<librbd::ImageCtx>::finish(int) () from /lib64/librbd.so.1
#6 0x00007f42ac29c829 in Context::complete(int) () from /lib64/librbd.so.1
#7 0x00007f42ac2aaac4 in ContextWQ::process(Context*) () from /lib64/librbd.so.1
#8 0x00007f429c03385e in ThreadPool::worker(ThreadPool::WorkThread*) () from /usr/lib64/ceph/libceph-common.so.0
#9 0x00007f429c034780 in ThreadPool::WorkThread::entry() () from /usr/lib64/ceph/libceph-common.so.0
#10 0x00007f42a72bbe25 in start_thread () from /lib64/libpthread.so.0
#11 0x00007f42a6fe934d in clone () from /lib64/libc.so.6
The cause is that the exclusive_lock pre-release step does not wait for all AIO to complete. If an AIO flush completes just after the journal has been closed, assert(ictx->journal != __null) fails.
I wonder whether the exclusive_lock pre-release step should wait for all AIO to complete, or whether something else also needs to be considered.
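The ordering suggested above can be sketched as follows. This is only a minimal, single-threaded illustration of "drain in-flight operations before closing the journal"; it is not librbd code. The names Image, Journal, start_op, finish_op, release, and run_demo are hypothetical, and a real fix would also need proper locking.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for the image journal; not a librbd type.
struct Journal {
  int commits = 0;
};

// Hypothetical image context that counts in-flight async operations and
// defers closing the journal until the last one has completed.
class Image {
public:
  Image() : journal_(new Journal()) {}

  // An async operation was submitted.
  void start_op() { ++in_flight_; }

  // Completion path: it requires an open journal, mirroring the
  // assertion that failed in the report.
  void finish_op() {
    assert(journal_ != nullptr);
    ++journal_->commits;
    --in_flight_;
    if (in_flight_ == 0 && release_pending_) {
      close_journal();
    }
  }

  // Release path: close immediately only if nothing is in flight;
  // otherwise remember the request and let the last completion close.
  void release(std::function<void(int)> on_released) {
    on_released_ = std::move(on_released);
    release_pending_ = true;
    if (in_flight_ == 0) {
      close_journal();
    }
  }

  bool journal_open() const { return journal_ != nullptr; }

private:
  // Closes the journal and reports how many commits it saw.
  void close_journal() {
    int commits = journal_ ? journal_->commits : 0;
    journal_.reset();
    release_pending_ = false;
    std::function<void(int)> cb = std::move(on_released_);
    on_released_ = nullptr;
    if (cb) cb(commits);
  }

  std::unique_ptr<Journal> journal_;
  std::function<void(int)> on_released_;
  int in_flight_ = 0;
  bool release_pending_ = false;
};

// Requests a release while n operations are still in flight, then
// completes them. Returns the commit count reported at close time, or
// -1 if the journal was closed too early or never closed.
int run_demo(int n) {
  Image image;
  for (int i = 0; i < n; ++i) image.start_op();
  int seen = -1;
  image.release([&seen](int commits) { seen = commits; });
  bool open_while_busy = image.journal_open();
  for (int i = 0; i < n; ++i) image.finish_op();
  bool ok = (open_while_busy || n == 0) && !image.journal_open();
  return ok ? seen : -1;
}
```

In this sketch the release request never closes the journal while operations are outstanding, so the assertion in finish_op cannot fire. Whether this is the right place to add such a wait in librbd is exactly the open question raised above.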