Bug #38553
rbd: race condition in rbd removing
Status:
Resolved
Priority:
Normal
Assignee:
-
Target version:
-
% Done:
0%
Regression:
No
Severity:
3 - minor
Description
When the same rbd image is removed from different processes concurrently, rbd can hit a segmentation fault. A simple reproduction script is below:
[root@atest-guest build]# cat test_remove.sh
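NUM=10

# Create NUM small images and write some data to each.
for i in $(seq 1 $NUM); do
    rbd create test_$i -s 1M >> out/remove.log 2>&1
    rbd bench-write test_$i --io-size 4098 --io-total 1M >> out/remove.log 2>&1
done

# Start one background removal per image.
for i in $(seq 1 $NUM); do
    rbd remove test_$i >/dev/null 2>&1 &
done

# Start a second background removal for every image still listed, so that
# two rbd processes can race to remove the same image.
for i in $(rbd ls); do
    rbd remove $i >/dev/null 2>&1 &
done

wait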
[root@atest-guest build]# sh test_remove.sh
test_remove.sh: line 22: 27565 Segmentation fault (core dumped) rbd remove $i > /dev/null 2>&1
test_remove.sh: line 22: 27569 Segmentation fault (core dumped) rbd remove $i > /dev/null 2>&1
test_remove.sh: line 22: 27575 Segmentation fault (core dumped) rbd remove $i > /dev/null 2>&1
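The "Segmentation fault" lines above are printed by the shell when a background child is killed by SIGSEGV; `wait` reports the same event as exit status 139 (128 + 11). To count how often the race triggers in one run, a helper along these lines could be used. This is only a sketch: the name `count_segfaults` and its calling convention are hypothetical and not part of rbd or of the original script.

```shell
#!/bin/sh
# Hypothetical helper (not part of rbd): run CMD once per remaining argument,
# all in the background, and print how many instances were killed by SIGSEGV.
# Usage: count_segfaults CMD ARG [ARG...]
count_segfaults() {
    cmd=$1
    shift
    pids=""
    for arg in "$@"; do
        # $cmd is deliberately unquoted so that "rbd remove" splits into words.
        $cmd "$arg" >/dev/null 2>&1 &
        pids="$pids $!"
    done
    crashes=0
    for pid in $pids; do
        wait "$pid"
        # 139 = 128 + 11: the child was terminated by SIGSEGV.
        if [ $? -eq 139 ]; then
            crashes=$((crashes + 1))
        fi
    done
    echo "$crashes"
}
```

For example, `count_segfaults "rbd remove" test_1 test_1 test_1` would start three concurrent removals of the same image and print the number that crashed. Repeating an image name is what makes the removals race, in the same way as the last two loops of the script.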
There is a quick fix for it:
--- a/src/journal/Journaler.cc
+++ b/src/journal/Journaler.cc
@@ -244,7 +244,10 @@ void Journaler::remove(bool force, Context *on_finish) {
     });
   on_finish = new FunctionContext([this, force, on_finish](int r) {
-      m_trimmer->remove_objects(force, on_finish);
+      if (m_trimmer != nullptr)
+        m_trimmer->remove_objects(force, on_finish);
+      else
+        on_finish->complete(r);
     });
   m_metadata->shut_down(on_finish);
But I think the correct fix would be to refuse to release the exclusive lock while an rbd image is being removed.
Hi Jason, what's your opinion?
History
#1 Updated by Jason Dillaman almost 5 years ago
Do you still hit this on Nautilus? It moves images to the trash as the first step before removing, so I would think it would fail before it gets to this point.
#2 Updated by Jason Dillaman almost 5 years ago
- Status changed from New to Need More Info
#3 Updated by Yang Dongsheng almost 5 years ago
Jason Dillaman wrote:
Do you still hit this on Nautilus? It moves images to the trash as the first step before removing, so I would think it would fail before it gets to this point.
It's gone in Nautilus. Thanks a lot.
#4 Updated by Jason Dillaman almost 5 years ago
- Status changed from Need More Info to Resolved