Bug #38553

rbd: race condition in rbd removing

Added by Yang Dongsheng about 5 years ago. Updated almost 5 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When the same rbd image is removed concurrently from different processes, a segmentation fault is possible. A simple script to reproduce it:

[root@atest-guest build]# cat test_remove.sh 
NUM=10

for i in `seq 1 $NUM `;
do
        rbd create test_$i -s 1M >> out/remove.log 2>&1
        rbd bench-write test_$i --io-size 4098 --io-total 1M >> out/remove.log 2>&1
done

for i in `seq 1 $NUM `;
do
        rbd remove test_$i >/dev/null 2>&1 &
done

for i in `rbd ls`;
do
        rbd remove $i >/dev/null 2>&1 &
done

wait

[root@atest-guest build]# sh test_remove.sh 
test_remove.sh: line 22: 27565 Segmentation fault      (core dumped) rbd remove $i > /dev/null 2>&1
test_remove.sh: line 22: 27569 Segmentation fault      (core dumped) rbd remove $i > /dev/null 2>&1
test_remove.sh: line 22: 27575 Segmentation fault      (core dumped) rbd remove $i > /dev/null 2>&1

There is a quick fix for it:

--- a/src/journal/Journaler.cc
+++ b/src/journal/Journaler.cc
@@ -244,7 +244,10 @@ void Journaler::remove(bool force, Context *on_finish) {
     });

   on_finish = new FunctionContext([this, force, on_finish](int r) {
-      m_trimmer->remove_objects(force, on_finish);
+      if (m_trimmer != nullptr)
+        m_trimmer->remove_objects(force, on_finish);
+      else
+        on_finish->complete(r);
     });

   m_metadata->shut_down(on_finish);
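
To show the guard in isolation, here is a minimal standalone sketch of the same pattern. `FakeJournaler`, `FakeTrimmer` and `Completion` are hypothetical stand-ins, not the Ceph classes, and modelling the racy case as a null trimmer is an assumption based only on the patch above.

```cpp
#include <cassert>
#include <functional>
#include <memory>

using Completion = std::function<void(int)>;

// Hypothetical stand-in for the trimmer; not the real Ceph class.
struct FakeTrimmer {
  int remove_calls = 0;
  void remove_objects(bool /*force*/, const Completion& on_finish) {
    ++remove_calls;
    on_finish(0);  // pretend the object removal succeeded
  }
};

// Hypothetical stand-in for Journaler, reduced to the guarded step.
struct FakeJournaler {
  std::unique_ptr<FakeTrimmer> trimmer;  // null models the racy case (assumption)

  // shut_down_result plays the role of the r delivered after the metadata shut-down.
  void remove(bool force, int shut_down_result, Completion on_finish) {
    Completion after_shut_down = [this, force, on_finish](int r) {
      if (trimmer != nullptr) {
        trimmer->remove_objects(force, on_finish);
      } else {
        // Nothing to trim: forward the result instead of dereferencing null.
        on_finish(r);
      }
    };
    after_shut_down(shut_down_result);
  }
};
```

With a trimmer present, `remove` delegates to `remove_objects` as before; with a null trimmer, the callback receives the shut-down result directly and no null pointer is dereferenced.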

However, I think the correct fix is to refuse to release the exclusive lock while an rbd removal is in progress.

Hi Jason, what's your opinion?

History

#1 Updated by Jason Dillaman almost 5 years ago

Do you still hit this on Nautilus? It moves images to the trash as the first step before removing, so I would think it would fail before it gets to this point.

#2 Updated by Jason Dillaman almost 5 years ago

  • Status changed from New to Need More Info

#3 Updated by Yang Dongsheng almost 5 years ago

Jason Dillaman wrote:

Do you still hit this on Nautilus? It moves images to the trash as the first step before removing, so I would think it would fail before it gets to this point.

It's gone in Nautilus. Thanks a lot.

#4 Updated by Jason Dillaman almost 5 years ago

  • Status changed from Need More Info to Resolved
