Bug #2311


rbd: delete + create image led to EEXIST

Added by Sage Weil about 12 years ago. Updated about 12 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: librbd
% Done: 0%
Source: Community (user)

Description

Here is a sequence copy-n-pasted:

rbd rm data/905-testdisk.rbd
Removing image: 100% complete...done.
rbd create --size 2048 data/905-testdisk.rbd
rbd rm data/906-testdisk.rbd
Removing image: 100% complete...done.
rbd create --size 2048 data/906-testdisk.rbd
2012-04-18 15:40:55.321093 7fbdcae3d760 librbd: rbd image header 906-testdisk.rbd.rbd already exists
create error: (17) File exists
Error 0 creating data/906-testdisk.rbd

So now I stopped testing. I should have attached two logfiles where the earlier delete of 907-testdisk.rbd failed...

Hope it helps,

Oliver.

[see attachments on #2178]
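
For anyone trying to reproduce this, a minimal sketch along the lines of the pasted sequence (assuming the pool "data" and that the test images already exist; names are just the ones from this report):

for i in 905 906 907; do
    rbd rm data/${i}-testdisk.rbd                   # delete the existing image
    rbd create --size 2048 data/${i}-testdisk.rbd   # recreate it; EEXIST here is the bug
    rbd info data/${i}-testdisk.rbd                 # sanity check on the fresh image
done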



Actions #1

Updated by Sage Weil about 12 years ago

Is it possible there is some other user, or the logs are from the wrong cluster?

I see:
- client.13507 deletes 906-testdisk.rbd (successfully)
- client.13508 creates 906-testdisk.rbd (presumably successfully)

I went looking for ways that 13508 could have gotten that error, and it is only generated at the beginning of the create sequence, before it does some subsequent work... and I've verified that the subsequent work was done.

What commit are you running? Maybe I'm looking at the wrong version of the code.
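
One way to double-check the state the create actually hit (a sketch, assuming the old image format, where the header lives in an object named "<image>.rbd" in the same pool) is to look at the rados objects directly:

rados -p data stat 906-testdisk.rbd.rbd    # EEXIST from create implies this header object is still there
rados -p data ls | grep 906-testdisk       # any other leftover objects for the image
rbd ls data                                # whether the image is (still) listed in the rbd directory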

Actions #2

Updated by Sage Weil about 12 years ago

  • Description updated (diff)
Actions #3

Updated by Oliver Francke about 12 years ago

Congrats on closing the annoying ticket #2178 :-D

Fair enough to have a new one for this issue; here are my last notes from #2178:

---
Hi Sage,

sorry, I was not clear enough. The logfiles provide information for "907-testdisk.rbd..." not "906..."
The version was from the continued tests with former 0.44-2 wip-reorder...

Hope it helps,

Oliver.


Actions #4

Updated by Sage Weil about 12 years ago

  • Status changed from Need More Info to Resolved

This is 'rbd writeback window' at its best. Long live 'rbd cache'!

Actions #5

Updated by Oliver Francke about 12 years ago

Hi Sage,

uhm, this is not solved yet as of ceph version 0.45-207-g3053e47 (3053e4773bae93cfa3158882aa4963803862f9b2):

rbd rm data/907-testdisk.rbd
Removing image: 100% complete...done.
rbd create --size 2048 data/907-testdisk.rbd
create error: 2012-04-19 10:42:20.857085 7f5c140c5760 -1 librbd: rbd image header 907-testdisk.rbd.rbd already exists
(17) File exists
Error 0 creating data/907-testdisk.rbd

No entries in the logs.

Oliver.

Actions #6

Updated by Sage Weil about 12 years ago

  • Status changed from Resolved to In Progress

Can you generate a log? Ideally with 'debug ms = 1'.

Also, could you attach the output of 'ceph --show-config'?

Thanks!
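
For what it's worth, one way to capture both on the client side (a sketch; --debug-ms and --log-file are the generic ceph config overrides, and the paths are arbitrary):

rbd --debug-ms 1 --log-file /tmp/rbd-create.log create --size 2048 data/906-testdisk.rbd
ceph --show-config > /tmp/ceph-show-config.txt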

Actions #7

Updated by Sage Weil about 12 years ago

  • Status changed from In Progress to Need More Info

Actions #8

Updated by Oliver Francke about 12 years ago

Well, here we go with some output:

rbd create --size 2048 data/906-testdisk.rbd
create error: 2012-04-24 09:36:01.594379 7f53fc840760 -1 librbd: rbd image header 906-testdisk.rbd.rbd already exists(17) File exists

Error 0 creating data/906-testdisk.rbd

logfiles and --show-config output attached.

Hope this helps,

Oliver.

P.S.: Please have a look at http://tracker.newdream.net/issues/2316, because this one is a complete show-stopper for us. Thanks.

Actions #9

Updated by Sage Weil about 12 years ago

  • Status changed from Need More Info to Resolved
  • Priority changed from Normal to High

Ah, the problem is that the rbd head object has watchers (it is mounted) and the delete request returned EBUSY, but librbd wasn't propagating the error. That second part is fixed... see #2339 for the osd piece.
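
To confirm the diagnosis on a live system, something along these lines should work (a sketch; 'listwatchers' needs a rados tool that supports it, and the object name assumes the old-format header "<image>.rbd"):

rados -p data listwatchers 907-testdisk.rbd.rbd   # any output means a client still holds the header open
rbd rm data/907-testdisk.rbd                      # with the librbd fix, this should now report EBUSY instead of appearing to succeed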

Actions #10

Updated by Oliver Francke about 12 years ago

Hi Sage,

yeah, this time it might have been from a stale VM, but the other tests should have shown that I normally stop all VMs. It happened there, too...

Oops, it just happened again from a definitely stopped VM; it could have been stale, though, up until this morning, when I killed all qemu processes...

I just restarted ceph with debug ms = 1; perhaps something valuable will show up in the logs after the next attempt to remove...?

Kind regards,

Oliver.
