Project

General

Profile

Actions

Bug #4100

closed

rbd: unprotecting a snapshot in the "UNPROTECTING" state fails with EINVAL

Added by Josh Durgin about 11 years ago. Updated about 11 years ago.

Status:
Resolved
Priority:
High
Assignee:
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
bobtail
Regression:
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

As reported on ceph-users, an unprotect earlier was taking too long (possibly due to inactive pgs) and was killed. This left the snapshot in the UNPROTECTING state. Another unprotect resulted in:

# rbd snap unprotect rd/debian_template@snap --debug-rbd 20 --debug-ms 1
2013-02-11 20:52:35.570797 7f89078bd780  1 -- :/0 messenger.start
2013-02-11 20:52:35.571165 7f89078bd780  1 -- :/1009032 --> 10.1.91.11:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- ?+0 0x1a9ada0 con 0x1a9a8b0
2013-02-11 20:52:35.571601 7f89078b9700  1 -- 10.1.91.11:0/1009032 learned my addr 10.1.91.11:0/1009032
2013-02-11 20:52:35.572485 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== mon.0 10.1.91.11:6789/0 1 ==== mon_map v1 ==== 191+0+0 (2908705101 0 0) 0x7f88f80009f0 con 0x1a9a8b0
2013-02-11 20:52:35.572645 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== mon.0 10.1.91.11:6789/0 2 ==== auth_reply(proto 1 0 Success) v1 ==== 24+0+0 (3088633910 0 0) 0x7f88f8000d80
 con 0x1a9a8b0
2013-02-11 20:52:35.572704 7f89029f2700  1 -- 10.1.91.11:0/1009032 --> 10.1.91.11:6789/0 -- mon_subscribe({monmap=0+}) v2 -- ?+0 0x1a9c340 con 0x1a9a8b0
2013-02-11 20:52:35.573048 7f89078bd780  1 -- 10.1.91.11:0/1009032 --> 10.1.91.11:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 0x1a9ba50 con 0x1a9a8b0
2013-02-11 20:52:35.573075 7f89078bd780  1 -- 10.1.91.11:0/1009032 --> 10.1.91.11:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 0x1a9c610 con 0x1a9a8b0
2013-02-11 20:52:35.573370 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== mon.0 10.1.91.11:6789/0 3 ==== mon_map v1 ==== 191+0+0 (2908705101 0 0) 0x7f88f80009f0 con 0x1a9a8b0
2013-02-11 20:52:35.573425 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== mon.0 10.1.91.11:6789/0 4 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (1252080666 0 0) 0x7f88f8000d80 con 0
x1a9a8b0
2013-02-11 20:52:35.573730 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== mon.0 10.1.91.11:6789/0 5 ==== osd_map(55..55 src has 1..55) v3 ==== 12412+0+0 (1795712789 0 0) 0x7f88f8003
9b0 con 0x1a9a8b0
2013-02-11 20:52:35.574030 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== mon.0 10.1.91.11:6789/0 6 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (1252080666 0 0) 0x7f88f8003df0 con 0
x1a9a8b0
2013-02-11 20:52:35.574142 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== mon.0 10.1.91.11:6789/0 7 ==== osd_map(55..55 src has 1..55) v3 ==== 12412+0+0 (1795712789 0 0) 0x7f88f8003
9b0 con 0x1a9a8b0
2013-02-11 20:52:35.574169 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== mon.0 10.1.91.11:6789/0 8 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (1252080666 0 0) 0x7f88f8003df0 con 0
x1a9a8b0
2013-02-11 20:52:35.574232 7f89078bd780 20 librbd: open_image: ictx = 0x1a9cbc0 name = 'debian_template' id = '' snap_name = ''
2013-02-11 20:52:35.574327 7f89078bd780  1 -- 10.1.91.11:0/1009032 --> 10.1.91.11:6809/6197 -- osd_op(client.4444.0:1 debian_template.rbd [stat] 4.13040218) v4 -- ?+0 0x1a9dff0 c
on 0x1a9dcc0
2013-02-11 20:52:35.575691 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== osd.3 10.1.91.11:6809/6197 1 ==== osd_op_reply(1 debian_template.rbd [stat] = -2 (No such file or directory
)) v4 ==== 118+0+0 (3022415672 0 0) 0x7f88e00009a0 con 0x1a9dcc0
2013-02-11 20:52:35.575897 7f89078bd780  1 -- 10.1.91.11:0/1009032 --> 10.1.91.11:6809/6197 -- osd_op(client.4444.0:2 rbd_id.debian_template [stat] 4.5574cf14) v4 -- ?+0 0x1a9dff
0 con 0x1a9dcc0
2013-02-11 20:52:35.576703 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== osd.3 10.1.91.11:6809/6197 2 ==== osd_op_reply(2 rbd_id.debian_template [stat] = 0) v4 ==== 121+0+16 (19411
11872 0 2486853210) 0x7f88e0000dc0 con 0x1a9dcc0
2013-02-11 20:52:35.576788 7f89078bd780 20 librbd: detect format of debian_template : new
2013-02-11 20:52:35.576866 7f89078bd780  1 -- 10.1.91.11:0/1009032 --> 10.1.91.11:6809/6197 -- osd_op(client.4444.0:3 rbd_id.debian_template [call rbd.get_id] 4.5574cf14) v4 -- ?
+0 0x1a9e5b0 con 0x1a9dcc0
2013-02-11 20:52:35.577633 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== osd.3 10.1.91.11:6809/6197 3 ==== osd_op_reply(3 rbd_id.debian_template [call] = 0) v4 ==== 121+0+16 (27681
63342 0 1805449227) 0x7f88e0000a60 con 0x1a9dcc0
2013-02-11 20:52:35.577852 7f89078bd780  1 -- 10.1.91.11:0/1009032 --> 10.1.91.11:6803/5976 -- osd_op(client.4444.0:4 rbd_header.10ed6b8b4567 [call rbd.get_size,call rbd.get_obje
ct_prefix] 4.a33f8c74) v4 -- ?+0 0x1aa2010 con 0x1a9ec50
2013-02-11 20:52:35.579206 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== osd.1 10.1.91.11:6803/5976 1 ==== osd_op_reply(4 rbd_header.10ed6b8b4567 [call,call] = 0) v4 ==== 164+0+34 
(2238100216 0 1423405897) 0x7f88d8000a90 con 0x1a9ec50
2013-02-11 20:52:35.579351 7f89078bd780  1 -- 10.1.91.11:0/1009032 --> 10.1.91.11:6803/5976 -- osd_op(client.4444.0:5 rbd_header.10ed6b8b4567 [call rbd.get_stripe_unit_count] 4.a
33f8c74) v4 -- ?+0 0x1a9dff0 con 0x1a9ec50
2013-02-11 20:52:35.580352 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== osd.1 10.1.91.11:6803/5976 2 ==== osd_op_reply(5 rbd_header.10ed6b8b4567 [call] = 0) v4 ==== 122+0+16 (4210
705 0 74706562) 0x7f88d8000a90 con 0x1a9ec50
2013-02-11 20:52:35.580427 7f89078bd780 10 librbd::ImageCtx: init_layout stripe_unit 4194304 stripe_count 1 object_size 4194304 prefix rbd_data.10ed6b8b4567 format rbd_data.10ed6
b8b4567.%016llx
2013-02-11 20:52:35.580498 7f89078bd780  1 -- 10.1.91.11:0/1009032 --> 10.1.91.11:6803/5976 -- osd_op(client.4444.0:6 rbd_header.10ed6b8b4567 [watch 1~0] 4.a33f8c74) v4 -- ?+0 0x
1aa2960 con 0x1a9ec50
2013-02-11 20:52:35.582238 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== osd.1 10.1.91.11:6803/5976 3 ==== osd_op_reply(6 rbd_header.10ed6b8b4567 [watch 1~0] ondisk = 0) v4 ==== 12
2+0+0 (2406224095 0 0) 0x7f88d8000a90 con 0x1a9ec50
2013-02-11 20:52:35.582346 7f89078bd780 20 librbd: ictx_refresh 0x1a9cbc0
2013-02-11 20:52:35.582437 7f89078bd780  1 -- 10.1.91.11:0/1009032 --> 10.1.91.11:6803/5976 -- osd_op(client.4444.0:7 rbd_header.10ed6b8b4567 [call rbd.get_size,call rbd.get_feat
ures,call rbd.get_snapcontext,call rbd.get_parent,call lock.get_info] 4.a33f8c74) v4 -- ?+0 0x1aa26f0 con 0x1a9ec50
2013-02-11 20:52:35.583438 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== osd.1 10.1.91.11:6803/5976 4 ==== osd_op_reply(7 rbd_header.10ed6b8b4567 [call,call,call,call,call] = 0) v4
 ==== 290+0+88 (1809167792 0 3048450296) 0x7f88d8000a90 con 0x1a9ec50
2013-02-11 20:52:35.583551 7f89078bd780  1 -- 10.1.91.11:0/1009032 --> 10.1.91.11:6803/5976 -- osd_op(client.4444.0:8 rbd_header.10ed6b8b4567 [call rbd.get_snapshot_name,call rbd
.get_size,call rbd.get_features,call rbd.get_parent,call rbd.get_protection_status] 4.a33f8c74) v4 -- ?+0 0x1aa26f0 con 0x1a9ec50
2013-02-11 20:52:35.584565 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== osd.1 10.1.91.11:6803/5976 5 ==== osd_op_reply(8 rbd_header.10ed6b8b4567 [call,call,call,call,call] = 0) v4
 ==== 290+0+62 (2211467652 0 1673312616) 0x7f88d8000ef0 con 0x1a9ec50
2013-02-11 20:52:35.584644 7f89078bd780 20 librbd: new snapshot id=6 name=snap size=5242880000 features=3
rbd: unprotecting snap failed: (22) Invalid argument2013-02-11 20:52:35.584724 7f89078bd780 20 librbd: snap_unprotect 0x1a9cbc0 snap

2013-02-11 20:52:35.584730 7f89078bd780 20 librbd: ictx_check 0x1a9cbc0
2013-02-11 20:52:35.584874 7f89078bd780 20 librbd: close_image 0x1a9cbc0
2013-02-11 20:52:35.584878 7f89078bd780 20 librbd: flush 0x1a9cbc0
2013-02-11 20:52:35.584880 7f89078bd780 20 librbd: ictx_check 0x1a9cbc0
2013-02-11 20:52:35.584904 7f89078bd780  1 -- 10.1.91.11:0/1009032 --> 10.1.91.11:6803/5976 -- osd_op(client.4444.0:9 rbd_header.10ed6b8b4567 [watch 1~0] 4.a33f8c74) v4 -- ?+0 0x
1aa4480 con 0x1a9ec50
2013-02-11 20:52:35.586857 7f89029f2700  1 -- 10.1.91.11:0/1009032 <== osd.1 10.1.91.11:6803/5976 6 ==== osd_op_reply(9 rbd_header.10ed6b8b4567 [watch 1~0] = 0) v4 ==== 122+0+0 (
545292060 0 0) 0x7f88d8000ae0 con 0x1a9ec50
2013-02-11 20:52:35.587162 7f89078bd780  1 -- 10.1.91.11:0/1009032 mark_down 0x1a9ec50 -- 0x1a9ea00
2013-02-11 20:52:35.587259 7f89078bd780  1 -- 10.1.91.11:0/1009032 mark_down 0x1a9dcc0 -- 0x1a9da70
2013-02-11 20:52:35.587517 7f89078bd780  1 -- 10.1.91.11:0/1009032 mark_down_all
2013-02-11 20:52:35.587645 7f89078bd780  1 -- 10.1.91.11:0/1009032 shutdown complete.
Actions #1

Updated by Josh Durgin about 11 years ago

  • Status changed from 12 to Fix Under Review

wip-snap-unprotect

Actions #2

Updated by Ian Colle about 11 years ago

  • Assignee changed from Josh Durgin to Dan Mick

Dan - please review the wip branch.

Actions #3

Updated by Dan Mick about 11 years ago

  • Assignee changed from Dan Mick to Josh Durgin
Actions #4

Updated by Josh Durgin about 11 years ago

  • Status changed from Fix Under Review to Resolved

commit:fe283813b44a7c45def6768ea0788a3a0635957e and commit:bfb4482c4596759b464caf45f8f30368898519d8 in bobtail.

Actions

Also available in: Atom PDF