Project

General

Profile

Bug #8845

Flattening Clones of clone, results in command failure

Added by Ramakrishnan Periyasamy over 6 years ago. Updated about 6 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
firefly, dumpling
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

1. Created clone of clone in below manner
Create a Pool i.e. pool1
create a rbd i.e.rbd1
create a snapshot for rbd1 i.e. snap1
create a clone for snap1 i.e. clone1
Again create snapshot using clone1 i.e. snap2
Again create clone using snap2 i.e. clone2

2. First flattened Clone1 its passed(now clone1 is individual image)
3. Try flattening clone2, observed flatten command failure

sudo rbd flatten strm14/strm14clone2snap1clone
Image flatten: 1% complete...*** Caught signal (Segmentation fault)
in thread 7fb474f2b700
1: rbd() [0x42258f]
2: (()+0x10340) [0x7fb47ba0a340]
3: (librbd::aio_read(librbd::ImageCtx*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > > const&, char*, ceph::buffer::list*, librbd::AioCompletion*)+0x24) [0x7fb47cbc3854]
4: (librbd::AioRequest::read_from_parent(std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > >&)+0x85) [0x7fb47cba8565]
5: (librbd::AioRead::should_complete(int)+0x2b2) [0x7fb47cba8a22]
6: (librbd::rados_req_cb(void*, void*)+0x1b) [0x7fb47cbb2c0b]
7: (librados::C_AioComplete::finish(int)+0x1d) [0x7fb47bdecf1d]
8: (Context::complete(int)+0x9) [0x7fb47cba79b9]
9: (Finisher::finisher_thread_entry()+0x1b8) [0x7fb47be7aee8]
10: (()+0x8182) [0x7fb47ba02182]
11: (clone()+0x6d) [0x7fb47abd930d]
2014-07-15 16:17:31.629207 7fb474f2b700 -1
Caught signal (Segmentation fault) *
in thread 7fb474f2b700

1: rbd() [0x42258f]
2: (()+0x10340) [0x7fb47ba0a340]
3: (librbd::aio_read(librbd::ImageCtx*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > > const&, char*, ceph::buffer::list*, librbd::AioCompletion*)+0x24) [0x7fb47cbc3854]
4: (librbd::AioRequest::read_from_parent(std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > >&)+0x85) [0x7fb47cba8565]
5: (librbd::AioRead::should_complete(int)+0x2b2) [0x7fb47cba8a22]
6: (librbd::rados_req_cb(void*, void*)+0x1b) [0x7fb47cbb2c0b]
7: (librados::C_AioComplete::finish(int)+0x1d) [0x7fb47bdecf1d]
8: (Context::complete(int)+0x9) [0x7fb47cba79b9]
9: (Finisher::finisher_thread_entry()+0x1b8) [0x7fb47be7aee8]
10: (()+0x8182) [0x7fb47ba02182]
11: (clone()+0x6d) [0x7fb47abd930d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
111> 2014-07-15 16:17:31.597219 7fb47d2787c0 5 asok(0x167a800) register_command perfcounters_dump hook 0x167a760
-110> 2014-07-15 16:17:31.597240 7fb47d2787c0 5 asok(0x167a800) register_command 1 hook 0x167a760
-109> 2014-07-15 16:17:31.597244 7fb47d2787c0 5 asok(0x167a800) register_command perf dump hook 0x167a760
-108> 2014-07-15 16:17:31.597247 7fb47d2787c0 5 asok(0x167a800) register_command perfcounters_schema hook 0x167a760
-107> 2014-07-15 16:17:31.597251 7fb47d2787c0 5 asok(0x167a800) register_command 2 hook 0x167a760
-106> 2014-07-15 16:17:31.597256 7fb47d2787c0 5 asok(0x167a800) register_command perf schema hook 0x167a760
-105> 2014-07-15 16:17:31.597257 7fb47d2787c0 5 asok(0x167a800) register_command config show hook 0x167a760
-104> 2014-07-15 16:17:31.597260 7fb47d2787c0 5 asok(0x167a800) register_command config set hook 0x167a760
-103> 2014-07-15 16:17:31.597263 7fb47d2787c0 5 asok(0x167a800) register_command config get hook 0x167a760
-102> 2014-07-15 16:17:31.597265 7fb47d2787c0 5 asok(0x167a800) register_command log flush hook 0x167a760
-101> 2014-07-15 16:17:31.597267 7fb47d2787c0 5 asok(0x167a800) register_command log dump hook 0x167a760
-100> 2014-07-15 16:17:31.597270 7fb47d2787c0 5 asok(0x167a800) register_command log reopen hook 0x167a760
-99> 2014-07-15 16:17:31.598573 7fb47d2787c0 10 monclient(hunting): build_initial_monmap
-98> 2014-07-15 16:17:31.598619 7fb47d2787c0 1 librados: starting msgr at :/0
-97> 2014-07-15 16:17:31.598627 7fb47d2787c0 1 librados: starting objecter
-96> 2014-07-15 16:17:31.598658 7fb47d2787c0 1 -
:/0 messenger.start
95> 2014-07-15 16:17:31.598680 7fb47d2787c0 1 librados: setting wanted keys
-94> 2014-07-15 16:17:31.598681 7fb47d2787c0 1 librados: calling monclient init
-93> 2014-07-15 16:17:31.598682 7fb47d2787c0 10 monclient(hunting): init
-92> 2014-07-15 16:17:31.598686 7fb47d2787c0 5 adding auth protocol: cephx
-91> 2014-07-15 16:17:31.598691 7fb47d2787c0 10 monclient(hunting): auth_supported 2 method cephx
-90> 2014-07-15 16:17:31.598812 7fb47d2787c0 2 auth: KeyRing::load: loaded key file /etc/ceph/ceph.client.admin.keyring
-89> 2014-07-15 16:17:31.598875 7fb47d2787c0 10 monclient(hunting): _reopen_session rank -1 name
-88> 2014-07-15 16:17:31.598931 7fb47d2787c0 10 monclient(hunting): picked mon.noname-a con 0x1684210 addr 10.242.29.3:6789/0
-87> 2014-07-15 16:17:31.598961 7fb47d2787c0 10 monclient(hunting): _send_mon_message to mon.noname-a at 10.242.29.3:6789/0
-86> 2014-07-15 16:17:31.598965 7fb47d2787c0 1 -
:/1011606 --> 10.242.29.3:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- ?+0 0x1684660 con 0x1684210
85> 2014-07-15 16:17:31.598981 7fb47d2787c0 10 monclient(hunting): renew_subs
-84> 2014-07-15 16:17:31.598985 7fb47d2787c0 10 monclient(hunting): authenticate will time out at 2014-07-15 16:22:31.598985
-83> 2014-07-15 16:17:31.599431 7fb47d270700 1 -
10.242.31.179:0/1011606 learned my addr 10.242.31.179:0/1011606
82> 2014-07-15 16:17:31.600186 7fb478032700 1 - 10.242.31.179:0/1011606 <== mon.0 10.242.29.3:6789/0 1 ==== mon_map v1 ==== 482+0+0 (2816884619 0 0) 0x7fb468000bd0 con 0x1684210
-81> 2014-07-15 16:17:31.600230 7fb478032700 10 monclient(hunting): handle_monmap mon_map v1
-80> 2014-07-15 16:17:31.600252 7fb478032700 10 monclient(hunting): got monmap 1, mon.noname-a is now rank -1
-79> 2014-07-15 16:17:31.600262 7fb478032700 10 monclient(hunting): dump:
epoch 1
fsid 9604c907-9e5d-4fc3-9aad-bf84a1117811
last_changed 0.000000
created 0.000000
0: 10.242.29.3:6789/0 mon.ems7
1: 10.242.31.13:6789/0 mon.ems9
2: 10.242.31.179:6789/0 mon.ems8

78> 2014-07-15 16:17:31.600300 7fb478032700  1 - 10.242.31.179:0/1011606 <== mon.0 10.242.29.3:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (184034771 0 0) 0x7fb468001080 con 0x1684210
77> 2014-07-15 16:17:31.600352 7fb478032700 10 monclient(hunting): my global_id is 153549
-76> 2014-07-15 16:17:31.600557 7fb478032700 10 monclient(hunting): _send_mon_message to mon.ems7 at 10.242.29.3:6789/0
-75> 2014-07-15 16:17:31.600569 7fb478032700 1 -
10.242.31.179:0/1011606 --> 10.242.29.3:6789/0 -- auth(proto 2 32 bytes epoch 0) v1 -- ?+0 0x7fb46c0015b0 con 0x1684210
74> 2014-07-15 16:17:31.601172 7fb478032700 1 - 10.242.31.179:0/1011606 <== mon.0 10.242.29.3:6789/0 3 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 206+0+0 (2962027306 0 0) 0x7fb468001080 con 0x1684210
73> 2014-07-15 16:17:31.601322 7fb478032700 10 monclient(hunting): _send_mon_message to mon.ems7 at 10.242.29.3:6789/0
-72> 2014-07-15 16:17:31.601334 7fb478032700 1 -
10.242.31.179:0/1011606 --> 10.242.29.3:6789/0 -- auth(proto 2 165 bytes epoch 0) v1 -- ?+0 0x7fb46c001c20 con 0x1684210
71> 2014-07-15 16:17:31.602060 7fb478032700 1 - 10.242.31.179:0/1011606 <== mon.0 10.242.29.3:6789/0 4 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 393+0+0 (2649744916 0 0) 0x7fb4680008c0 con 0x1684210
70> 2014-07-15 16:17:31.602151 7fb478032700 1 monclient(hunting): found mon.ems7
-69> 2014-07-15 16:17:31.602159 7fb478032700 10 monclient: _send_mon_message to mon.ems7 at 10.242.29.3:6789/0
-68> 2014-07-15 16:17:31.602164 7fb478032700 1 -
10.242.31.179:0/1011606 --> 10.242.29.3:6789/0 -- mon_subscribe({monmap=0+}) v2 -- ?+0 0x1684ae0 con 0x1684210
67> 2014-07-15 16:17:31.602215 7fb47d2787c0 5 monclient: authenticate success, global_id 153549
-66> 2014-07-15 16:17:31.602275 7fb47d2787c0 5 asok(0x167a800) register_command objecter_requests hook 0x1685100
-65> 2014-07-15 16:17:31.602319 7fb47d2787c0 10 monclient: renew_subs
-64> 2014-07-15 16:17:31.602323 7fb47d2787c0 10 monclient: _send_mon_message to mon.ems7 at 10.242.29.3:6789/0
-63> 2014-07-15 16:17:31.602326 7fb47d2787c0 1 -
10.242.31.179:0/1011606 --> 10.242.29.3:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 0x1684730 con 0x1684210
62> 2014-07-15 16:17:31.602332 7fb47d2787c0 10 monclient: renew_subs
-61> 2014-07-15 16:17:31.602333 7fb47d2787c0 10 monclient: _send_mon_message to mon.ems7 at 10.242.29.3:6789/0
-60> 2014-07-15 16:17:31.602334 7fb47d2787c0 1 -
10.242.31.179:0/1011606 --> 10.242.29.3:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 0x1685c00 con 0x1684210
59> 2014-07-15 16:17:31.602349 7fb47d2787c0 1 librados: init done
-58> 2014-07-15 16:17:31.602613 7fb478032700 1 -
10.242.31.179:0/1011606 <== mon.0 10.242.29.3:6789/0 5 ==== mon_map v1 ==== 482+0+0 (2816884619 0 0) 0x7fb4680012f0 con 0x1684210
-57> 2014-07-15 16:17:31.602627 7fb478032700 10 monclient: handle_monmap mon_map v1
-56> 2014-07-15 16:17:31.602651 7fb478032700 10 monclient: got monmap 1, mon.ems7 is now rank 0
-55> 2014-07-15 16:17:31.602653 7fb478032700 10 monclient: dump:
epoch 1
fsid 9604c907-9e5d-4fc3-9aad-bf84a1117811
last_changed 0.000000
created 0.000000
0: 10.242.29.3:6789/0 mon.ems7
1: 10.242.31.13:6789/0 mon.ems9
2: 10.242.31.179:6789/0 mon.ems8
54> 2014-07-15 16:17:31.602700 7fb478032700  1 - 10.242.31.179:0/1011606 <== mon.0 10.242.29.3:6789/0 6 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (230939784 0 0) 0x7fb468001590 con 0x1684210
53> 2014-07-15 16:17:31.602723 7fb478032700 10 monclient: handle_subscribe_ack sent 2014-07-15 16:17:31.598983 renew after 2014-07-15 16:20:01.598983
-52> 2014-07-15 16:17:31.603219 7fb478032700 1 -
10.242.31.179:0/1011606 <== mon.0 10.242.29.3:6789/0 7 ==== osd_map(448..448 src has 1..448) v3 ==== 21405+0+0 (1972018003 0 0) 0x7fb4680008c0 con 0x1684210
51> 2014-07-15 16:17:31.603551 7fb478032700 1 - 10.242.31.179:0/1011606 <== mon.0 10.242.29.3:6789/0 8 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (230939784 0 0) 0x7fb468006770 con 0x1684210
50> 2014-07-15 16:17:31.603575 7fb478032700 10 monclient: handle_subscribe_ack sent 0.000000, ignoring
-49> 2014-07-15 16:17:31.603584 7fb478032700 1 -
10.242.31.179:0/1011606 <== mon.0 10.242.29.3:6789/0 9 ==== osd_map(448..448 src has 1..448) v3 ==== 21405+0+0 (1972018003 0 0) 0x7fb46800be40 con 0x1684210
48> 2014-07-15 16:17:31.603606 7fb478032700 1 - 10.242.31.179:0/1011606 <== mon.0 10.242.29.3:6789/0 10 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (230939784 0 0) 0x7fb46800c2e0 con 0x1684210
47> 2014-07-15 16:17:31.603625 7fb478032700 10 monclient: handle_subscribe_ack sent 0.000000, ignoring
-46> 2014-07-15 16:17:31.603803 7fb47d2787c0 1 -
10.242.31.179:0/1011606 --> 10.242.29.84:6845/6929 -- osd_op(client.153549.0:1 strm14clone2snap1clone.rbd [stat] 16.7856ad0f ack+read e448) v4 -- ?+0 0x1687830 con 0x1687560
45> 2014-07-15 16:17:31.605722 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.24 10.242.29.84:6845/6929 1 ==== osd_op_reply(1 strm14clone2snap1clone.rbd [stat] v0'0 uv0 ack = 2 ((2) No such file or directory)) v6 ==== 193+0+0 (811750268 0 0) 0x7fb464000ab0 con 0x1687560
-44> 2014-07-15 16:17:31.605894 7fb47d2787c0 1 -
10.242.31.179:0/1011606 --> 10.242.29.83:6820/8165 -- osd_op(client.153549.0:2 rbd_id.strm14clone2snap1clone [stat] 16.a59bcf49 ack+read e448) v4 -- ?+0 0x1688120 con 0x1687de0
43> 2014-07-15 16:17:31.607763 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.8 10.242.29.83:6820/8165 1 ==== osd_op_reply(2 rbd_id.strm14clone2snap1clone [stat] v0'0 uv12 ondisk = 0) v6 ==== 196+0+16 (333596885 0 1457944454) 0x7fb45c000c20 con 0x1687de0
42> 2014-07-15 16:17:31.607881 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.29.83:6820/8165 -- osd_op(client.153549.0:3 rbd_id.strm14clone2snap1clone [call rbd.get_id] 16.a59bcf49 ack+read e448) v4 -- ?+0 0x1688120 con 0x1687de0
41> 2014-07-15 16:17:31.608898 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.8 10.242.29.83:6820/8165 2 ==== osd_op_reply(3 rbd_id.strm14clone2snap1clone [call] v0'0 uv12 ondisk = 0) v6 ==== 196+0+17 (1257124260 0 94775664) 0x7fb45c000c20 con 0x1687de0
40> 2014-07-15 16:17:31.609085 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.28.255:6815/2247 -- osd_op(client.153549.0:4 rbd_header.279162eb141f2 [call rbd.get_size,call rbd.get_object_prefix] 16.8c4ebd74 ack+read e448) v4 -- ?+0 0x1688d60 con 0x1688a90
39> 2014-07-15 16:17:31.611596 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.2 10.242.28.255:6815/2247 1 ==== osd_op_reply(4 rbd_header.279162eb141f2 [call,call] v0'0 uv6 ondisk = 0) v6 ==== 233+0+35 (1172208433 0 825495175) 0x7fb454000c60 con 0x1688a90
38> 2014-07-15 16:17:31.611708 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.28.255:6815/2247 -- osd_op(client.153549.0:5 rbd_header.279162eb141f2 [call rbd.get_stripe_unit_count] 16.8c4ebd74 ack+read e448) v4 -- ?+0 0x1687050 con 0x1688a90
37> 2014-07-15 16:17:31.613484 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.2 10.242.28.255:6815/2247 2 ==== osd_op_reply(5 rbd_header.279162eb141f2 [call] v0'0 uv0 ondisk = 8 ((8) Exec format error)) v6 ==== 191+0+0 (1251890790 0 0) 0x7fb454000c60 con 0x1688a90
-36> 2014-07-15 16:17:31.613625 7fb47d2787c0 1 -
10.242.31.179:0/1011606 --> 10.242.28.255:6815/2247 -- osd_op(client.153549.0:6 rbd_header.279162eb141f2 [watch add cookie 1 ver 0] 16.8c4ebd74 ondisk+read+write e448) v4 -- ?+0 0x1689650 con 0x1688a90
35> 2014-07-15 16:17:31.616864 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.2 10.242.28.255:6815/2247 3 ==== osd_op_reply(6 rbd_header.279162eb141f2 [watch add cookie 1 ver 0] v448'32 uv6 ondisk = 0) v6 ==== 191+0+0 (908118191 0 0) 0x7fb454000c60 con 0x1688a90
34> 2014-07-15 16:17:31.617050 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.28.255:6815/2247 -- osd_op(client.153549.0:7 rbd_header.279162eb141f2 [call rbd.get_size,call rbd.get_features,call rbd.get_snapcontext,call rbd.get_parent,call lock.get_info] 16.8c4ebd74 ack+read e448) v4 -- ?+0 0x168f760 con 0x1688a90
33> 2014-07-15 16:17:31.619380 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.2 10.242.28.255:6815/2247 4 ==== osd_op_reply(7 rbd_header.279162eb141f2 [call,call,call,call,call] v0'0 uv6 ondisk = 0) v6 ==== 359+0+93 (876518903 0 3841385227) 0x7fb454000c60 con 0x1688a90
32> 2014-07-15 16:17:31.619548 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.29.83:6820/8165 -- osd_op(client.153549.0:8 rbd_header.279103d1b58ba [call rbd.get_size,call rbd.get_object_prefix] 16.7d9e46bc ack+read e448) v4 -- ?+0 0x1691180 con 0x1687de0
31> 2014-07-15 16:17:31.620765 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.8 10.242.29.83:6820/8165 3 ==== osd_op_reply(8 rbd_header.279103d1b58ba [call,call] v0'0 uv28 ondisk = 0) v6 ==== 233+0+35 (2915453187 0 1695224388) 0x7fb45c000c20 con 0x1687de0
30> 2014-07-15 16:17:31.620876 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.29.83:6820/8165 -- osd_op(client.153549.0:9 rbd_header.279103d1b58ba [call rbd.get_stripe_unit_count] 16.7d9e46bc ack+read e448) v4 -- ?+0 0x1691180 con 0x1687de0
29> 2014-07-15 16:17:31.621916 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.8 10.242.29.83:6820/8165 4 ==== osd_op_reply(9 rbd_header.279103d1b58ba [call] v0'0 uv0 ondisk = 8 ((8) Exec format error)) v6 ==== 191+0+0 (2188395542 0 0) 0x7fb45c000c20 con 0x1687de0
-28> 2014-07-15 16:17:31.622087 7fb47d2787c0 1 -
10.242.31.179:0/1011606 --> 10.242.29.83:6820/8165 -- osd_op(client.153549.0:10 rbd_header.279103d1b58ba [call rbd.get_size,call rbd.get_features,call rbd.get_snapcontext,call rbd.get_parent,call lock.get_info] 16.7d9e46bc ack+read e448) v4 -- ?+0 0x1691180 con 0x1687de0
27> 2014-07-15 16:17:31.623854 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.8 10.242.29.83:6820/8165 5 ==== osd_op_reply(10 rbd_header.279103d1b58ba [call,call,call,call,call] v0'0 uv28 ondisk = 0) v6 ==== 359+0+88 (1493311113 0 2847581766) 0x7fb45c0008c0 con 0x1687de0
26> 2014-07-15 16:17:31.624005 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.29.83:6820/8165 -- osd_op(client.153549.0:11 rbd_header.279103d1b58ba [call rbd.get_snapshot_name,call rbd.get_size,call rbd.get_features,call rbd.get_parent,call rbd.get_protection_status] 16.7d9e46bc ack+read e448) v4 -- ?+0 0x1691180 con 0x1687de0
25> 2014-07-15 16:17:31.625612 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.8 10.242.29.83:6820/8165 6 ==== osd_op_reply(11 rbd_header.279103d1b58ba [call,call,call,call,call] v0'0 uv28 ondisk = 0) v6 ==== 359+0+88 (1178272788 0 1830059259) 0x7fb45c0008c0 con 0x1687de0
24> 2014-07-15 16:17:31.625810 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.28.255:6815/2247 -- osd_op(client.153549.0:12 rbd_data.279162eb141f2.0000000000000000 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] 16.881aa2b4 ack+ondisk+write e448) v4 -- ?+0 0x1691180 con 0x1688a90
23> 2014-07-15 16:17:31.625917 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.28.255:6810/1820 -- osd_op(client.153549.0:13 rbd_data.279162eb141f2.0000000000000001 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] 16.7f566f9a ack+ondisk+write e448) v4 -- ?+0 0x168a4c0 con 0x168a1e0
22> 2014-07-15 16:17:31.626004 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.29.83:6800/4885 -- osd_op(client.153549.0:14 rbd_data.279162eb141f2.0000000000000002 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] 16.9251c8c4 ack+ondisk+write e448) v4 -- ?+0 0x168ba10 con 0x168b660
21> 2014-07-15 16:17:31.626115 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.29.83:6840/12346 -- osd_op(client.153549.0:15 rbd_data.279162eb141f2.0000000000000003 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] 16.6283cd02 ack+ondisk+write e448) v4 -- ?+0 0x168cfd0 con 0x168cc20
20> 2014-07-15 16:17:31.626197 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.29.83:6845/13529 -- osd_op(client.153549.0:16 rbd_data.279162eb141f2.0000000000000004 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] 16.23175751 ack+ondisk+write e448) v4 -- ?+0 0x168e9b0 con 0x168e630
19> 2014-07-15 16:17:31.626275 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.29.83:6815/7260 -- osd_op(client.153549.0:17 rbd_data.279162eb141f2.0000000000000005 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] 16.34b4dede ack+ondisk+write e448) v4 -- ?+0 0x16930c0 con 0x1692d10
18> 2014-07-15 16:17:31.626355 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.29.84:6850/8234 -- osd_op(client.153549.0:18 rbd_data.279162eb141f2.0000000000000006 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] 16.d17d22e4 ack+ondisk+write e448) v4 -- ?+0 0x1694a90 con 0x16946e0
17> 2014-07-15 16:17:31.626397 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.29.83:6840/12346 -- osd_op(client.153549.0:19 rbd_data.279162eb141f2.0000000000000007 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] 16.e685456d ack+ondisk+write e448) v4 -- ?+0 0x1695e20 con 0x168cc20
16> 2014-07-15 16:17:31.626517 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.28.255:6815/2247 -- osd_op(client.153549.0:20 rbd_data.279162eb141f2.0000000000000008 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] 16.15546d1c ack+ondisk+write e448) v4 -- ?+0 0x1697190 con 0x1688a90
15> 2014-07-15 16:17:31.626705 7fb47d2787c0 1 - 10.242.31.179:0/1011606 --> 10.242.29.84:6800/29782 -- osd_op(client.153549.0:21 rbd_data.279162eb141f2.0000000000000009 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] 16.f0d881a1 ack+ondisk+write e448) v4 -- ?+0 0x1697260 con 0x1698770
14> 2014-07-15 16:17:31.626781 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.2 10.242.28.255:6815/2247 5 ==== osd_op_reply(12 rbd_data.279162eb141f2.0000000000000000 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] v0'0 uv16 ondisk = 2 ((2) No such file or directory)) v6 ==== 290+0+0 (3681756982 0 0) 0x7fb454000c60 con 0x1688a90
-13> 2014-07-15 16:17:31.627337 7fb474f2b700 1 -
10.242.31.179:0/1011606 --> 10.242.28.255:6815/2247 -- osd_op(client.153549.0:22 rbd_data.279103d1b58ba.0000000000000000@4 [sparse-read 0~16777216] 16.355df41c ack+read+localize_reads e448) v4 -- ?+0 0x7fb42c001e40 con 0x1688a90
12> 2014-07-15 16:17:31.627546 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.2 10.242.28.255:6815/2247 6 ==== osd_op_reply(20 rbd_data.279162eb141f2.0000000000000008 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] v0'0 uv35 ondisk = 2 ((2) No such file or directory)) v6 ==== 290+0+0 (2854821092 0 0) 0x7fb4540008c0 con 0x1688a90
-11> 2014-07-15 16:17:31.627670 7fb474f2b700 1 -
10.242.31.179:0/1011606 --> 10.242.29.83:6815/7260 -- osd_op(client.153549.0:23 rbd_data.279103d1b58ba.0000000000000008@4 [sparse-read 0~16777216] 16.1adeaf20 ack+read+localize_reads e448) v4 -- ?+0 0x7fb42c002860 con 0x1692d10
10> 2014-07-15 16:17:31.628102 7fb478032700 1 - 10.242.31.179:0/1011606 <== osd.2 10.242.28.255:6815/2247 7 ==== osd_op_reply(22 rbd_data.279103d1b58ba.0000000000000000 [sparse-read 0~16777216] v0'0 uv0 ack = 2 ((2) No such file or directory)) v6 ==== 206+0+0 (3765106756 0 0) 0x7fb4540008c0 con 0x1688a90
-9> 2014-07-15 16:17:31.628218 7fb478032700 1 -
10.242.31.179:0/1011606 <== osd.13 10.242.29.83:6845/13529 1 ==== osd_op_reply(16 rbd_data.279162eb141f2.0000000000000004 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] v0'0 uv23 ondisk = 2 ((2) No such file or directory)) v6 ==== 290+0+0 (676648486 0 0) 0x7fb424000b10 con 0x168e630
-8> 2014-07-15 16:17:31.628275 7fb478032700 1 -
10.242.31.179:0/1011606 <== osd.4 10.242.29.83:6800/4885 1 ==== osd_op_reply(14 rbd_data.279162eb141f2.0000000000000002 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] v0'0 uv23 ondisk = 2 ((2) No such file or directory)) v6 ==== 290+0+0 (2171706066 0 0) 0x7fb420000b10 con 0x168b660
-7> 2014-07-15 16:17:31.628321 7fb478032700 1 -
10.242.31.179:0/1011606 <== osd.12 10.242.29.83:6840/12346 1 ==== osd_op_reply(15 rbd_data.279162eb141f2.0000000000000003 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] v0'0 uv20 ondisk = 2 ((2) No such file or directory)) v6 ==== 290+0+0 (232833059 0 0) 0x7fb428000b10 con 0x168cc20
-6> 2014-07-15 16:17:31.628361 7fb478032700 1 -
10.242.31.179:0/1011606 <== osd.7 10.242.29.83:6815/7260 1 ==== osd_op_reply(17 rbd_data.279162eb141f2.0000000000000005 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] v0'0 uv21 ondisk = 2 ((2) No such file or directory)) v6 ==== 290+0+0 (537071779 0 0) 0x7fb418000b10 con 0x1692d10
-5> 2014-07-15 16:17:31.628383 7fb478032700 1 -
10.242.31.179:0/1011606 <== osd.3 10.242.28.255:6810/1820 1 ==== osd_op_reply(13 rbd_data.279162eb141f2.0000000000000001 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] v0'0 uv17 ondisk = 2 ((2) No such file or directory)) v6 ==== 290+0+0 (2355978896 0 0) 0x7fb41c000b10 con 0x168a1e0
-4> 2014-07-15 16:17:31.628415 7fb478032700 1 -
10.242.31.179:0/1011606 <== osd.12 10.242.29.83:6840/12346 2 ==== osd_op_reply(19 rbd_data.279162eb141f2.0000000000000007 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] v0'0 uv9 ondisk = 2 ((2) No such file or directory)) v6 ==== 290+0+0 (2996431385 0 0) 0x7fb4280013f0 con 0x168cc20
-3> 2014-07-15 16:17:31.628439 7fb478032700 1 -
10.242.31.179:0/1011606 <== osd.7 10.242.29.83:6815/7260 2 ==== osd_op_reply(23 rbd_data.279103d1b58ba.0000000000000008 [sparse-read 0~16777216] v0'0 uv0 ack = 2 ((2) No such file or directory)) v6 ==== 206+0+0 (993770396 0 0) 0x7fb4180013a0 con 0x1692d10
-2> 2014-07-15 16:17:31.628522 7fb478032700 1 -
10.242.31.179:0/1011606 <== osd.25 10.242.29.84:6850/8234 1 ==== osd_op_reply(18 rbd_data.279162eb141f2.0000000000000006 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] v0'0 uv15 ondisk = 2 ((2) No such file or directory)) v6 ==== 290+0+0 (643047444 0 0) 0x7fb410000b10 con 0x16946e0
-1> 2014-07-15 16:17:31.628554 7fb478032700 1 -
10.242.31.179:0/1011606 <== osd.15 10.242.29.84:6800/29782 1 ==== osd_op_reply(21 rbd_data.279162eb141f2.0000000000000009 [stat,set-alloc-hint object_size 16777216 write_size 16777216,write 0~0] v0'0 uv25 ondisk = -2 ((2) No such file or directory)) v6 ==== 290+0+0 (1269595953 0 0) 0x7fb414000b10 con 0x1698770
0> 2014-07-15 16:17:31.629207 7fb474f2b700 -1 ** Caught signal (Segmentation fault) *
in thread 7fb474f2b700

1: rbd() [0x42258f]
2: (()+0x10340) [0x7fb47ba0a340]
3: (librbd::aio_read(librbd::ImageCtx*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > > const&, char*, ceph::buffer::list*, librbd::AioCompletion*)+0x24) [0x7fb47cbc3854]
4: (librbd::AioRequest::read_from_parent(std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > >&)+0x85) [0x7fb47cba8565]
5: (librbd::AioRead::should_complete(int)+0x2b2) [0x7fb47cba8a22]
6: (librbd::rados_req_cb(void*, void*)+0x1b) [0x7fb47cbb2c0b]
7: (librados::C_AioComplete::finish(int)+0x1d) [0x7fb47bdecf1d]
8: (Context::complete(int)+0x9) [0x7fb47cba79b9]
9: (Finisher::finisher_thread_entry()+0x1b8) [0x7fb47be7aee8]
10: (()+0x8182) [0x7fb47ba02182]
11: (clone()+0x6d) [0x7fb47abd930d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
0/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 keyvaluestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
2/-2 (syslog threshold)
99/99 (stderr threshold)
max_recent 500
max_new 1000
log_file
--
end dump of recent events ---

Associated revisions

Revision 2545e80d (diff)
Added by Josh Durgin about 6 years ago

librbd: fix crash using clone of flattened image

The crash occurs due to ImageCtx->parent->parent being uninitialized,
since the inital open_parent() -> open_image(parent) ->
ictx_refresh(parent) occurs before ImageCtx->parent->snap_id is set,
so refresh_parent() is not called to open an ImageCtx for the parent
of the parent. This leaves the ImageCtx->parent->parent NULL, but the
rest of ImageCtx->parent updated to point at the correct parent snapshot.

Setting the parent->snap_id earlier has some unintended side effects
currently, so for now just call refresh_parent() during
open_parent(). This is the easily backportable version of the
fix. Further patches can clean up this whole initialization process.

Fixes: #8845
Backport: firefly, dumpling
Signed-off-by: Josh Durgin <>

Revision 3f020443 (diff)
Added by Josh Durgin about 6 years ago

librbd: fix crash using clone of flattened image

The crash occurs due to ImageCtx->parent->parent being uninitialized,
since the inital open_parent() -> open_image(parent) ->
ictx_refresh(parent) occurs before ImageCtx->parent->snap_id is set,
so refresh_parent() is not called to open an ImageCtx for the parent
of the parent. This leaves the ImageCtx->parent->parent NULL, but the
rest of ImageCtx->parent updated to point at the correct parent snapshot.

Setting the parent->snap_id earlier has some unintended side effects
currently, so for now just call refresh_parent() during
open_parent(). This is the easily backportable version of the
fix. Further patches can clean up this whole initialization process.

Fixes: #8845
Backport: firefly, dumpling
Signed-off-by: Josh Durgin <>
(cherry picked from commit 2545e80d274b23b6715f4d8b1f4c6b96182996fb)

Revision a67f9152 (diff)
Added by Josh Durgin about 6 years ago

librbd: fix crash using clone of flattened image

The crash occurs due to ImageCtx->parent->parent being uninitialized,
since the inital open_parent() -> open_image(parent) ->
ictx_refresh(parent) occurs before ImageCtx->parent->snap_id is set,
so refresh_parent() is not called to open an ImageCtx for the parent
of the parent. This leaves the ImageCtx->parent->parent NULL, but the
rest of ImageCtx->parent updated to point at the correct parent snapshot.

Setting the parent->snap_id earlier has some unintended side effects
currently, so for now just call refresh_parent() during
open_parent(). This is the easily backportable version of the
fix. Further patches can clean up this whole initialization process.

Fixes: #8845
Backport: firefly, dumpling
Signed-off-by: Josh Durgin <>
(cherry picked from commit 2545e80d274b23b6715f4d8b1f4c6b96182996fb)

History

#1 Updated by Ilya Dryomov about 6 years ago

  • Project changed from Ceph to rbd
  • Status changed from New to 12

#2 Updated by Josh Durgin about 6 years ago

  • Status changed from 12 to 7
  • Assignee set to Josh Durgin
  • Priority changed from Normal to Urgent

#3 Updated by Josh Durgin about 6 years ago

  • Backport set to firefly, dumpling

#4 Updated by Ilya Dryomov about 6 years ago

fsx is now able to catch this one.

#5 Updated by Josh Durgin about 6 years ago

  • Status changed from 7 to Fix Under Review

#6 Updated by Sage Weil about 6 years ago

  • Status changed from Fix Under Review to Pending Backport

#7 Updated by Josh Durgin about 6 years ago

  • Status changed from Pending Backport to Resolved

Also available in: Atom PDF