Bug #8168
closedosd: rbd_test.test_diff_iterate fails with a cache pool
0%
Description
With a cache pool set up, test_rbd.test_diff_iterate, which removes the two snapshots it creates before removing the image, hits an error removing the image saying it still has snapshots.
From ubuntu@teuthology:/a/teuthology-2014-04-20_23:00:03-rbd-master-testing-basic-plana/206206/teuthology.log
2014-04-21T10:32:33.599 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: ====================================================================== 2014-04-21T10:32:33.599 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: ERROR: test_rbd.TestImage.test_diff_iterate 2014-04-21T10:32:33.599 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: ---------------------------------------------------------------------- 2014-04-21T10:32:33.599 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: Traceback (most recent call last): 2014-04-21T10:32:33.599 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: File "/usr/lib/python2.7/dist-packages/nose/case.py", line 384, in tearDown 2014-04-21T10:32:33.599 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: try_run(self.inst, ('teardown', 'tearDown')) 2014-04-21T10:32:33.599 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: File "/usr/lib/python2.7/dist-packages/nose/util.py", line 478, in try_run 2014-04-21T10:32:33.599 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: return func() 2014-04-21T10:32:33.600 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: File "/home/ubuntu/cephtest/mnt.0/client.0/tmp/test_rbd.py", line 230, in tearDown 2014-04-21T10:32:33.600 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: remove_image() 2014-04-21T10:32:33.600 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: File "/home/ubuntu/cephtest/mnt.0/client.0/tmp/test_rbd.py", line 48, in remove_image 2014-04-21T10:32:33.600 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: RBD().remove(ioctx, IMG_NAME) 2014-04-21T10:32:33.600 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: File "/usr/lib/python2.7/dist-packages/rbd.py", line 300, in remove 2014-04-21T10:32:33.600 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: raise make_ex(ret, 'error removing image') 2014-04-21T10:32:33.600 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]: ImageHasSnapshots: error removing image 2014-04-21T10:32:33.600 INFO:teuthology.task.workunit.client.0.err:[10.214.133.34]:
This is likely related to class methods, since those are used for reading and writing the list of existing rbd snapshots. This error does not occur on non-cache pools.
Updated by Josh Durgin about 10 years ago
The actual error is that the diff is nothing instead of a 512 byte extent being discarded. The error removing the image is just a failure during cleanup:
====================================================================== FAIL: test_rbd.TestImage.test_diff_iterate ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.6/dist-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/joshd/ceph/src/test/pybind/test_rbd.py", line 563, in test_diff_iterate 'snap1', [(0, 512, False)]) File "/home/joshd/ceph/src/test/pybind/test_rbd.py", line 573, in check_diff eq(extents, expected) AssertionError: [] != [(0, 512, False)]
Updated by Josh Durgin about 10 years ago
So the scenario that is failing (still fails on master) is that after the objects are deleted via rbd_discard(), diff_iterate() returns [] since list_snaps returns -ENOENT on each object.
Updated by Josh Durgin about 10 years ago
Here's the osd/filestore log of one of the failing list-snaps calls:
2014-04-22 12:57:21.716311 7fa3be144710 1 -- 192.168.106.222:6801/343 <== client.4122 192.168.106.222:0/1001076 37 ==== osd_op(client.4122.0:39 rb.0.101a.74b0dc51.000000 000001@snapdir [list-snaps] 3.6ef44c3e ack+read e15) v4 ==== 182+0+0 (679743903 0 0) 0x4988480 con 0x48682c0 2014-04-22 12:57:21.716336 7fa3be144710 10 osd.0 15 do_waiters -- start 2014-04-22 12:57:21.716339 7fa3be144710 10 osd.0 15 do_waiters -- finish 2014-04-22 12:57:21.716341 7fa3be144710 20 osd.0 15 _dispatch 0x4988480 osd_op(client.4122.0:39 rb.0.101a.74b0dc51.000000000001@snapdir [list-snaps] 3.6ef44c3e ack+read e 15) v4 2014-04-22 12:57:21.716393 7fa3be144710 15 osd.0 15 require_same_or_newer_map 15 (i am 15) 0x4988480 2014-04-22 12:57:21.716397 7fa3be144710 20 osd.0 15 _share_map_incoming client.4122 192.168.106.222:0/1001076 15 2014-04-22 12:57:21.716468 7fa3be144710 15 osd.0 15 enqueue_op 0x466bb40 prio 63 cost 0 latency 0.000281 osd_op(client.4122.0:39 rb.0.101a.74b0dc51.000000000001@snapdir [ list-snaps] 3.6ef44c3e ack+read e15) v4 2014-04-22 12:57:21.716487 7fa3be144710 10 osd.0 15 do_waiters -- start 2014-04-22 12:57:21.716490 7fa3be144710 10 osd.0 15 do_waiters -- finish 2014-04-22 12:57:21.716563 7fa3b7136710 10 osd.0 15 dequeue_op 0x466bb40 prio 63 cost 0 latency 0.000376 osd_op(client.4122.0:39 rb.0.101a.74b0dc51.000000000001@snapdir [ list-snaps] 3.6ef44c3e ack+read e15) v4 pg pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 active+degraded snap trimq=[1~1]] 2014-04-22 12:57:21.716608 7fa3b7136710 20 osd.0 pg_epoch: 15 pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 a ctive+degraded snaptrimq=[1~1]] op_has_sufficient_caps pool=3 (cache ) owner=0 need_read_cap=1 need_write_cap=0 need_class_read_cap=0 need_class_write_cap=0 -> yes 2014-04-22 12:57:21.716625 7fa3b7136710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 a ctive+degraded snaptrimq=[1~1]] handle_message: 0x466bb40 2014-04-22 12:57:21.716638 7fa3b7136710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 a ctive+degraded snaptrimq=[1~1]] do_op osd_op(client.4122.0:39 rb.0.101a.74b0dc51.000000000001@snapdir [list-snaps] 3.6ef44c3e ack+read e15) v4 may_read -> read-ordered fl ags ack+read 2014-04-22 12:57:21.716659 7fa3b7136710 15 filestore(/home/joshd/ceph/src//dev/osd0) getattr 3.2_head/6ef44c3e/rb.0.101a.74b0dc51.000000000001/head//3 '_' 2014-04-22 12:57:21.716693 7fa3b7136710 10 filestore(/home/joshd/ceph/src//dev/osd0) getattr 3.2_head/6ef44c3e/rb.0.101a.74b0dc51.000000000001/head//3 '_' = 250 2014-04-22 12:57:21.716719 7fa3b7136710 15 filestore(/home/joshd/ceph/src//dev/osd0) getattr 3.2_head/6ef44c3e/rb.0.101a.74b0dc51.000000000001/head//3 'snapset' 2014-04-22 12:57:21.716742 7fa3b7136710 10 filestore(/home/joshd/ceph/src//dev/osd0) getattr 3.2_head/6ef44c3e/rb.0.101a.74b0dc51.000000000001/head//3 'snapset' = 31 2014-04-22 12:57:21.716756 7fa3b7136710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 a ctive+degraded snaptrimq=[1~1]] populate_obc_watchers 6ef44c3e/rb.0.101a.74b0dc51.000000000001/head//3 2014-04-22 12:57:21.716772 7fa3b7136710 20 osd.0 pg_epoch: 15 pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 a ctive+degraded snaptrimq=[1~1]] ReplicatedPG::check_blacklisted_obc_watchers for obc 6ef44c3e/rb.0.101a.74b0dc51.000000000001/head//3 2014-04-22 12:57:21.716784 7fa3b7136710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 a ctive+degraded snaptrimq=[1~1]] get_object_context: creating obc from disk: 0x49fd580 2014-04-22 12:57:21.716795 7fa3b7136710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 a ctive+degraded snaptrimq=[1~1]] get_object_context: 0x49fd580 6ef44c3e/rb.0.101a.74b0dc51.000000000001/head//3 rwstate(none n=0 w=0) oi: 6ef44c3e/rb.0.101a.74b0dc51.00000 0000001/head//3(13'2 osd.0.0:63 wrlock_by=unknown.0.0:0 whiteout s 0 uv0) ssc: 0x46d49a0 snapset: 0=[]:[]+head 2014-04-22 12:57:21.716812 7fa3b7136710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 a ctive+degraded snaptrimq=[1~1]] find_object_context 6ef44c3e/rb.0.101a.74b0dc51.000000000001/snapdir//3 @snapdir oi=6ef44c3e/rb.0.101a.74b0dc51.000000000001/head//3(13'2 osd.0.0:63 wrlock_by=unknown.0.0:0 whiteout s 0 uv0) 2014-04-22 12:57:21.716830 7fa3b7136710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 a ctive+degraded snaptrimq=[1~1]] agent_choose_mode: num_objects: 2 num_bytes: 83 num_objects_dirty: 0 num_objects_omap: 0 num_dirty: 0 num_user_objects: 1 pool.info.targe t_max_bytes: 0 pool.info.target_max_objects: 250 2014-04-22 12:57:21.716843 7fa3b7136710 20 osd.0 pg_epoch: 15 pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 a ctive+degraded snaptrimq=[1~1]] agent_choose_mode dirty 0 full 0.016129 2014-04-22 12:57:21.716859 7fa3b7136710 25 osd.0 pg_epoch: 15 pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 active+degraded snaptrimq=[1~1]] maybe_handle_cache 6ef44c3e/rb.0.101a.74b0dc51.000000000001/head//3(13'2 osd.0.0:63 wrlock_by=unknown.0.0:0 whiteout s 0 uv0) exists missing_oid 0//0//-1 2014-04-22 12:57:21.716875 7fa3b7136710 25 osd.0 pg_epoch: 15 pg[3.2( v 13'2 (0'0,13'2] local-les=7 n=2 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 13'1 mlcod 13'1 active+degraded snaptrimq=[1~1]] do_op oi 6ef44c3e/rb.0.101a.74b0dc51.000000000001/head//3(13'2 osd.0.0:63 wrlock_by=unknown.0.0:0 whiteout s 0 uv0) 2014-04-22 12:57:21.716901 7fa3b7136710 1 -- 192.168.106.222:6801/343 --> 192.168.106.222:0/1001076 -- osd_op_reply(39 rb.0.101a.74b0dc51.000000000001 [list-snaps] v0'0 uv0 ack = -2 ((2) No such file or directory)) v6 -- ?+0 0x475bc80 con 0x48682c0
Updated by Josh Durgin about 10 years ago
wip-8168 didn't fix it:
2014-04-22 14:38:38.520534 7f91f8e93710 1 -- 192.168.106.222:6800/16854 <== client.4115 192.168.106.222:0/1017409 37 ==== osd_op(client.4115.0:39 rb.0.1013.74b0dc51.000000000001@snapdir [list-snaps] 3.3fcf6cb2 ack+read e15) v4 ==== 182+0+0 (2263708367 0 0) 0x30beb40 con 0x37632c0 2014-04-22 14:38:38.520558 7f91f8e93710 10 osd.0 15 do_waiters -- start 2014-04-22 14:38:38.520562 7f91f8e93710 10 osd.0 15 do_waiters -- finish 2014-04-22 14:38:38.520563 7f91f8e93710 20 osd.0 15 _dispatch 0x30beb40 osd_op(client.4115.0:39 rb.0.1013.74b0dc51.000000000001@snapdir [list-snaps] 3.3fcf6cb2 ack+read e15) v4 2014-04-22 14:38:38.520615 7f91f8e93710 15 osd.0 15 require_same_or_newer_map 15 (i am 15) 0x30beb40 2014-04-22 14:38:38.520620 7f91f8e93710 20 osd.0 15 _share_map_incoming client.4115 192.168.106.222:0/1017409 15 2014-04-22 14:38:38.520690 7f91f8e93710 15 osd.0 15 enqueue_op 0x37cd3c0 prio 63 cost 0 latency 0.000256 osd_op(client.4115.0:39 rb.0.1013.74b0dc51.000000000001@snapdir [list-snaps] 3.3fcf6cb2 ack+read e15) v4 2014-04-22 14:38:38.520709 7f91f8e93710 10 osd.0 15 do_waiters -- start 2014-04-22 14:38:38.520711 7f91f8e93710 10 osd.0 15 do_waiters -- finish 2014-04-22 14:38:38.520796 7f91f1e85710 10 osd.0 15 dequeue_op 0x37cd3c0 prio 63 cost 0 latency 0.000362 osd_op(client.4115.0:39 rb.0.1013.74b0dc51.000000000001@snapdir [list-snaps] 3.3fcf6cb2 ack+read e15) v4 pg pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] 2014-04-22 14:38:38.520845 7f91f1e85710 20 osd.0 pg_epoch: 15 pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] op_has_sufficient_caps pool=3 (cache ) owner=0 need_read_cap=1 need_write_cap=0 need_class_read_cap=0 need_class_write_cap=0 -> yes 2014-04-22 14:38:38.520870 7f91f1e85710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] handle_message: 0x37cd3c0 2014-04-22 14:38:38.520883 7f91f1e85710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] do_op osd_op(client.4115.0:39 rb.0.1013.74b0dc51.000000000001@snapdir [list-snaps] 3.3fcf6cb2 ack+read e15) v4 may_read -> read-ordered flags ack+read 2014-04-22 14:38:38.520905 7f91f1e85710 15 filestore(/home/joshd/ceph/src//dev/osd0) getattr 3.2_head/3fcf6cb2/rb.0.1013.74b0dc51.000000000001/head//3 '_' 2014-04-22 14:38:38.520937 7f91f1e85710 10 filestore(/home/joshd/ceph/src//dev/osd0) getattr 3.2_head/3fcf6cb2/rb.0.1013.74b0dc51.000000000001/head//3 '_' = 250 2014-04-22 14:38:38.520963 7f91f1e85710 15 filestore(/home/joshd/ceph/src//dev/osd0) getattr 3.2_head/3fcf6cb2/rb.0.1013.74b0dc51.000000000001/head//3 'snapset' 2014-04-22 14:38:38.520985 7f91f1e85710 10 filestore(/home/joshd/ceph/src//dev/osd0) getattr 3.2_head/3fcf6cb2/rb.0.1013.74b0dc51.000000000001/head//3 'snapset' = 31 2014-04-22 14:38:38.520999 7f91f1e85710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] populate_obc_watchers 3fcf6cb2/rb.0.1013.74b0dc51.000000000001/head//3 2014-04-22 14:38:38.521015 7f91f1e85710 20 osd.0 pg_epoch: 15 pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] ReplicatedPG::check_blacklisted_obc_watchers for obc 3fcf6cb2/rb.0.1013.74b0dc51.000000000001/head//3 2014-04-22 14:38:38.521028 7f91f1e85710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] get_object_context: creating obc from disk: 0x37af080 2014-04-22 14:38:38.521041 7f91f1e85710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] get_object_context: 0x37af080 3fcf6cb2/rb.0.1013.74b0dc51.000000000001/head//3 rwstate(none n=0 w=0) oi: 3fcf6cb2/rb.0.1013.74b0dc51.000000000001/head//3(13'1 osd.0.0:60 wrlock_by=unknown.0.0:0 whiteout s 0 uv0) ssc: 0x37a57e0 snapset: 0=[]:[]+head 2014-04-22 14:38:38.521686 7f91f1e85710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] find_object_context 3fcf6cb2/rb.0.1013.74b0dc51.000000000001/snapdir//3 @snapdir oi=3fcf6cb2/rb.0.1013.74b0dc51.000000000001/head//3(13'1 osd.0.0:60 wrlock_by=unknown.0.0:0 whiteout s 0 uv0) 2014-04-22 14:38:38.521704 7f91f1e85710 10 osd.0 pg_epoch: 15 pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] agent_choose_mode: num_objects: 1 num_bytes: 0 num_objects_dirty: 0 num_objects_omap: 0 num_dirty: 0 num_user_objects: 1 pool.info.target_max_bytes: 0 pool.info.target_max_objects: 250 2014-04-22 14:38:38.521718 7f91f1e85710 20 osd.0 pg_epoch: 15 pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] agent_choose_mode dirty 0 full 0.016129 2014-04-22 14:38:38.521734 7f91f1e85710 25 osd.0 pg_epoch: 15 pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] maybe_handle_cache 3fcf6cb2/rb.0.1013.74b0dc51.000000000001/head//3(13'1 osd.0.0:60 wrlock_by=unknown.0.0:0 whiteout s 0 uv0) exists missing_oid 0//0//-1 2014-04-22 14:38:38.521750 7f91f1e85710 25 osd.0 pg_epoch: 15 pg[3.2( v 13'1 (0'0,13'1] local-les=7 n=1 ec=6 les/c 7/7 6/6/6) [0] r=0 lpr=6 crt=0'0 lcod 0'0 mlcod 0'0 active+degraded snaptrimq=[1~1]] do_op oi 3fcf6cb2/rb.0.1013.74b0dc51.000000000001/head//3(13'1 osd.0.0:60 wrlock_by=unknown.0.0:0 whiteout s 0 uv0) 2014-04-22 14:38:38.521779 7f91f1e85710 1 -- 192.168.106.222:6800/16854 --> 192.168.106.222:0/1017409 -- osd_op_reply(39 rb.0.1013.74b0dc51.000000000001 [list-snaps] v0'0 uv0 ack = -2 ((2) No such file or directory)) v6 -- ?+0 0x3887780 con 0x37632c0