Bug #11793
closed
test: rados.py does not do cache_(evict|flush|try_flush) since it leaves out the cache_
Description
Enabling it causes:
2015-05-27T12:23:17.122 INFO:tasks.rados.rados.0.plana93.stdout:update_object_version oid 489 v 0 (ObjNum 99675960 snap 0 seq_num 99675960) clean dne
2015-05-27T12:23:17.123 INFO:tasks.rados.rados.0.plana93.stderr:1028: Error: oid 291 read returned error code -2
2015-05-27T12:23:17.155 INFO:tasks.rados.rados.0.plana93.stderr:./test/osd/RadosModel.h: In function 'virtual void ReadOp::_finish(TestOp::CallbackInfo*)' thread 7f22ceffd700 time 2015-05-27 12:23:17.120327
2015-05-27T12:23:17.156 INFO:tasks.rados.rados.0.plana93.stderr:./test/osd/RadosModel.h: 1094: FAILED assert(0)
2015-05-27T12:23:17.178 INFO:tasks.rados.rados.0.plana93.stderr: ceph version 9.0.0-1069-gbd99bce (bd99bcedc7d29268a593abef1afbaf62d7e13fd4)
2015-05-27T12:23:17.179 INFO:tasks.rados.rados.0.plana93.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x4e124b]
2015-05-27T12:23:17.179 INFO:tasks.rados.rados.0.plana93.stderr: 2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4d11dc]
2015-05-27T12:23:17.179 INFO:tasks.rados.rados.0.plana93.stderr: 3: (()+0xb57ed) [0x7f22e117b7ed]
2015-05-27T12:23:17.179 INFO:tasks.rados.rados.0.plana93.stderr: 4: (()+0x906f9) [0x7f22e11566f9]
2015-05-27T12:23:17.180 INFO:tasks.rados.rados.0.plana93.stderr: 5: (()+0x155a18) [0x7f22e121ba18]
2015-05-27T12:23:17.180 INFO:tasks.rados.rados.0.plana93.stderr: 6: (()+0x8182) [0x7f22e0c9a182]
2015-05-27T12:23:17.180 INFO:tasks.rados.rados.0.plana93.stderr: 7: (clone()+0x6d) [0x7f22df61efbd]
2015-05-27T12:23:17.180 INFO:tasks.rados.rados.0.plana93.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2015-05-27T12:23:17.198 INFO:tasks.rados.rados.0.plana93.stderr:terminate called after throwing an instance of 'ceph::FailedAssertion'
ubuntu@teuthology:/a/samuelj-2015-05-27_12:06:32-rados-wip-sam-testing-distro-basic-multi/911824
Updated by Zhiqiang Wang almost 9 years ago
This is what I see from the log for oid plana9327095-291:
1. At first, pool 2 is the cache pool of pool 1. For plana9327095-291, osd.5 is the primary cache osd and osd.4 is the primary base osd.
2. Some ops are done on plana9327095-291
3. A cache-evict op evicts plana9327095-291 from cache pool
4. The cache relationship is removed
5. A cache-flush op on plana9327095-291 is sent to osd.4. However, the pool of this op is still 2. The op can't find an object with oid plana9327095-291 and pool id 2, so it returns.
6. A later read op on plana9327095-291 is sent to osd.4. Again, the pool is 2. The read returns ENOENT.
I think the bug here is that, after the cache relationship is removed, the pool id in the object locator of later ops is not updated to the base pool; it is still the cache pool.
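The sequence above can be sketched with a toy model. This is not Ceph code: `ToyCluster`, `ToyClient`, and the object name are hypothetical, and the model ignores promotion, proxying, and OSD placement. It only assumes what the analysis describes: a client that keeps using the cache pool id after the tier relationship is gone.

```python
import errno
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ToyCluster:
    """Hypothetical stand-in for a cluster; not a Ceph API."""
    # pool id -> names of the objects stored in that pool
    pools: Dict[int, Set[str]] = field(default_factory=dict)
    # base pool id -> cache pool id (the cache relationship)
    overlay: Dict[int, int] = field(default_factory=dict)

    def read(self, pool: int, oid: str) -> int:
        # 0 on success, -ENOENT if the object is not in that pool
        return 0 if oid in self.pools.get(pool, set()) else -errno.ENOENT


@dataclass
class ToyClient:
    """Hypothetical client that remembers the pool it last targeted."""
    cluster: ToyCluster
    base_pool: int
    cached_target: Optional[int] = None

    def target_pool_stale(self) -> int:
        # Assumed buggy behaviour: resolve once, never recompute.
        if self.cached_target is None:
            self.cached_target = self.cluster.overlay.get(
                self.base_pool, self.base_pool)
        return self.cached_target

    def target_pool_fresh(self) -> int:
        # Re-resolve against the current cache relationship on every op.
        return self.cluster.overlay.get(self.base_pool, self.base_pool)


oid = "obj-291"  # hypothetical object name
# Steps 1-3: pool 2 caches pool 1; the object has already been evicted
# from the cache pool, so only the base pool holds it.
cluster = ToyCluster(pools={1: {oid}, 2: set()}, overlay={1: 2})
client = ToyClient(cluster, base_pool=1)
client.target_pool_stale()  # the client resolves its target to pool 2

# Step 4: the cache relationship is removed.
del cluster.overlay[1]

# Step 6: a read that still targets pool 2 fails; a re-resolved one succeeds.
stale_result = cluster.read(client.target_pool_stale(), oid)
fresh_result = cluster.read(client.target_pool_fresh(), oid)
print(stale_result, fresh_result)
```

In this model the stale read returns `-ENOENT`, matching the `error code -2` in the log, while the read that re-resolves its target pool finds the object in the base pool.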
Updated by Greg Farnum about 7 years ago
- Subject changed from rados.py does not do cache_(evict|flush|try_flush) since it leaves out the cache_ to test: rados.py does not do cache_(evict|flush|try_flush) since it leaves out the cache_
Updated by Sage Weil almost 7 years ago
- Status changed from New to 12
- Priority changed from Urgent to Immediate
Verify whether this is still the case.
Updated by Sage Weil almost 7 years ago
- Priority changed from Immediate to Urgent
Updated by Sage Weil almost 7 years ago
- Status changed from 12 to In Progress
- Assignee set to Sage Weil
Updated by Sage Weil almost 7 years ago
- Status changed from In Progress to Fix Under Review
http://pulpito.ceph.com/sage-2017-07-09_19:43:05-rados:thrash-wip-11793-distro-basic-smithi/
100%, except for one unrelated bluestore crash.
Updated by Sage Weil almost 7 years ago
- Status changed from Fix Under Review to Resolved