Bug #8628
closed
Bad ceph_osd_op.extent union access in ReplicatedPG::do_osd_ops
Added by Adam Crume almost 10 years ago.
Updated almost 10 years ago.
Description
ReplicatedPG::do_osd_ops reads and modifies ceph_osd_op.extent regardless of the operation, and therefore regardless of whether that member of the union is valid. This could result in watch.flag, clonerange.src_offset, or copy_from.flags in the ceph_osd_op being spuriously set to 0.
To replicate (in theory, untested):
1. Create a ceph_osd_op, set extent.truncate_size to -1, and extent.truncate_seq to 1
2. Re-initialize the ceph_osd_op for a watch operation, and set watch.flag to 0xff
3. Run the op on ReplicatedPG
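The steps above can be sketched outside Ceph with a simplified model. Everything below is an assumption made for illustration: `FakeOsdOp`, `ExtentView`, `WatchView`, the opcode constants, and `munge_unguarded` are hypothetical stand-ins, not the real `ceph_osd_op` layout or the actual body of `ReplicatedPG::do_osd_ops`. The sketch only shows how an unconditional write through the extent view can zero a flag stored in the watch view.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified views. The real ceph_osd_op has more union
// members and a different (packed) layout.
struct ExtentView { uint64_t offset, length, truncate_size; uint32_t truncate_seq; };
struct WatchView  { uint64_t cookie, ver; uint8_t flag; };

struct FakeOsdOp {
  uint16_t op;
  union {              // both views occupy the same bytes
    ExtentView extent;
    WatchView  watch;  // watch.flag overlaps the first byte of extent.truncate_size
  };
};

constexpr uint16_t OP_WRITE = 1;  // hypothetical opcodes
constexpr uint16_t OP_WATCH = 2;

// Assumed model of the problem: the extent view is read and rewritten
// without checking whether the op actually uses it.
// Note: reading a union member other than the one last written is formally
// undefined in ISO C++. GCC documents it as supported, and this sketch
// relies on that behaviour.
void munge_unguarded(FakeOsdOp& o) {
  if (o.extent.truncate_seq == 1 && o.extent.truncate_size == UINT64_MAX) {
    o.extent.truncate_size = 0;
    o.extent.truncate_seq = 0;
  }
}

// Steps 1 and 2: fill in the extent view, then reuse the same op as a
// watch without clearing the stale extent bytes.
FakeOsdOp make_stale_watch_op() {
  FakeOsdOp o{};
  o.op = OP_WRITE;
  o.extent.truncate_size = UINT64_MAX;  // -1
  o.extent.truncate_seq = 1;
  o.op = OP_WATCH;
  o.watch.flag = 0xff;
  return o;
}

// Step 3: run the unguarded code path and observe the flag.
void demo() {
  FakeOsdOp o = make_stale_watch_op();
  assert(o.watch.flag == 0xff);
  munge_unguarded(o);
  assert(o.watch.flag == 0);  // the watch flag was spuriously cleared
}
```

Calling `demo()` from a `main` function runs the three steps. In this model the flag is lost only because the stale extent bytes happen to match the condition, which is consistent with the report calling the reproduction theoretical.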
Would it be possible to create these conditions using the API? It cannot be unit tested, unfortunately, but it may be possible to set up the right context at a higher level and demonstrate the problem. The goal is to show that it can be reproduced in a minimal way, despite the lack of a unit test environment.
I don't think it can be done reliably through the API. It might be possible by sending a specially crafted message to the OSD, but I'm not familiar enough with the code base to set that up.
Did you run into a problem related to this?
No, I was adding tracepoints to the function and saw the bug.
Now I understand better. It will require someone more familiar with the code than I am to figure this one out.
- Priority changed from Normal to Urgent
This was fixed in 58212b1.
- Status changed from New to Rejected
ceph_osd_op_uses_extent(op.op) guards the references to the extent view of the union.
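A minimal sketch of that kind of guard, under stated assumptions: `FakeOsdOp`, the opcode constants, and `uses_extent` are hypothetical stand-ins. `uses_extent` is only modelled on the idea of `ceph_osd_op_uses_extent`, and the real predicate covers the actual set of extent-based opcodes. This is not the code from the fix commit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified views; not the real ceph_osd_op layout.
struct ExtentView { uint64_t offset, length, truncate_size; uint32_t truncate_seq; };
struct WatchView  { uint64_t cookie, ver; uint8_t flag; };

struct FakeOsdOp {
  uint16_t op;
  union {
    ExtentView extent;
    WatchView  watch;
  };
};

constexpr uint16_t OP_WRITE = 1;  // hypothetical opcodes
constexpr uint16_t OP_WATCH = 2;

// Hypothetical predicate: true only for ops whose payload is the extent view.
inline bool uses_extent(uint16_t op) { return op == OP_WRITE; }

// The extent view is touched only when the opcode says it is valid.
void munge_guarded(FakeOsdOp& o) {
  if (!uses_extent(o.op)) {
    return;  // leave watch, clonerange, etc. untouched
  }
  if (o.extent.truncate_seq == 1 && o.extent.truncate_size == UINT64_MAX) {
    o.extent.truncate_size = 0;
    o.extent.truncate_seq = 0;
  }
}

void demo() {
  // A watch op carrying stale extent bytes keeps its flag.
  FakeOsdOp w{};
  w.extent.truncate_size = UINT64_MAX;
  w.extent.truncate_seq = 1;
  w.op = OP_WATCH;
  w.watch.flag = 0xff;
  munge_guarded(w);
  assert(w.watch.flag == 0xff);

  // A genuine extent op is still normalised.
  FakeOsdOp e{};
  e.op = OP_WRITE;
  e.extent.truncate_size = UINT64_MAX;
  e.extent.truncate_seq = 1;
  munge_guarded(e);
  assert(e.extent.truncate_size == 0 && e.extent.truncate_seq == 0);
}
```

Checking the opcode before touching the union keeps the decision in one place, so other views such as watch or clonerange are never read or written through the extent member.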
- Status changed from Rejected to Resolved