Throttle.cc: 194: FAILED assert(c >= 0) due to invalid ceph_osd_op union
An assertion failure that started happening on a specific RBD volume in a Ceph
cluster we're using in production led me to an obscure bug that has been in
Ceph from 0.22 through the current tip of the master branch.
The notify op gets created in this code path,
which sets op.watch.ver to the last version of the object. In the case
of the problematic volume that triggered this issue for us, the
version is 3197560675. This goes into the union in ceph_osd_op, and
happens to occupy the same position that the extent length would.
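
To make the overlap concrete, here is a minimal sketch of the aliasing. The struct names, the field order, and the helper function are simplified assumptions for illustration, not the real ceph_osd_op definition; the only thing carried over from the report is that the watch version and the extent length share the same bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-ins for two interpretations of the op union (assumed layout).
struct sketch_extent { uint64_t offset; uint64_t length; };
struct sketch_watch  { uint64_t cookie; uint64_t ver; };

struct sketch_osd_op {
    uint16_t op;
    uint32_t flags;
    union {
        sketch_extent extent;  // how read/write ops interpret the bytes
        sketch_watch  watch;   // how watch/notify ops interpret the bytes
    };
};

// In this sketch, watch.ver and extent.length sit at the same offset.
static_assert(offsetof(sketch_watch, ver) == offsetof(sketch_extent, length),
              "ver and length alias each other in this sketch");

// Hypothetical helper: fill in the op as a notify would, then read the same
// bytes back through the extent view. Both structs are layout-compatible,
// so reading through the other member is well-defined here.
uint64_t length_seen_after_notify(uint64_t object_version) {
    sketch_osd_op op{};
    op.watch.cookie = 0;
    op.watch.ver = object_version;
    return op.extent.length;
}
```

Calling length_seen_after_notify(3197560675) returns 3197560675, so any code that reads the extent length on a notify op sees the object version instead.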
When the flow of execution gets to calc_op_budget, the definition of
CEPH_OSD_OP_NOTIFY has the CEPH_OSD_OP_MODE_READ and CEPH_OSD_OP_TYPE_DATA bits
set, so it ends up in the code path that looks at the extent length.
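
The following sketch shows how such a value could turn into a negative budget. It is not the actual calc_op_budget code: the flag values, the opcode, and the function names are hypothetical, and it assumes the budget ends up in a signed 32-bit integer. Under that assumption, 3197560675 is larger than the maximum signed 32-bit value (2147483647), so it wraps to a negative number, which would be consistent with assert(c >= 0) failing in Throttle.cc.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical flag values for illustration only; not Ceph's real constants.
constexpr uint16_t SKETCH_MODE_READ = 0x1000;
constexpr uint16_t SKETCH_TYPE_DATA = 0x0200;
constexpr uint16_t SKETCH_OP_NOTIFY = SKETCH_MODE_READ | SKETCH_TYPE_DATA | 11;

// Reinterpret the low 32 bits as a signed value, mimicking what happens when
// a large unsigned length is stored in a 32-bit signed budget (assumption).
int32_t narrow_to_int32(uint64_t value) {
    uint32_t low = static_cast<uint32_t>(value);
    int32_t out;
    std::memcpy(&out, &low, sizeof out);
    return out;
}

// Sketch of the flag-based check: any op with the read and data bits set is
// treated as if the union held an extent, whether or not it really does.
int32_t sketch_calc_op_budget(uint16_t opcode, uint64_t union_second_word) {
    if ((opcode & SKETCH_MODE_READ) && (opcode & SKETCH_TYPE_DATA)) {
        return narrow_to_int32(union_second_word);
    }
    return 0;
}
```

With the version from the problematic volume, sketch_calc_op_budget(SKETCH_OP_NOTIFY, 3197560675) yields -1097406621 in this sketch.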
It looks as though the solution to this is to add a more stringent check to
calc_op_budget, so that only operation types that use the extent
interpretation of the union at the time the operation is created end
up in the code path that checks extent length, similar to commit 58212b1.
I'm going to submit a PR with that change shortly.
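
As a rough illustration of the kind of check described above, the sketch below counts the extent length only for opcodes that are explicitly known to use the extent interpretation. The enum, the list of opcodes, and the function names are hypothetical; this is not the content of the PR or of commit 58212b1.

```cpp
#include <cstdint>

// Hypothetical opcode set for illustration; not Ceph's real opcodes.
enum class SketchOp : uint16_t { Read, SparseRead, Watch, Notify };

// Explicit allow-list of ops that fill in the extent view of the union.
bool uses_extent(SketchOp op) {
    switch (op) {
    case SketchOp::Read:
    case SketchOp::SparseRead:
        return true;
    default:
        return false;
    }
}

// Only consult the extent length when the op is known to have set it;
// watch and notify ops contribute nothing to the read budget here.
uint64_t sketch_read_budget(SketchOp op, uint64_t union_second_word) {
    if (!uses_extent(op)) {
        return 0;
    }
    return union_second_word;
}
```

In this sketch a notify op carrying version 3197560675 contributes a budget of 0, while a 4096-byte read still contributes 4096.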
There's also a small amount of discussion about this on the mailing list.
#1 Updated by Patrick Donnelly 8 months ago
- Project changed from Ceph to rbd
- Category deleted (
- Status changed from New to Need Review
- Assignee set to Simon Ruggier
- Target version set to v14.0.0
- Backport changed from Mimic, Luminous, Jewel to mimic,luminous
- Pull request ID set to 25976
- Affected Versions deleted (
v0.22, v0.22.1, v0.22.2, v0.22.3, v0.23, v0.23.1, v0.23.2, v0.24, v0.24.1, v0.24.2, v0.24.3, v0.25, v0.25.1, v0.25.2, v0.25.3, v0.26, v0.26.1, v0.27, v0.27.1, v0.28, v0.29, v0.30, v0.31, v0.32, v0.33, v0.34, v0.35, v0.36, v0.37, v0.38, v0.39, v0.40, v0.41, v0.42, v0.43, v0.44, v0.45, v0.46, v0.47, v0.48, v0.49, v0.50, v0.51, v0.52a, v0.53a, v0.53b, v0.53c, v0.54a, v0.54b, v0.55a, v0.55b, v0.55c, v0.55d, v0.56, v0.57a, v0.57b, v0.57c, v0.58, v0.59, v0.60, v0.61 - Cuttlefish, v0.62a, v0.62b, v0.63, v0.64, v0.65, v0.66, v0.67 - Dumpling, v0.67rc, v0.67rc - continued, v0.68, v0.68 - continued, v0.69, v0.70, v0.71, v0.72 Emperor, v0.73, v0.74, v0.75, v0.76a, v0.76b, v0.77, 0.78, 0.79, 0.80rc, 0.80, v0.81, 0.82, 0.83, 0.83 cont., 0.84, 0.84 cont., 0.85, 0.85 cont., 0.86, 0.88, 0.89, 0.90, v.91, v.actually90, v.actually91, v0.92, v0.93 - Last Hammer Sprint, v0.94, v0.95, v9.0.2, v9.0.3, v9.0.4, v9.0.5, v9.0.6, v9.0.7, v9.0.8, v10.0.4, v0.80.10, v0.80.11, v0.80.12, v0.94.10, v0.94.11, v0.94.2, v0.94.3, v0.94.4, v0.94.5, v0.94.6, v0.94.7, v0.94.8, v0.94.9, v10.0.0, v10.1.1, v10.2.0, v10.2.1, v10.2.10, v10.2.11, v10.2.12, v10.2.2, v10.2.3, v10.2.4, v10.2.5, v10.2.6, v10.2.7, v10.2.8, v10.2.9, v11.1.0, v11.2.0, v11.2.1, v11.2.2, v12.0.0, v12.1.0, v12.2.0, v12.2.1, v12.2.10, v12.2.11, v12.2.2, v12.2.3, v12.2.4, v12.2.5, v12.2.6, v12.2.7, v12.2.8, v12.2.9, v13.0.0, v13.2.0, v13.2.1, v13.2.2, v13.2.3, v13.2.4, v13.2.5, v14.0.0, v15.0.0, v9.1.1, v9.2.1, v9.2.2)