Bug #36686
closed
osd: pg log hard limit can cause crash during upgrade
Description
During an upgrade from an earlier version, a primary running the new code can send a trim_to value to a replica that triggers an assert in the old code. This happens when the pg log is trimmed beyond info.last_complete during backfill.
This can be triggered by the luminous-x:stress-split suite with a small pg log to force backfill. In this run, upgrading from 12.2.5 (no hard limit) to mimic (hard limit present) with osd_min_pg_log_entries = 1 and osd_max_pg_log_entries = 2 to force more backfilling, we hit this assert:
/builddir/build/BUILD/ceph-12.2.5/src/osd/PGLog.cc: 170: FAILED assert(trim_to <= info.last_complete)
ceph version 12.2.5-42.0.TEST.bz1636267.el7cp (559ef7e0c955a21506efea93cfccafcf153e74b7) luminous (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f64c7427db0]
2: (PGLog::trim(eversion_t, pg_info_t&)+0x26f) [0x7f64c6fc4b8f]
3: (PG::append_log(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, eversion_t, eversion_t, ObjectStore::Transaction&, bool)+0x36d) [0x7f64c6f4f23d]
4: (PrimaryLogPG::log_operation(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, boost::optional<pg_hit_set_history_t> const&, eversion_t const&, eversion_t const&, bool, ObjectStore::Transaction&)+0x74) [0x7f64c7066344]
5: (ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&, Context*)+0x31a) [0x7f64c718eaea]
6: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327) [0x7f64c719f6e7]
7: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x7f64c709fdb0]
8: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x59c) [0x7f64c700b2cc]
9: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x7f64c6e8e9c9]
10: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x7f64c7111ea7]
11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce) [0x7f64c6ebdace]
12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x7f64c742d8c9]
13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f64c742f860]
14: (()+0x7dc5) [0x7f64c41dddc5]
15: (clone()+0x6d) [0x7f64c32d276d]
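For readers following along, the failure can be sketched as a toy model. This is not Ceph code: the class and function names below are hypothetical, eversion_t is approximated as an (epoch, version) tuple, and the primary's trimming rule is a deliberately simplified assumption. Only the replica-side check mirrors the assert shown in the trace.

```python
from dataclasses import dataclass
from typing import Tuple

# Stand-in for eversion_t: (epoch, version); tuples compare lexicographically.
Eversion = Tuple[int, int]


@dataclass
class ReplicaInfo:
    """Hypothetical, minimal stand-in for the replica's pg info."""
    last_complete: Eversion
    last_update: Eversion


def old_replica_trim(info: ReplicaInfo, trim_to: Eversion) -> None:
    """Mimics the check in the old code: trimming past last_complete aborts."""
    if not trim_to <= info.last_complete:
        raise AssertionError("FAILED assert(trim_to <= info.last_complete)")


def new_primary_trim_to(log_head: Eversion, max_entries: int) -> Eversion:
    """Assumed hard-limit rule: keep at most max_entries log entries,
    without looking at how far each replica has caught up."""
    epoch, version = log_head
    return (epoch, max(version - max_entries, 0))


# A replica that is still being backfilled lags far behind the log head.
backfilling = ReplicaInfo(last_complete=(10, 5), last_update=(10, 40))
trim_to = new_primary_trim_to(log_head=(10, 40), max_entries=2)

try:
    old_replica_trim(backfilling, trim_to)
    crashed = False
except AssertionError:
    crashed = True
```

With osd_max_pg_log_entries = 2 as in the test run, the modelled primary asks for a trim point far ahead of the lagging replica's last_complete, so the old-style check fails. A replica that is fully caught up passes the same check.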
We can avoid this by adding an osdmap flag to enable the hard limit, dependent on OSDs reporting a pg log hard limit feature bit, similar to how we handled recovery deletes.
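A minimal sketch of that gating idea, under stated assumptions: the feature bit value, the class, and the method names are all hypothetical and do not come from the Ceph source. It only illustrates "the flag can be set once every OSD reports the feature, and the hard limit is used only when the flag is set".

```python
from typing import Iterable

# Hypothetical bit position; the real feature bit is not named in this ticket.
FEATURE_PGLOG_HARDLIMIT = 1 << 0


def all_osds_support_hard_limit(osd_features: Iterable[int]) -> bool:
    """True only if there is at least one OSD and every OSD reports the bit."""
    feats = list(osd_features)
    return bool(feats) and all(f & FEATURE_PGLOG_HARDLIMIT for f in feats)


class ToyOSDMap:
    """Toy osdmap holding a single cluster-wide flag (illustrative only)."""

    def __init__(self) -> None:
        self.hard_limit_flag = False

    def try_enable_hard_limit(self, osd_features: Iterable[int]) -> bool:
        # Refuse to set the flag while any OSD still runs the old code.
        if all_osds_support_hard_limit(osd_features):
            self.hard_limit_flag = True
        return self.hard_limit_flag


def primary_uses_hard_limit(osdmap: ToyOSDMap) -> bool:
    """The primary consults the map flag, not its own code version."""
    return osdmap.hard_limit_flag


mixed = ToyOSDMap()
mixed_result = mixed.try_enable_hard_limit([FEATURE_PGLOG_HARDLIMIT, 0])

upgraded = ToyOSDMap()
upgraded_result = upgraded.try_enable_hard_limit(
    [FEATURE_PGLOG_HARDLIMIT, FEATURE_PGLOG_HARDLIMIT]
)
```

The point of routing the decision through the map is that a new primary in a partially upgraded cluster keeps the old trimming behaviour until the flag appears, so old replicas never see a trim_to they cannot handle.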
A workaround for users is to upgrade and restart all OSDs to a version with the pg hard limit, or only upgrade when all PGs are active+clean.
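The second workaround amounts to a pre-upgrade check. The helper below is a hypothetical sketch that takes PG state strings however they were collected; it does not call any Ceph command. It is intentionally strict and assumes that anything other than exactly active+clean (including scrubbing states) should block the upgrade.

```python
from typing import Iterable


def safe_to_upgrade(pg_states: Iterable[str]) -> bool:
    """Return True only when every PG state is exactly 'active+clean'.

    Conservative assumption: states with extra tokens such as
    'active+clean+scrubbing' are treated as not ready.
    """
    states = list(pg_states)
    if not states:
        # No data is not evidence of a healthy cluster.
        return False
    return all(set(s.split("+")) == {"active", "clean"} for s in states)


healthy = safe_to_upgrade(["active+clean", "active+clean"])
backfilling = safe_to_upgrade(["active+clean", "active+remapped+backfilling"])
```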
Updated by Neha Ojha over 5 years ago
The immediate fix is to revert this for luminous before 12.2.9: https://github.com/ceph/ceph/pull/24903
Updated by Nathan Cutler over 5 years ago
Neha, 12.2.9 has already been cut, so we'll need to expedite 12.2.10 to push the revert out to users.
Updated by Nathan Cutler over 5 years ago
Also, is this bug reproducible in master and mimic as well? If not, the Backport field should probably be modified...
Never mind, I should have read the description before opening my mouth.
Updated by Yuri Weinstein over 5 years ago
Updated by Nathan Cutler over 5 years ago
So, the luminous revert was merged. Neha, will there be a mimic revert as well? Since the pg hard limit patches are present in 13.2.2, will we need to revert them before we release 13.2.3?
Updated by Neha Ojha over 5 years ago
Quoting my reply to ceph-devel for reference:
"Nathan, I don't think we want to revert it for 13.2.2.
This is because the pg log hard limit feature currently doesn't seem
to work well in a partial upgrade, recovery/backfill scenario. So,
even if we do revert it in 13.2.3, this still leaves us a chance of
going into a split scenario, where some osds in the field, running
13.2.2 (with hard limit code) and others on 13.2.3 (without the code),
may encounter http://tracker.ceph.com/issues/36686.
Therefore, users who have successfully upgraded to 13.2.2 shouldn't be
at any risk.
For users trying to upgrade to a version >= 13.2.2, I am going to make
a note of this issue and add the suggested workaround in Pending
Release Notes for mimic."
Updated by Yuri Weinstein over 5 years ago
Updated by Nathan Cutler over 5 years ago
- Has duplicate Bug #36706: Ceph ECBackend: assert fail at PGLog::trim added
Updated by Alexander Morozov over 5 years ago
Nathan Cutler wrote:
Neha, 12.2.9 has already been cut, so we'll need to expedite 12.2.10 to push the revert out to users.
Any ETA for the fix?
Updated by Nathan Cutler over 5 years ago
Alexander Morozov wrote:
Any ETA for the fix?
Did you mean ETA for 12.2.10? Luminous v12.2.10 was released on November 27, 2018: https://ceph.com/releases/v12-2-10-luminous-released/
Updated by Alexander Morozov over 5 years ago
Nathan Cutler wrote:
Alexander Morozov wrote:
Any ETA for the fix?
Did you mean ETA for 12.2.10? Luminous v12.2.10 was released on November 27, 2018: https://ceph.com/releases/v12-2-10-luminous-released/
But the release notes say "If you already successfully upgraded to v12.2.9, you should not upgrade to v12.2.10, but rather wait for a release in which http://tracker.ceph.com/issues/36686 is addressed." (http://docs.ceph.com/docs/master/releases/luminous/)
So I am asking about the 12.2.11 or 12.3.0 release dates.
Updated by Oliver Freyermuth over 5 years ago
Let me extend that question with:
What's the clean upgrade path for those on 12.2.8 or 12.2.10 (and wanting to upgrade to mimic) or those on 13.2.1?
The release notes of 13.2.2 state "we are working on a clean upgrade path for this feature".
- Is this planned for 12.2.11 (so 12.2.8 and 12.2.10 users can upgrade to that and then jump to >=13.2.2)?
- Is this planned for 13.2.3 (so anybody from any previous release can cleanly upgrade to that)?
Updated by Neha Ojha over 5 years ago
Oliver Freyermuth wrote:
Let me extend that question with:
Hence the questions:
What's the clean upgrade path for those on 12.2.8 or 12.2.10 (and wanting to upgrade to mimic) or those on 13.2.1?
The release notes of 13.2.2 state "we are working on a clean upgrade path for this feature".
- Is this planned for 12.2.11 (so 12.2.8 and 12.2.10 users can upgrade to that and then jump to >=13.2.2)?
Yes, it is planned for 12.2.11.
- Is this planned for 13.2.3 (so anybody from any previous release can cleanly upgrade to that)?
It should make it into 13.2.4, since we are almost ready with 13.2.3.
Updated by Neha Ojha over 5 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 25816
Updated by Neha Ojha over 5 years ago
- Related to Bug #37803: osd/PGLog.cc: 170: FAILED assert(trim_to <= info.last_complete) added
Updated by Neha Ojha over 5 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Neha Ojha over 5 years ago
Nathan, can you please help generate backport tracker tickets for this?
Updated by Nathan Cutler over 5 years ago
- Copied to Backport #37902: mimic: osd: pg log hard limit can cause crash during upgrade added
Updated by Nathan Cutler over 5 years ago
- Copied to Backport #37903: luminous: osd: pg log hard limit can cause crash during upgrade added
Updated by Nathan Cutler about 5 years ago
- Status changed from Pending Backport to Resolved