Bug #57536
crimson: OSDMapGate is updated in wrong order
Description
The out-of-order `OSDMapGate::current` updates
==============================================
Local replication
-----------------
```
[rzarzynski@o06 build]$ MDS=0 MGR=1 OSD=3 MON=2 ../src/vstart.sh n --crimson --seastore --nolockdep --nodaemon --redirect-output --without-dashboard --no-restart -o "debug_objclass=20" -o "debug_osd=20" -o "debug_none=20"
...
[rzarzynski@o06 build]$ CRIMSON_COMPAT="true" ceph_test_rados_api_aio --debug-osd=20 --log-to-stderr=true
Running main() from gmock_main.cc
[==========] Running 40 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 24 tests from LibRadosAio
[ RUN      ] LibRadosAio.TooBig
[       OK ] LibRadosAio.TooBig (2866 ms)
[ RUN      ] LibRadosAio.SimpleWrite
^Bp^C
```
- Logs
```
DEBUG 2022-09-14 09:07:31,469 [shard 0] osd - pg_advance_map(id=414, detail=PGAdvanceMap(pg=3.11 from=18 to=19 do_init)): start
...
DEBUG 2022-09-14 09:07:32,394 [shard 0] osd - pg_advance_map(id=506, detail=PGAdvanceMap(pg=3.11 from=19 to=20)): start
...
DEBUG 2022-09-14 09:07:33,399 [shard 0] osd - pg_advance_map(id=602, detail=PGAdvanceMap(pg=3.11 from=20 to=21)): start
...
DEBUG 2022-09-14 09:07:33,412 [shard 0] osd - pg_advance_map(id=602, detail=PGAdvanceMap(pg=3.11 from=20 to=21)): complete
...
DEBUG 2022-09-14 09:07:33,898 [shard 0] osd - pg_advance_map(id=414, detail=PGAdvanceMap(pg=3.11 from=18 to=19 do_init)): complete
...
DEBUG 2022-09-14 09:07:33,901 [shard 0] osd - pg_advance_map(id=506, detail=PGAdvanceMap(pg=3.11 from=19 to=20)): complete
DEBUG 2022-09-14 09:07:33,901 [shard 0] osd - got_map(20), current(21)
WARN 2022-09-14 09:07:33,901 [shard 0] osd - got_map(20) <= current(21), ignoring
ERROR 2022-09-14 09:07:33,901 [shard 0] none - ../src/crimson/osd/osdmap_gate.cc:60 : In function 'void crimson::osd::OSDMapGate<OSDMapGateTypeV>::got_map(epoch_t) [with crimson::osd::OSDMapGateType OSDMapGateTypeV = crimson::osd::OSDMapGateType::OSD; epoch_t = unsigned int]', abort(%s)
```
- Backtrace
```
0# gsignal in /lib64/libc.so.6
1# abort in /lib64/libc.so.6
2# ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/util/log.hh:106
3# crimson::osd::OSDMapGate<(crimson::osd::OSDMapGateType)0>::got_map(unsigned int) at /opt/rh/gcc-toolset-11/root/usr/include/c++/11/bits/basic_string.h:672
4# operator() at /home/rzarzynski/ceph1/build/../src/crimson/osd/shard_services.cc:641
5# ZN7seastar20noncopyable_functionIFNS_6futureIvEEvEE17direct_vtable_forIZNS2_4thenIZN7crimson3osd9CoreState20broadcast_map_to_pgsERNS8_14PGShardManagerERNS8_13ShardServicesEjEUlvE0_S2_EET0_OT_EUlDpOT_E_E4callEPKS4 at /home/rzarzynski/ceph1/build/../src/seastar/include/seastar/util/noncopyable_function.hh:125
```
The `(crimson::osd::OSDMapGateType)0` in frame 3 means the global, per-OSD instance of the gate is affected.
Hypothesis #1
-------------
```cpp
seastar::future<> CoreState::broadcast_map_to_pgs(
  PGShardManager &shard_manager,
  ShardServices &shard_services,
  epoch_t epoch)
{
  auto &pgs = pg_map.get_pgs();
  return seastar::parallel_for_each(
    pgs.begin(), pgs.end(),
    [=, &shard_manager, &shard_services](auto& pg) {
      return shard_services.start_operation<PGAdvanceMap>(
        shard_manager, pg.second, pg.second->get_osdmap_epoch(), epoch,
        PeeringCtx{}, false).second;
    }).then([epoch, this] {
      osdmap_gate.got_map(epoch);
      return seastar::make_ready_future();
    });
}
```
The code above assumes the `PGAdvanceMap` operations complete in order. However, this isn't the case: just after activating the most recent map, `PGAdvanceMap::start()` calls `handle.exit()`, leaving the pipeline stage before the operation as a whole finishes:
```cpp
seastar::future<> PGAdvanceMap::start()
{
  // ...
  return enter_stage<>(
    pg->peering_request_pg_pipeline.process
  ).then([this] {
    // ...
    return seastar::do_for_each(
      boost::make_counting_iterator(from + 1),
      boost::make_counting_iterator(to + 1),
      [this](epoch_t next_epoch) {
        return shard_manager.get_map(next_epoch).then(
          [this] (cached_map_t&& next_map) {
            pg->handle_advance_map(next_map, rctx);
          });
      }).then([this] {
        pg->handle_activate_map(rctx);
        handle.exit();
        // ...
      }).then_unpack([this] {
        return shard_manager.get_shard_services().send_pg_temp();
      });
  }).then([this, ref=std::move(ref)] {
    logger().debug("{}: complete", *this);
  });
}
```