Feature #2611
mon: Single-Paxos
Status: Closed (100% done)
Description
The ceph-mon is (roughly) composed of a Monitor class, responsible for all things monitor-ish, and several monitor services, each responsible for handling a specific component (e.g., the OSDMonitor deals with things related to the OSDs, while the AuthMonitor deals with authentication on the cluster).
Each of these services (actually extending the PaxosService class, but hereafter simply called a 'service') has its own instance of a Paxos class and access to a shared MonitorStore.
Although this works, there are several issues with it:
- Each service ties its map versions to the Paxos versions, creating a dependency that simply should not exist.
- There are as many Paxos instances as running services, but we only propose new values one at a time (a lock on the Monitor ensures this). This means that having one Paxos instance or a thousand of them would yield the same result.
- Adding more services becomes a hazardous and error-prone task.
- It's impossible to batch proposals from multiple services in order to reduce proposal time or network activity.
Our main objective is to make the Monitor use one single Paxos instance, shared across all the services.
Furthermore, the use of the MonitorStore carries a whole set of problems of its own:
- Its interface is far from intuitive.
- It's a file-based store with an unorthodox way of guaranteeing consistency in case of failure.
- It does not provide a simple interface guaranteeing that the store will move from one consistent state to another when we apply more than one operation (a problem that would not exist if we had transactions like those in the ObjectStore or the KeyValueDBStore).
- And finally, we already have interfaces and classes in Ceph that provide storage mechanisms meeting the Monitor's needs, so there is no need to keep the MonitorStore around.
For these reasons, the MonitorStore will be replaced by a key/value store, backed by the LevelDBStore class already available in Ceph.