Revision 2 (Li Wang, 06/10/2015 09:13 AM) → Revision 3/16 (Li Wang, 06/10/2015 09:17 AM)
h1. Rados - metadata-only journal mode

*Summary*

This blueprint proposes a metadata-only journal mode.

An important use of Ceph is integration with cloud computing platforms to provide storage for VM images and instances. In this scenario, qemu maps RBD images as virtual block devices (disks) for a VM, and the guest operating system formats the disks and creates file systems on them. Here RBD mostly resembles a 'dumb' disk: it is enough for RBD to implement exactly the semantics of a disk controller driver. Typically, the disk controller itself does not provide a transactional mechanism to ensure that a write operation completes atomically. Instead, it is up to the file system that manages the disk to adopt techniques such as journaling to prevent inconsistency, if necessary. Consequently, RBD does not need to provide a mechanism to make a data write atomic, since the guest file system will keep its writes to RBD consistent by using journaling if needed.

Another scenario is cache tiering. Because the cache pool already provides durability, dirty objects that are written back theoretically need not go through the journaling process of the base pool, since the flusher can replay the write operation.

These scenarios motivate a new journal mode, metadata-only journal mode, which resembles the data=ordered journal mode in ext4. When this mode is on, object data are written directly to their final location; once the data write has finished, the metadata are written to the journal, and then the write returns to the caller. This avoids the double-write penalty that write-ahead logging imposes on object data, and could greatly improve RBD and cache tiering performance.
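
The write ordering described in the summary can be illustrated with a small toy model. This is only a sketch under stated assumptions: @ToyStore@, @payload_bytes@ and the object name are hypothetical names invented for illustration, the "journal" and "final location" are in-memory Python structures, and nothing here reflects the actual Ceph object store code or its API. It only counts how many payload bytes each mode writes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical in-memory model; not the Ceph object store or its API."""
    metadata_only: bool
    objects: dict = field(default_factory=dict)  # final location of object data
    journal: list = field(default_factory=list)  # append-only journal entries
    data_bytes_written: int = 0                  # payload bytes written anywhere

    def write(self, name: str, data: bytes) -> None:
        meta = {"object": name, "length": len(data)}
        if self.metadata_only:
            # 1. Data goes straight to its final location.
            self.objects[name] = data
            self.data_bytes_written += len(data)
            # 2. Only after the data write finishes is the metadata journaled.
            self.journal.append(("meta", meta))
        else:
            # Write-ahead logging: data and metadata go to the journal first...
            self.journal.append(("full", meta, data))
            self.data_bytes_written += len(data)
            # ...and the data is written a second time when it is applied.
            self.objects[name] = data
            self.data_bytes_written += len(data)
        # 3. In both modes the write returns to the caller at this point.


def payload_bytes(metadata_only: bool, payload: bytes) -> int:
    """Return the payload bytes written for one object write in the given mode."""
    store = ToyStore(metadata_only=metadata_only)
    store.write("rbd_data.0", payload)  # object name is made up for the example
    return store.data_bytes_written


if __name__ == "__main__":
    block = b"x" * 4096
    print("write-ahead logging:", payload_bytes(False, block))
    print("metadata-only:", payload_bytes(True, block))
```

In this model a 4096-byte write costs 8192 payload bytes with full write-ahead logging and 4096 bytes in metadata-only mode, which is the double-write penalty the proposal aims to remove. The model deliberately ignores crash recovery, where the two modes differ in what can be replayed from the journal.
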
*Owners*

* Li Wang (liwang@ubuntukylin.com)
* Name (Affiliation)
* Name

*Interested Parties*

If you are interested in contributing to this blueprint, or want to be a "speaker" during the Summit session, list your name here.

* Name (Affiliation)
* Name (Affiliation)
* Name

*Current Status*

Please describe the current status of Ceph as it relates to this blueprint. Is there something that this replaces? Are there current features that are related?

*Detailed Description*

This is the big one! Please provide a detailed description for the proposed change. Where appropriate, include your architectural approach, a list of systems involved, important consequences, and issues that are still unresolved.

*Work items*

This section should contain a list of work tasks created by this blueprint. Please include engineering tasks as well as related build/release and documentation work. If this blueprint requires cleanup of deprecated features, please list those tasks as well.

*Coding tasks*

* Task 1
* Task 2
* Task 3

*Build / release tasks*

* Task 1
* Task 2
* Task 3

*Documentation tasks*

* Task 1
* Task 2
* Task 3

*Deprecation tasks*

* Task 1
* Task 2
* Task 3