Feature #64375

Updated by Samuel Just 3 months ago

 TLDR: let's drop this until 2025. GCC's coroutine support is too immature, and the coroutine version is measurably slower.

 Chaining lambdas as continuations to futures is relatively error-prone, as the developer must reason carefully about the lifetimes of captured references. C++ coroutines should help with this, as a developer can co_await a future without losing the currently scoped variables.

 Seastar already has support for using seastar::future with C++ coroutines. https://github.com/athanatos/ceph/tree/sjust/backburner/sjust/wip-crimson-coroutines adds support to crimson's future wrappers -- including static checking of errorated futures and support for checking interruptions upon resume for interruptible futures.

 However, I suggest delaying this change until at least 2025 for two reasons: 
 1. The above branch only converts a small portion of the critical IO path in ClientRequest, but in the process I hit three serious code generation bugs with gcc 11.4.1. One seems to be fixed in 12.2.1, the other two seem to be fixed in 13.2.1. Still, debugging them was a significant time sink, and hitting so many in so little code strongly suggests that gcc's implementation simply isn't reliable yet.
 2. With workarounds, I was able to do some performance testing with and without those changes. It seems the coroutine version is measurably slower (~2%) despite having converted only a relatively small portion of the code. There are likely improvements to be made, but I do not judge them worth pursuing at this time given the above compiler issues.

 GCC Bugs: 
 - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98401 
   The specific symptom I observed is the pg param being
   destructed multiple times, which drives the refcount
   rapidly to 0 and destroys the PG prematurely.
 - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102217 
   This one appears to cause the generated code to double-free
   the awaiter holding the future. This one seems to be fixed
   in gcc 13.2.1.
 - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101244 
   My case isn't precisely as described in the bug, but it seems
   similar. It causes the generated code to incorrectly execute
   process_pg_op unconditionally before the predicate. It seems to be
   fixed in gcc 12.2.1.

 Perf result summary: 

 The results below are from 4 concurrent fio rbd writers, each with a
 queue depth of 128 (well past saturation). Results are close enough
 that it's probable that further work could close the gap, but given
 the immaturity of the compiler support, it's not worth doing right now.
 Results aren't meaningfully different with gcc 13.

 gcc version 13.2.1 20231205 (Red Hat 13.2.1-6) (GCC) 

 Both branches have the following applied to src/seastar/CMakeLists.txt to allow
 building with gcc 13:
 <pre> 
 -if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU") 
 -    if (NOT Cxx_Compiler_BZ107852_Free AND CMAKE_CXX_COMPILER_VERSION VERSION_GREATER 13) 
 -      include (CheckGcc107852) 
 -      list (APPEND Seastar_PRIVATE_CXX_FLAGS 
 -        -Wno-error=stringop-overflow 
 -        -Wno-error=array-bounds) 
 -    endif() 
 -endif () 
 - 
 </pre> 

 sjust/wip-crimson-coroutines 4c5681c099752b5e36e3a67859b59362343c1ed0 

                Block_size             4096            4096            4096 
                      Time              100             100             100 
                      Core                2               4               8 
                      Tool          Fio RBD         Fio RBD         Fio RBD 
                   Version          2024029         2024029         2024029 
                    OPtype       Rand Write      Rand Write      Rand Write 
                  CommitID         4c5681c0        4c5681c0        4c5681c0 
                       OSD          Crimson         Crimson         Crimson 
                     Store        Bluestore       Bluestore       Bluestore 
                Thread_num              128             128             128 
                Client_num                4               4               4 
               Latency(ms)       56.6480625        23.11821         9.99745 
           Bandwidth(MB/s)       37.0182656 90.7411251199    157.70769408 
                      IOPS          9037.66        22153.61        38502.87 

 sjust/wip-crimson-coroutines-base e160811c5fec46717a117ac02b6b29609f067233 

                Block_size             4096            4096            4096 
                      Time              100             100             100 
                      Core                2               4               8 
                      Tool          Fio RBD         Fio RBD         Fio RBD 
                   Version          2024029         2024029         2024029 
                    OPtype       Rand Write      Rand Write      Rand Write 
                  CommitID         e160811c        e160811c        e160811c 
                       OSD          Crimson         Crimson         Crimson 
                     Store        Bluestore       Bluestore       Bluestore 
                Thread_num              128             128             128 
                Client_num                4               4               4 
               Latency(ms)         55.26808 22.9293124999 12.9919900000 
           Bandwidth(MB/s)      37.94628608 91.4753433600     161.4054912 
                      IOPS          9264.22        22332.86        39405.64 
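
 As a rough check on the ~2% figure, the relative IOPS drop can be computed directly from the two gcc 13.2.1 tables above. The snippet below is only a worked-arithmetic sketch: the IOPS values are copied from those tables, while `Sample`, `gcc13_samples`, `relative_drop` and `print_drops` are names invented for the example. By this calculation the coroutine branch delivers roughly 2.4%, 0.8% and 2.3% fewer IOPS at 2, 4 and 8 cores respectively.

```cpp
#include <cassert>
#include <cstdio>

// Relative throughput drop of `candidate` versus `baseline` (0.02 means 2%).
double relative_drop(double baseline, double candidate) {
    return (baseline - candidate) / baseline;
}

// One column of the benchmark tables: base branch vs coroutine branch.
struct Sample {
    int cores;
    double base_iops;       // sjust/wip-crimson-coroutines-base (e160811c)
    double coroutine_iops;  // sjust/wip-crimson-coroutines (4c5681c0)
};

// IOPS rows copied from the gcc 13.2.1 tables.
constexpr Sample gcc13_samples[] = {
    {2, 9264.22, 9037.66},
    {4, 22332.86, 22153.61},
    {8, 39405.64, 38502.87},
};

// Print the per-core-count drop as a percentage.
void print_drops() {
    for (const Sample& s : gcc13_samples) {
        std::printf("%d cores: %.2f%% fewer IOPS\n", s.cores,
                    100.0 * relative_drop(s.base_iops, s.coroutine_iops));
    }
}
```

 These are single runs per configuration, so the per-column numbers carry some noise; the point is only that the gap sits in the low single-digit percent range.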

 gcc version 11.4.1 20230605 (Red Hat 11.4.1-2) (GCC) 

 About a 1/38 (roughly 2.6%) throughput hit.

 No other changes; the CMakeLists change above is not applied.

 sjust/wip-crimson-coroutines 4c5681c099752b5e36e3a67859b59362343c1ed0 

                Block_size             4096            4096            4096 
                      Time              100             100             100 
                      Core                2               4               8 
                      Tool          Fio RBD         Fio RBD         Fio RBD 
                   Version          2024029         2024029         2024029 
                    OPtype       Rand Write      Rand Write      Rand Write 
                  CommitID         4c5681c0        4c5681c0        4c5681c0 
                       OSD          Crimson         Crimson         Crimson 
                     Store        Bluestore       Bluestore       Bluestore 
                Thread_num              128             128             128 
                Client_num                4               4               4 
               Latency(ms)         57.35153 23.1843599999       9.9319775 
           Bandwidth(MB/s)      36.57537536 90.4843059200    158.11516416 
                      IOPS          8929.54        22090.89        38602.35 

                Block_size             4096            4096            4096 
                      Time              100             100             100 
                      Core                2               4               8 
                      Tool          Fio RBD         Fio RBD         Fio RBD 
                   Version          2024029         2024029         2024029 
                    OPtype       Rand Write      Rand Write      Rand Write 
                  CommitID         4c5681c0        4c5681c0        4c5681c0 
                       OSD          Crimson         Crimson         Crimson 
                     Store        Bluestore       Bluestore       Bluestore 
                Thread_num              128             128             128 
                Client_num                4               4               4 
               Latency(ms)       57.1692875 23.2074850000       13.169925 
           Bandwidth(MB/s)    36.6791475199     90.39757312    159.24107264 
                      IOPS          8954.87         22069.7        38877.21 

                Block_size             4096            4096            4096 
                      Time              100             100             100 
                      Core                2               4               8 
                      Tool          Fio RBD         Fio RBD         Fio RBD 
                   Version          2024029         2024029         2024029 
                    OPtype       Rand Write      Rand Write      Rand Write 
                  CommitID         4c5681c0        4c5681c0        4c5681c0 
                       OSD          Crimson         Crimson         Crimson 
                     Store        Bluestore       Bluestore       Bluestore 
                Thread_num              128             128             128 
                Client_num                4               4               4 
               Latency(ms)       56.8740425       23.188265 13.3796974999 
           Bandwidth(MB/s)      36.87575552     90.49010176    156.84764672 
                      IOPS          9002.89        22092.32        38292.88 

 sjust/wip-crimson-coroutines-base e160811c5fec46717a117ac02b6b29609f067233 

                Block_size             4096            4096            4096 
                      Time              100             100             100 
                      Core                2               4               8 
                      Tool          Fio RBD         Fio RBD         Fio RBD 
                   Version          2024029         2024029         2024029 
                    OPtype       Rand Write      Rand Write      Rand Write 
                  CommitID         e160811c        e160811c        e160811c 
                       OSD          Crimson         Crimson         Crimson 
                     Store        Bluestore       Bluestore       Bluestore 
                Thread_num              128             128             128 
                Client_num                4               4               4 
               Latency(ms)    55.7295474999 22.8178975000        12.99293 
           Bandwidth(MB/s)      37.63249152     91.93051136    161.46226176 
                      IOPS          9187.62        22443.99        39419.51 

                Block_size             4096            4096            4096 
                      Time              100             100             100 
                      Core                2               4               8 
                      Tool          Fio RBD         Fio RBD         Fio RBD 
                   Version          2024029         2024029         2024029 
                    OPtype       Rand Write      Rand Write      Rand Write 
                  CommitID         e160811c        e160811c        e160811c 
                       OSD          Crimson         Crimson         Crimson 
                     Store        Bluestore       Bluestore       Bluestore 
                Thread_num              128             128             128 
                Client_num                4               4               4 
               Latency(ms)         55.38967 23.1957575000      12.9069425 
           Bandwidth(MB/s)    37.8617856000     90.47964672     162.6132992 
                      IOPS           9243.6        22089.77        39700.53