Msgr - implement InfiniBand support via rsockets¶
Summary¶
Add InfiniBand support by using the rsockets library.
Owners¶
- Name (Affiliation)
Interested Parties¶
- Kasper Dieter (Fujitsu)
- Andreas Bluemle (itxperts.de)
- Sage Weil (Inktank)
- Mark Nelson (Inktank)
- Danny Al-Gaaf
- Andrey Korolyov (flops.ru)
Current Status¶
The SimpleMessenger (msg/*) module handles all network communication in Ceph and is currently based on the normal sockets API using TCP. Network addresses are currently stored in entity_addr_t, a wrapper around struct sockaddr_storage (128 bytes on Linux), which is supposed to be big enough for any network address.
The SimpleMessenger code has a long and storied lineage, is multithreaded (2 threads per socket!), and is difficult to follow--both because of the code and because of the complexity of the protocol. Rewriting the whole thing around an explicit state machine using thread pools and poll(2) has been on the wish list for a long time. I do not think that it is a blocker for rsockets support, although it might be nice to do it at the same time.
My current reading of the rsockets capabilities is that the endpoint addresses look like IPv4/IPv6 addresses, but both peers use the r*() calls and negotiate an rsockets session. This means that we need to distinguish between IP endpoints and IP+rsockets endpoints. This is probably simplest to do by modifying entity_addr_t and including a special address type. entity_addr_t::type is currently always 0, so we can define a 1 (or whatever) for rsockets.
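To make the address-type idea concrete, here is a minimal sketch of a tagged address. The struct addr_sketch, the constants TYPE_DEFAULT and TYPE_RSOCKETS, the helper names, and the "rsockets/" print prefix are all hypothetical stand-ins chosen for illustration; this is not the real entity_addr_t definition, only one way the type field could be used.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <cassert>
#include <cstdint>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical, simplified stand-in for entity_addr_t (not the Ceph definition).
struct addr_sketch {
  // Assumed type values: 0 is what the field holds today, 1 would mark rsockets.
  enum : uint32_t { TYPE_DEFAULT = 0, TYPE_RSOCKETS = 1 };

  uint32_t type = TYPE_DEFAULT;
  sockaddr_storage ss{};  // zero-initialized, so ss_family starts as AF_UNSPEC

  uint32_t get_type() const { return type; }
  void set_type(uint32_t t) { type = t; }
  bool is_rsockets() const { return type == TYPE_RSOCKETS; }

  // Store an IPv4 endpoint; returns false (and leaves ss untouched) on a bad address.
  bool set_ipv4(const char* ip, uint16_t port) {
    assert(ip != nullptr);
    sockaddr_in sin{};
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1) return false;
    std::memset(&ss, 0, sizeof(ss));
    std::memcpy(&ss, &sin, sizeof(sin));
    return true;
  }
};

// Print the endpoint, prefixed when the peer is expected to speak rsockets.
inline std::ostream& operator<<(std::ostream& out, const addr_sketch& a) {
  if (a.is_rsockets()) out << "rsockets/";
  if (a.ss.ss_family == AF_INET) {
    sockaddr_in sin;
    std::memcpy(&sin, &a.ss, sizeof(sin));
    char buf[INET_ADDRSTRLEN] = {0};
    inet_ntop(AF_INET, &sin.sin_addr, buf, sizeof(buf));
    out << buf << ':' << ntohs(sin.sin_port);
  } else {
    out << "(unset)";
  }
  return out;
}

// Convenience helper for logging and comparisons.
inline std::string to_string(const addr_sketch& a) {
  std::ostringstream os;
  os << a;
  return os.str();
}
```

The IP part of the address is unchanged in this sketch; only the type tag tells the messenger which family of calls to use for that peer.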
Almost all socket calls are confined to Accepter.cc (which is small) and Pipe.cc (which is not). Most actual socket calls go through a handful of wrappers, listed first below; the remaining calls are made directly and still need wrapping:
- tcp_read
- tcp_read_wait
- tcp_read_nonblocking
- tcp_write
- shutdown_socket
- do_sendmsg
- getpeername
- setsockopt
- socket
- connect
- close
Once these are all wrapped, a simple conditional on the peer address type (entity_addr_t::get_type()) can select either the normal socket syscall or the equivalent rsockets call.
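The dispatch described above could look roughly like the following sketch. It assumes a per-transport table of function pointers rather than an if/else at every call site; sock_ops, ops_for, write_fully, read_fully, the ADDR_TYPE_* values, and the HAVE_RSOCKETS macro are hypothetical names, not existing Ceph code. The rrecv(), rsend(), and rclose() calls come from <rdma/rsocket.h> in librdmacm and are only compiled in when that macro is defined, so the sketch still builds on machines without the library.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>

#ifdef HAVE_RSOCKETS       // hypothetical macro that a configure check could define
#include <rdma/rsocket.h>  // rrecv(), rsend(), rclose()
#endif

// Assumed address type values; only 0 is in use today.
enum : uint32_t { ADDR_TYPE_DEFAULT = 0, ADDR_TYPE_RSOCKETS = 1 };

// One table of calls per transport (hypothetical helper type).
struct sock_ops {
  ssize_t (*recv_fn)(int, void*, size_t, int);
  ssize_t (*send_fn)(int, const void*, size_t, int);
  int (*close_fn)(int);
};

// Pick the call table for a peer address type; nullptr if the type is unsupported.
inline const sock_ops* ops_for(uint32_t addr_type) {
  static const sock_ops plain = { ::recv, ::send, ::close };
  if (addr_type == ADDR_TYPE_DEFAULT) return &plain;
#ifdef HAVE_RSOCKETS
  static const sock_ops rdma = { ::rrecv, ::rsend, ::rclose };
  if (addr_type == ADDR_TYPE_RSOCKETS) return &rdma;
#endif
  return nullptr;
}

// Write exactly len bytes, retrying on EINTR and short writes. Returns 0 or -1.
inline int write_fully(const sock_ops& ops, int fd, const char* buf, size_t len) {
  assert(buf != nullptr || len == 0);
  while (len > 0) {
    ssize_t r = ops.send_fn(fd, buf, len, 0);
    if (r < 0) {
      if (errno == EINTR) continue;
      return -1;
    }
    buf += r;
    len -= static_cast<size_t>(r);
  }
  return 0;
}

// Read exactly len bytes. Returns 0 on success, -1 on error or early EOF.
inline int read_fully(const sock_ops& ops, int fd, char* buf, size_t len) {
  assert(buf != nullptr || len == 0);
  while (len > 0) {
    ssize_t r = ops.recv_fn(fd, buf, len, 0);
    if (r < 0) {
      if (errno == EINTR) continue;
      return -1;
    }
    if (r == 0) return -1;  // peer closed before len bytes arrived
    buf += r;
    len -= static_cast<size_t>(r);
  }
  return 0;
}
```

A caller would look up the table once per connection, for example ops_for(peer_type), and pass it to the read and write helpers. A plain conditional inside each wrapper would work just as well; the table only keeps the transport choice in one place.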
Detailed Description¶
rsockets is supposed to follow the normal sockets API very closely, making it easy to use in existing applications.
http://linux.die.net/man/7/rsocket
I hope that getting a prototype working is as simple as creating a new address type and putting some conditionals around all of the socket calls in msg/Pipe.cc and msg/Accepter.cc.
Work items¶
Coding tasks¶
- msg/msg_types: add rsockets address support to entity_addr_t via the type field; add accessors and update operator<<()
- msg/Pipe: add wrappers for unwrapped socket calls
- msg/Accepter, config: add conditional bind to either socket or rsocket
- msg/Pipe: update wrappers to either use socket or rsocket call based on peer address type
- profit!
Build / release tasks¶
- add library detection to configure.ac
- conditionally compile the rsockets support
Documentation tasks¶
- write howto document
Updated by Jessica Mack almost 9 years ago · 1 revisions