Guest User

Untitled

a guest
Sep 23rd, 2020
  1. I'm interested in seeing whether I can provide better networking services than many of the current servers, which were often written in the late 1900s and optimised for the networks (and single core machines) of the time.
  2. I feel that in 2020 it would be irresponsible to write new services in memory unsafe languages such as C or C++ if you expect them to be exposed to the raw internet, so at the beginning of the pandemic I decided to learn rust.
  3.  
  4. I was surprised to discover that I appear to be trailblazing here. For example, to write high quality datagram based services (e.g. UDP), you need metadata about the destination IP of each received datagram so you can reply from the same IP. (There are technically ways around this, but they have differing trade-offs.) I discovered that this and other critical APIs have not yet been wrapped for rust. I've been slowly contributing fixes where I can (and where my skill lets me write a clean API).
  5.  
  6. If rust is interested in safe, advanced network services, then I suggest:
  7.  
  8. # Wrapping critical missing APIs
  9. APIs such as async sendmsg/recvmsg (which are complicated by cmsg/ControlMessage and by tricky lifetime issues) need to be easily usable. They need to be wrapped into idiomatic rust.
  10.  
  11. At the moment writing advanced networking applications tends to involve using a lot of `unsafe` code to wrap things. The entire point of writing in rust rather than C/C++ is the safety afforded to you by the language. Having code bases with large amounts of `unsafe` code in them defeats the purpose. Obviously networking libraries are going to involve unsafe code, but the applications should be shielded from it.
  12.  
  13. One example of a critically missing API is async sendmsg/recvmsg: without the control messages you cannot implement more than trivial datagram services. A wide variety of sockopts are also missing and need to be wrapped (e.g. RFC 3493 and RFC 3542 list a bunch of APIs for dealing with v6, most of which don't yet have good wrappers).
  14.  
  15. # Performance and latency
  16. Networking APIs are generally throughput (performance) heavy, and latency sensitive. Fortunately rust is in general extremely good at solving these kinds of problems.
  17.  
  18. Very high performance networking applications are often memory bandwidth limited, and hence the push for "zero copy". Memory initialisation is also going to be critically important (https://github.com/rust-lang/rfcs/pull/2930) here.
  19.  
  20. The rust community doesn't seem to have figured out how it wants to handle cmsgs for APIs like recvmsg/sendmsg in a clean way yet. The nix crate has ControlMessage and ControlMessageOwned, but the latter I believe does heap allocations, which are likely to slow down high performance networking applications. A proper solution for this will probably be necessary eventually, and would ideally be merged into std to sit beside send/recv.
  21.  
  22. tokio appears to be doing a fantastic job here for most of the low level infrastructure. It will use whatever high performance event API is available without the application developer needing to know what's going on. tokio embraces rust's fearless concurrency leading to applications that can use all the cores on a modern machine in an efficient way. This has been fantastic compared to other languages.
  23.  
  24. # Async and traits
  25. Having written a bunch of async networking code, not being able to use async functions within a trait in the obvious idiomatic way is highly frustrating, and leads to ugly workarounds to avoid using traits where they would be natural. I believe that currently the work on GATs (https://github.com/rust-lang/rust/issues/44265) is what is blocking this.
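
To make the frustration concrete, here is a minimal sketch of the usual workaround: the trait method returns a boxed, type-erased future instead of being an `async fn`. The `DatagramHandler` trait, the `Echo` type and the `BoxFuture` alias are names made up for this illustration, not an existing API. As far as I know, this is roughly the shape that helper crates generate for you, at the cost of one heap allocation per call.

```rust
use std::future::Future;
use std::pin::Pin;

/// A boxed, type-erased future (hypothetical alias for this sketch).
pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hypothetical trait for something that answers datagrams.
pub trait DatagramHandler {
    // What one would like to write:
    //     async fn handle(&self, payload: &[u8]) -> Vec<u8>;
    // What works without async fn in traits: return a boxed future by hand.
    fn handle<'a>(&'a self, payload: &'a [u8]) -> BoxFuture<'a, Vec<u8>>;
}

/// Trivial implementation that echoes the payload back.
pub struct Echo;

impl DatagramHandler for Echo {
    fn handle<'a>(&'a self, payload: &'a [u8]) -> BoxFuture<'a, Vec<u8>> {
        // The async block borrows `payload`, so the future is tied to 'a.
        Box::pin(async move { payload.to_vec() })
    }
}
```

Every implementor has to repeat the `Box::pin(async move { ... })` dance, and the explicit lifetimes leak into the trait signature, which is exactly the kind of ugliness that makes me avoid traits where they would otherwise be natural.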
  26.  
  27. # Abstracting away the details
  28. Many of the advanced networking APIs differ between platforms, and, even more frustratingly, differ between v4 and v6 sockets on the same operating system. Having a layer developed that abstracts away many of these differences would be extremely useful.
  29.  
  30. Examples here are getting the source IP, destination IP and the received interface for a datagram; setting the TTL (for traceroute, multicast, and other protocols); getting the timestamp that the datagram was received using the highest precision mechanism available for the platform; and being able to receive more detailed error information. libc has `getifaddrs(3)`, but having a well regarded crate that portably abstracted away dealing with networking interfaces, routing tables and neighbour/ARP tables over freebsd, linux, windows, macos, etc. would also be very useful.
  31.  
  32. Ideally there would be a "datagram" API that handles buffer management (see the uninitialised memory discussion above), and uses recvmsg/recvmmsg/io_uring/mmap()'d/whatever sockets under the hood to give the best performance/latency, depending on what the underlying operating system supports. Maybe something like an iterator API? Or a DatagramStream? This is similarly needed for sending, using sendmsg/sendmmsg/io_uring/mmap() sockets.
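
A rough sketch of the shape such a "datagram" API could take, using only std. Everything here (`DatagramMeta`, `DatagramSource`, `recv_datagram`) is a hypothetical name invented for illustration. The `UdpSocket` implementation is the lowest common denominator: plain `recv_from` can only report the source address, so the other fields stay `None`. A real implementation would fill them in from recvmsg control messages or whatever the platform offers.

```rust
use std::collections::VecDeque;
use std::io;
use std::net::{IpAddr, SocketAddr, UdpSocket};
use std::time::SystemTime;

/// Per-datagram metadata; `None` means the backend could not supply it.
#[derive(Clone, Debug)]
pub struct DatagramMeta {
    pub source: SocketAddr,
    pub destination: Option<IpAddr>,
    pub interface_index: Option<u32>,
    pub received_at: Option<SystemTime>,
}

impl DatagramMeta {
    fn source_only(source: SocketAddr) -> Self {
        DatagramMeta { source, destination: None, interface_index: None, received_at: None }
    }
}

/// Hypothetical abstraction over "something that yields datagrams".
pub trait DatagramSource {
    /// Fill the caller-owned buffer (so it can be reused across calls)
    /// and return the payload length plus metadata.
    fn recv_datagram(&mut self, buf: &mut [u8]) -> io::Result<(usize, DatagramMeta)>;
}

/// Portable fallback: std's recv_from only knows the source address.
impl DatagramSource for UdpSocket {
    fn recv_datagram(&mut self, buf: &mut [u8]) -> io::Result<(usize, DatagramMeta)> {
        let (len, source) = self.recv_from(buf)?;
        Ok((len, DatagramMeta::source_only(source)))
    }
}

/// In-memory source, handy for testing protocol code without a socket.
impl DatagramSource for VecDeque<(SocketAddr, Vec<u8>)> {
    fn recv_datagram(&mut self, buf: &mut [u8]) -> io::Result<(usize, DatagramMeta)> {
        let (source, payload) = self
            .pop_front()
            .ok_or_else(|| io::Error::new(io::ErrorKind::WouldBlock, "no datagram queued"))?;
        // Like a UDP socket, truncate to the size of the caller's buffer.
        let len = payload.len().min(buf.len());
        buf[..len].copy_from_slice(&payload[..len]);
        Ok((len, DatagramMeta::source_only(source)))
    }
}
```

Application code written against `DatagramSource` would not need to change when a faster backend (recvmmsg, io_uring, and so on) is swapped in underneath, which is the point of the abstraction.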
  33.  
  34. # Some way to quickly and safely parse packets in a zerocopy manner.
  35. In C/C++, you read frames into a buffer, then tend to have some idiomatic code like:
```
#include <stddef.h>
#include <stdio.h>
#include <arpa/inet.h> /* ntohs() */

/* Reinterpret the start of the buffer as a packet header, or NULL if too short. */
struct protocol *read_protocol(void *buffer, size_t len) {
    if (len < sizeof(struct protocol))
        return NULL;
    return (struct protocol *)buffer;
}

...
struct protocol *msg = read_protocol(buffer, len);
if (!msg) return;
printf("ver: %d type: %d\n", ntohs(msg->ver), (int)msg->type);
```
  48.  
  49. This kind of casting a buffer to a packet layout can be done in rust, but is not very idiomatic, and requires even more `unsafe` code for end applications. Maybe a crate that has a macro that takes a struct definition, and generates the correct code to do this would be useful? Perhaps implementing the serde APIs? My ability to write complex macros like this is not quite there yet.
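
For comparison, here is a sketch of what a hand-written safe version can look like today: a borrowed "view" type that checks the length once and then reads fields straight out of the receive buffer without copying it. The layout is an assumption made up for the example (a 2 byte big-endian `ver` followed by a 1 byte `type`), as are the names `ProtocolView` and `HEADER_LEN`. This is the boilerplate I would want a macro to generate from a struct definition.

```rust
/// Assumed header size: 2 bytes of `ver` + 1 byte of `type`.
pub const HEADER_LEN: usize = 3;

/// Zero-copy view over a header that lives inside the receive buffer.
pub struct ProtocolView<'a> {
    header: &'a [u8],
}

impl<'a> ProtocolView<'a> {
    /// Split the buffer into (header view, payload); `None` if it is too
    /// short, which plays the role of the NULL check in the C version.
    pub fn parse(buffer: &'a [u8]) -> Option<(Self, &'a [u8])> {
        if buffer.len() < HEADER_LEN {
            return None;
        }
        let (header, payload) = buffer.split_at(HEADER_LEN);
        Some((ProtocolView { header }, payload))
    }

    /// Version field, converted from network byte order on access.
    pub fn ver(&self) -> u16 {
        u16::from_be_bytes([self.header[0], self.header[1]])
    }

    /// Message type field (a single byte, so no byte swapping needed).
    pub fn msg_type(&self) -> u8 {
        self.header[2]
    }
}
```

There is no `unsafe` here and no alignment concern, because fields are read byte by byte. The cost is that every field accessor has to be written out by hand.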
  50.  
  51. Rust's great support for endianness with u8/u16/u32/u64/u128 all supporting .to_be() (and .to_le(), and .to_ne_bytes()) is fantastic, but a common bug in networking protocols is accidentally forgetting to byte swap, or byte swapping too many times. I suspect it would be useful to have {be,le}{16,32,64,128} types that take care of the byte swapping correctly when converting between types. A function that takes a be32 cannot be accidentally confused with one that takes a u32 (native), or le32 value. I've been meaning to experiment with this, but haven't gotten to it yet.
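
A minimal sketch of the idea for the 32 bit case. `Be32`, `Le32` and `push_sequence_number` are hypothetical names for illustration, not existing std types. Storing the wire bytes and only converting through `From` means the swap happens exactly once in each direction.

```rust
/// A 32-bit integer held in big-endian (network) byte order.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Be32([u8; 4]);

/// A 32-bit integer held in little-endian byte order.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Le32([u8; 4]);

impl Be32 {
    /// Wrap four bytes exactly as they appeared on the wire.
    pub fn from_wire(bytes: [u8; 4]) -> Self {
        Be32(bytes)
    }
    /// The bytes to put on the wire, unchanged.
    pub fn to_wire(self) -> [u8; 4] {
        self.0
    }
}

// Conversions to and from the native integer do the byte swap, once.
impl From<u32> for Be32 {
    fn from(native: u32) -> Self {
        Be32(native.to_be_bytes())
    }
}
impl From<Be32> for u32 {
    fn from(value: Be32) -> Self {
        u32::from_be_bytes(value.0)
    }
}
impl From<u32> for Le32 {
    fn from(native: u32) -> Self {
        Le32(native.to_le_bytes())
    }
}
impl From<Le32> for u32 {
    fn from(value: Le32) -> Self {
        u32::from_le_bytes(value.0)
    }
}

/// Only accepts a value already known to be big-endian; passing a plain
/// u32 or a Le32 here is a compile error rather than a silent bug.
pub fn push_sequence_number(out: &mut Vec<u8>, seq: Be32) {
    out.extend_from_slice(&seq.to_wire());
}
```

With this, `push_sequence_number(&mut out, Be32::from(seq))` compiles, while accidentally passing the native `seq` does not, so "forgot to swap" and "swapped twice" both become type errors.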
  52.  
  53. # Protocol crates.
  54. Languages that get the most use for networking are languages that have good libraries that provide reasonable quality implementations of protocols. Rust has some well known networking crates to deal with HTTP/HTTPS, but there is more to networking than just webpages. Python, for instance, has the Twisted library (from Twisted Matrix Labs), which provides a bunch of implementations for various networking protocols in a consistent style that are easily extended. This makes it easy to decide to write, say, a new IMAP server, and just have to implement the new business logic, giving you a robust solution in only a few hours of work. (In fact, python's batteries included standard library contains many basic protocol implementations, which means that simple things can be done quickly and easily.)
  55.  
  56. Currently there are crates that exist for many of these protocols, but they're not tied together in a cohesive way. They don't reuse common types, leading application developers that want to speak multiple protocols to write a lot of annoying "copy this type to this other but effectively equivalent type" code. Many do not yet support async.
  57.  
  58. # Other
  59. Networking with rust also runs into other more generic problems, for instance ergonomics to do with errors, but these are not networking specific problems.
  60.  
  61. # Summary
  62. Currently writing networking applications in rust tends to lean surprisingly heavily on `unsafe` code, either to wrap missing interfaces, to work with uninitialised memory, or to provide the required zero copy semantics, which muddies many of the safety benefits of using rust in the first place.
  63.  
  64. Rust's async is relatively new, and the ecosystem around it hasn't quite caught up yet: many key libraries are not async yet, and missing GATs is painful.
  65.  
  66. If rust truly wants to be the "gold standard" for writing high performance, advanced networking applications, then there are a lot of missing crates needed to help provide the high level abstractions.
  67.  
  68. However, I've found rust to be a joy to write networking services in. async makes complex control flow easy to manage, fearless concurrency means I can easily make full use of the hardware, rust's high level abstractions make it much easier to write complicated algorithms to handle networking code efficiently, and most importantly rust's safety guarantees mean I can be confident that I'm not accidentally introducing potentially security critical memory safety issues.