a guest Sep 15th, 2019 126 Never
  1. # Rust in Large Organizations
  2.  
  3. **Initially taken by Niko Matsakis and lightly edited by Ryan Levick**
  4.  
  5. ## Agenda
  6.  
  7. - Introductions
  8. - Cargo inside large build systems
  9. - FFI
  10. - Foundations and financial support
  11.  
  12. ## Attending
  13.  
  14. - Joe, Microsoft, Seattle Rust Meetup
  15. - Tom at Mozilla, using Rust for sync
  16. - Lena at Mozilla, sync storage etc
  17. - Jack Moffit at FB, Libra team
  18. - Brian Anderson at PingCAP
  19. - acrichto
  20. - erickt
  21. - dtolnay, David Tolnay
  22. - Raj Vengalil, Azure IoT
  23. - cuviper, Red Hat
  24. - Rain, FB
  25. - Jeremy F, FB
  26. - Manish
  27. - Ben, Google
  28. - Philip, Cumulo, Rust dev tools + infra
  29. - Remy, Cumulo
  30. - Sebastian, MS, pushing for Rust adoption from sec pov
  31. - Thomas Ekerd, MS, site reliability engineer
  32. - James, MS
  33. - Brandom Williams, FB
  34. - JR, Mozilla backend services
  35. - Phil
  36. - Will, crash ingestion mozilla
  37. - Stjepan, Ferrous Systems
  38.  
  39. ## cargo
  40.  
  41. - FB dev env -- backend services repo -- is mostly C++
  42.   and Java. Very polyglot environment. Glued together with Buck,
  43.   FB's analogue of Bazel.
  44.   - Buck: Language agnostic. Supports Rust.
  45.   - rustc drops in quite nicely, basically equivalent to C++ compiler.
  46.   - wanted to use cargo but it just does too much to fit in
  47.   - need to delineate the parts of cargo that are desired from those that conflict with Buck
  48.   - ecosys is big advantage for Rust but hard to separate from cargo
  49.   - current scheme:
  50.     - big cargo.toml including all the things used in internal repo
  51.     - cargo builds artifacts that are presented to buck
  52.     - buck can link against those
  53.     - reasonably successful
  54.     - but approaching 700 crates in transitive dep graph, getting very cumbersome to rebuild etc
  55.     - plus pinned to a specific version of compiler (prebuilt artifacts)
  56.     - works ok but build.rs build scripts are a big complication
  57.   - specific cargo pain points:
  58.     - build scripts
  59.     - "features" feature
  60.     - a lot of crates don't use features the way they're intended -- they're used for exclusive A or B choices
  61.       - this creates the possibility to break the build
  62.       - need some sort of "cfg" feature that represents forks of a crate
  63. - Google does a similar thing for Fuchsia
  64.   - cargo builds 3rd party artifacts, normal build consumes those
  65.   - problems:
  66.     - handful of 3rd party artifacts depend on things built in tree
  67.     - want to be able to do partial builds, e.g. w/o a feature, or just for some targets
  68.     - developing for a new OS, so we compile some code for host, some for target
  69.     - presently do 2 full builds, but it's a pain
  70.     - don't have as much control over the flags getting passed to rustc as we'd like
  71.     - dep flags + linker flags aren't as specific as we need to distribute deps that are needed for indiv targets
  72.     - prototype using "cargo-raze" (use "gn" (from Chrome) to generate ninja files)
  73.     - based on a modification of cargo-raze that generates bazel build files
  74.       - has its own handling of build.rs stuff
  75.       - rather than outputting build files, it outputs a json format that could be the basis for the proposed "cargo build plans" feature
  76.         - would be good to know what inputs etc are needed, how this would fit for Buck
  77.         - can Buck consume internal files?
  78.   - gn is aware of the concept of a Rust target
  79. - cumulo build system:
  80.   - doesn't use cargo, invokes rustc directly
  81.   - cargo just builds json
  82.   - build all deps as shared libraries, whether or not they want that
  83.     - `.so` libraries, `.rmeta` files
  84.     - hits a lot of problems
  85.   - ran into problems, notably lack of support for build.rs -- have to reimpl cargo
  86.   - building for 2 different targets
  87.     - have own platform
  88.     - linux target for procedural macros
  89.   - need sometimes to pass flags that are target specific, build a target config map
  90.   - would prefer to use cargo
  91. - does cargo-raze support build.rs?
  92.   - has some builtin support for build.rs?
  93.   - not automatic: you declare purpose of build.rs
  94.   - things that do rustc version detection?
  95.   - sometimes you want to (e.g.) disable build.rs that supply native deps which come from bazel
  96. - why can't you run build.rs as part of the build tool?
  97.   - fundamental problems:
  98.     - no declared inputs, no declared outputs
  99.     - buck/bazel etc has to know what files the build script is consuming, producing etc
  100.   - also, they are arbitrary execution, which can be a security concern
  101.     - proc macros have some similar concerns.
  102.       - e.g., pest which looks at cargo source dir env variable and finds your grammar def'n file
  103.         - doesn't fit well
  104. - one thing that was discussed years ago:
  105.   - capability system for build.rs that restrict what scripts can do
  106.   - e.g., read from this directory, write to that one
  107.   - cargo can then audit/sandbox to enforce said rules
  108.     - run build script in a sandbox
  109.       - e.g. crosvm has an impl of this inside of chrome; all crosvm devices run in their own jail
  110.     - nontrivial engineering effort
  111.   - could do at a higher level, sandbox
  112. - jeremy: build scripts classified into 3 or 4 distinct types, is this complete?
  113.   - doing codegen. read a file, bindgen, etc
  114.   - gateway to some other library, using pkgconfig or something to find the library, or they build it from source
  115.   - feature detection on rustc
  116.   - "scary ones" -- database reads, timestamps
  117. - plausibly could address those use cases in other ways
  118.   - feature detection is an obvious one, e.g. we had an rfc for compiler versions
  119. - version compat is a common thing
  120. - what version of rust are people using?
  121.   - stable
  122.   - "stableish" -- bootstrap
  123.   - nightly
  124. - who here is using toolchains distributed by rust?
  125.   - ms (partially), mozilla, libra
  126. - why a custom toolchain?
  127.   - config.toml tweaks
  128.     - use clang's version of some unwinding code
  129.     - custom linker
  130.     - panic=abort
  131.   - custom targets
  132.   - compliance reasons (wanting to build from source for security reasons)
  133. - bootstrapping + compliance
  134.   - where to get initial rust version?
  135.   - several attempts:
  136.     - most successful is using mrustc at version 1.22 and building from there
  137.     - ms, google did that
  138.   - is there a possibility of long term drift?
  139.     - builds are not *quite* reproducible at present, but almost
  140.   - was a point where build w/ mrustc + build with toolchain had non-matching hashes
  141.     - might have to tweak the paths
  142.     - in principle it can be done, should maybe prioritize it
  143. - maybe have an approved "how to bootstrap from C" documentation
  144. - specific reason fb builds from source:
  145.   - want to always have the option to apply a local patch
  146.   - don't want to get stuck with a "we must have this patch yesterday" scenario and have to figure out how to apply patch then
  147. - in most cases, also building llvm, want to share llvm for cross-lang LTO
  148.   - must have a newer LLVM than what rust ships with
  149. - some folks have cross-lang LTO working
  150.   - but rustc doesn't want to produce bitcode files
  151.   - pass the linker `/bin/echo`
  152. - pgo -- coming soon
  153. - fb uses after the fact binary rewriting
  154. - splitting out linker was a potential change to rustc or cargo that google wants
  155. - would be interesting to know "here is what must be passed to gcc to successfully link"
  156. - another option: give a python script as the linker
  157.   - turns out servo does it, too
  158. - show of hands survey:
  159.   - "who is interested in a common backend for 'those things'"
  160.     - nobody knows what that means
  161. - buck needs a "fully specified dep dag", seems like a common thing for other build systems
  162.   - seems like we have to do a few cases to work out the general rules first
  163. - rudimentary cargo build plan support:
  164.   - gives a dag of rustc executions
  165.   - but it's too low level for buck, also bazel
  166. - pressure: every once in a while people propose "rewriting cargo.toml" into the tree
  167.   - so far resisted that
  168.   - a possible outcome buck has thought of:
  169.     - buck support for cargo.toml
  170.     - there's a ton of open source code that people (natch) don't want to build w/ buck out of tree
  171.     - want ability to simultaneously maintain buck/cargo support
  172.     - currently done by hand and horrible
  173.     - internally even people want this for mac/win builds which buck doesn't support
  174.     - google w/ gn does something similar, keeps cargo.toml in order to upstream it
  175.       - in some cases can generate a cargo.toml file programmatically
  176.       - also imp't for IDE support
  177. - IDE support
  178.   - RLS kind of working with buck
  179.   - knowing laughter :)
  180.   - problematic assumptions: e.g., searching the filesystem for cargo.toml, but it's millions of files
  181.   - symptom of a larger thing
  182.     - cargo is designed for managing rust code
  183.     - assumes source tree is mostly rust code
  184.     - but often rust is embedded in a large source tree with tons of non-rust
  185.       - so having some "root for all rust code" where you search below is problematic
  186.     - top-level directory not gonna work
  187.       - always having to create artificial "root" directories
  188.   - rust-analyzer avoids this by not baking cargo in as deeply
  189.     - but still has this "top level directory" model that contains all the rust code which means a small amount of rust amongst everything else
  190. - generating a cargo.toml for 1 project works well, but less so when you have multiple targets that interact
  191. - cumulo has a ton of C and Rust code that must be all combined into one big final artifact
  192.   - IDE support that avoids cargo is a must
  193.   - current state of the art: ctags
  194. - cramertj: cargo.toml is basically the intermediate repr for specifying deps
  195.   - are there other things one might want?
  196.   - build system has its own custom language to do that description
  197.     - can use that to generate cargo.toml files though for IDE etc
  198.       - what changes might one want in a "non-cargo IDE language"?
  199.         - maybe cargo would work fine
  200. - manish: does this also cause problems for clippy and rustfmt?
  201.   - cargo.toml is also useful for this
  202. - who uses clippy? most folks
  203. - rustfmt? most folks
  204.   - fb invokes it on individual files for that
  205. - libra uses cargo to build
  206.   - "cacheability" (sccache) has gotten worse over time
  207.   - procedural macros aren't getting cached (dylibs)
  208.   - are other people doing anything with this?
  209.   - ff has a distributed cache in the office
  210.   - (buck does caching of everything)
  211.     - native deps? also integrated into buck
  212.     - assume that if a C dep changes, rust must be rebuilt?
  213.     - `-L native=<dir>` is not very well-scoped (just to a directory, not specific libs)
  214.     - problem: can't cache link steps as a result
  215.     - maybe also part of the problem with sccache
  216.     - in buck, each lib gets its own directory, sidestepping this problem
  217. - linker want:
  218.   - ability to specify a specific mapping from link name to the native library
  219.   - option to ignore link directories or transform
  220.   - in buck case, if you have a dep on a native library, you get two options (`-lfoo` and full path to foo)
  221. - crate features, misuse thereof:
  222.   - people seem to want option to have mutually exclusive features
  223.   - want to have impls clone etc for testing but not in a release build
  224.   - hacked up something using cargo features but doesn't work all the time
  225.   - problems:
  226.     - dev dependency `foo` with feature "testing"
  227.     - sometimes testing gets turned on semi-randomly (???)
  228.     - but you can also accidentally use "testing" in a normal tree
  229.   - deps for build scripts leak through to the real graph, perhaps part of the "semi-random" behavior
  230. - designing from the wrong direction, perhaps?
  231.   - a lot of requirements coming up that are "above and beyond" existing cargo spec and design
  232.   - contra: goal is to have cargo co-exist with buck/bazel/etc, these are the features needed for that?
  233. - do we want to build another tool that is not cargo?
  234.   - but everybody already has a tool and wants to use it
  235.   - but how can we do minimal work so that integration of cargo + these other tools is smoother
  236.     - working with rest of rust ecosys
  237. - de facto standard that crates.io + cargo have created
  238.   - defined entirely by impl of cargo
  239.   - only access at present is through cargo's impl
  240.   - refactoring cargo into indep chunks with better interfaces might be the sol'n (and has been discussed)
  241.     - cargo build plans, but they're not there yet
  242.   - key thing: version resolution, very much in cargo's domain, would be good to specify
  243. - external dependencies + FFI?
  244.   - can we use FFI to talk to rust?
  245.   - want module boundary between rust things, using ffi
  246.   - today: build scripts in cargo exist, common thing is to build+link to native libraries
  247.     - one of the things that cargo-raze does, you can describe the purpose of a build.rs (e.g., primarily to produce that 3rd party lib)
  248.     - but you can translate that to a dep for that native library in your build system
  249. - summarize + action items?
  250.   - cramertj wants to know what
  251.   - dtolnay is working on potential design ideas for a successor to build.rs
  252.     - cargo metadata description to specify what it is doing, maybe replace build.rs?
  253.     - just listing inputs would be a huge improvement
  254.       - yes but we want something that's *easier* than build.rs today, to incentivize it
  255.   - caching, can we improve it
  256.     - some of it may be low-hanging fruit, e.g. on mac `.a` file has timestamps
  257.     - but part of it is the growing popularity of procedural macros (`.so` are uncacheable by sccache)
  258.       - if linker were more predictable, sccache could handle it, but it's not
  259.       - might be able to handle by separating out linking
  260. - how to translate cargo.toml etc?
  261.   - buck today runs cargo, takes output with dep info + rlib files
  262.   - but new tool goal is to determine from cargo metadata
  263.     - no way of "definitively connecting" resolved deps with unresolved deps
  264. - cargo vendor tends to be a bit overaggressive
  265.   - lots of things people want, seems to vary between groups
  266. - when developing procedural macros, could do a better job of noticing that token stream output hasn't changed
  267.   - incremental
  268.   - sccache sometimes handles that well (e.g. w/ build.rs)
  269. - related topic: distributed builds
  270.   - sccache has support for that
  271.     - but maybe sends whole dep folder, not always ok
  272.     - would need more precise dep information to handle that (passing precise info for *transitive* dependencies)
  273.       - `--extern` is precise, but transitive deps are still figured out by rustc
  274.   - related: would be nice if, for rustc, could pass all the sources explicitly
  275.     - in buck do you list all sources?
  276.       - yes but a lot of globs :)
  277. - would be nice to have a tool that handled all the easy cases, with room for "extra" cases here and there
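
The "no declared inputs, no declared outputs" point above can be made concrete. Cargo already lets a build script print `cargo:rerun-if-changed=PATH` and `cargo:rerun-if-env-changed=VAR` lines, but these only control when the script is rerun; nothing enforces them and outputs are not declared at all. A minimal sketch of a helper that produces those lines, where the function name `input_directives` and the example paths are hypothetical:

```rust
/// Build the directive lines a build script would print to tell Cargo
/// which inputs it depends on. `input_directives` is a hypothetical helper;
/// the `cargo:rerun-if-*` directive names are Cargo's own.
pub fn input_directives(files: &[&str], env_vars: &[&str]) -> Vec<String> {
    let mut lines = Vec::new();
    for file in files {
        // Rerun the build script only when this file changes.
        lines.push(format!("cargo:rerun-if-changed={}", file));
    }
    for var in env_vars {
        // Rerun the build script only when this environment variable changes.
        lines.push(format!("cargo:rerun-if-env-changed={}", var));
    }
    lines
}

/// In a real `build.rs`, `main` would print each line to stdout.
pub fn emit(lines: &[String]) {
    for line in lines {
        println!("{}", line);
    }
}
```

An external build system such as Buck or Bazel would still need the output side (and enforcement) before it could treat a build script as an ordinary hermetic rule.
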
  278.  
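
The capability idea for build.rs discussed above (read from this directory, write to that one) might look roughly like the following. This is purely a sketch: `BuildScriptCaps`, `Access`, and `allows` are hypothetical names, no such feature exists in Cargo, and a real implementation would enforce the policy in an OS-level sandbox rather than by comparing paths.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical capability grant for a single build script.
pub struct BuildScriptCaps {
    pub read_dirs: Vec<PathBuf>,
    pub write_dirs: Vec<PathBuf>,
}

/// Kind of file access being requested.
pub enum Access {
    Read,
    Write,
}

impl BuildScriptCaps {
    /// Returns true if `path` lies under a directory granted for `access`.
    /// The comparison is lexical (component by component); it does not
    /// resolve `..` or symlinks, which a real sandbox would have to handle.
    pub fn allows(&self, access: Access, path: &Path) -> bool {
        let dirs = match access {
            Access::Read => &self.read_dirs,
            Access::Write => &self.write_dirs,
        };
        dirs.iter().any(|dir| path.starts_with(dir))
    }
}
```

Even a declaration-only version of this would give an auditing tool something to check a build script against.
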
  279. - alex: interested in solving a lot of these issues and have thoughts
  280.   - open to talking later about this stuff
  281.   - a lot of small details, bug fixes, etc -- long road, no silver bullet
  282. - some kind of "enterprise cargo" place to hold this discussion(s)
  283. - a lot of needs boil down to:
  284.   - quick fix combined with longer re-architecture
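
On the "mutually exclusive features" pain point raised earlier: Cargo features are meant to be additive, so one common workaround is to fail the build explicitly when two conflicting features are both enabled. A minimal sketch, assuming hypothetical feature names `backend-a` and `backend-b` (not taken from any crate mentioned in these notes):

```rust
// Fail the build loudly if both (hypothetical) features are enabled at once,
// e.g. because two different dependents each turned one of them on.
#[cfg(all(feature = "backend-a", feature = "backend-b"))]
compile_error!("features `backend-a` and `backend-b` are mutually exclusive");

/// Report which backend was selected at compile time.
pub fn backend() -> &'static str {
    if cfg!(feature = "backend-a") {
        "a"
    } else if cfg!(feature = "backend-b") {
        "b"
    } else {
        "default"
    }
}
```

This turns a silent misconfiguration into a compile error, but it does not solve the underlying problem: feature unification across the dep graph can still enable both, which is why a separate "cfg"-style mechanism came up in the discussion.
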
  285.  
  286. ## FFI
  287.  
  288. - two distinct languages invoking one another
  289.   - sometimes linked into one process, sometimes cross process (RPC)
  290.   - COM requires symbols to be ABI compatible
  291. - inline assembly, direct syscalls
  292. - "C parity"
  293. - FFI with C and C++
  294. - FB is doing C++ interop, as is Google
  295. - FFI beyond C or C++?
  296.   - Java
  297.   - syscalls
  298.   - C# perhaps
  299.   - (Ruby, Python)
  300. - Bindings to other languages are often mediated through a C layer
  301. - Increasing number of users -- C and C++ wanting to consume Rust APIs
  302. - Concerns:
  303.   - unwinding
  304. - Cumulo: basically spent most of the last year preparing to do bidir FFI between Rust and C
  305.   - fairly large codebase in a dialect of C
  306.   - rules you can impose on C side which helps sometimes
  307.   - in one direction (Rust calling C) we have been able to use bindgen
  308.   - but in the other direction (C calling Rust) we wrote a compiler plugin (uh oh) to generate C headers
  309. - Specification questions
  310.   - concerned about cross-lang lto revealing a lot of interactions
  311. - Cross-lang thin lto
  312. - Dynamic testing and static testing
  313. - Have aliasing rules proven to be a problem?
  314.   - FB: not so much. Mostly mediating rules through bindgen and trying to set things up to get compilation failures
  315.   - Google: currently checking for changes
  316. - Google: pursuing, a bit, ways to annotate C and C++ headers so that safe Rust signatures can be generated from them
  317.   - might be an interesting thing to standardize on
  318.   - bindgen has a cumbersome mechanism for that (do)
  319.   - would be nice to include small shim layers e.g. to translate to `Result`
  320. - FB:
  321.   - C++ codebase in FB uses exceptions; have wrappers that capture and convert exceptions, which become a `Result` on the Rust side
  322.     - manually annotating noexcept functions? basically all of them can
  323.     - C headers are manually created with a `try { } catch` block in C++
  324.   - the code being interop'd is mostly C++ but have to manually write C APIs for it
  325.   - build with panic=abort? no, unwind
  326.     - also catching Rust exceptions at boundary?
  327.       - C code doesn't call into Rust code that often
  328.       - happy to make it abort though
  329.         - but mozilla wants to handle panics, though it does it by translating it into a swift/java exception
  330.           - usually the purpose is wanting to capture the call stack and report it
  331.           - in theory could panic=abort if could capture java stack
  332.   - FB sets a custom panic handler to report errors, then exits (could use panic=abort)
  333. - For COM FFI case? how handling virtual dispatch
  334.   - manual adaptation with vtables and things
  335.   - on Rust side, does that "look like" a trait?
  336.     - active area of investigation
  337.     - believe that (with proc macro support) can expose a trait that is actually a struct + vtable
  338.     - similar to what GNOME projects are doing for glib bindings
  339.     - mozilla does it for XPCOM, which is basically same thing
  340.   - various bits of existing crates, but it's mostly nasty
  341. - Jeremy: one thing I've been thinking about:
  342.   - standard set of library functions corresponding to C++ types
  343.   - e.g. some way to use std-string from within rust code
  344.   - good to have for templated types (unique-ptr, shared-ptr, and so on)
  345.   - all types that can be directly used from Rust in some way
  346.   - quite clunky today to have a C++ function that returns something Rust can use
  347.   - on C side, it'd use the plain C++ types
  348.   - but on Rust side, it'd invoke and do the right things
  349.   - one of the pieces needed for C++ interop
  350.     - instantiate the vec/string/other impls
  351.   - should this be part of bindgen?
  352.     - missing part: manually instantiating separate things for each specialization
  353. - major topics of FFI
  354.   - being able to "use header files" and get a "reasonably safe" FFI in Rust
  355. - what are building blocks we'd need to move things to user space?
  356.   - template instantiation list is one building block -- somebody has to write the tool, nothing needed from rustc
  357. - expectation is that there is always some work to manually bind
  358.   - but what is minimal work we can do to make it easy to translate
  359. - annotations might be company specific -- fb vs google?
  360.   - maybe? but can we collaborate?
  361.   - different C++ dialects and patterns in use
  362. - what about from other languages, esp. around C++?
  363.   - closest inspiration might come from Swift
  364. - rich bindings from Rust to C++ for hashmaps etc
  365.   - because FB uses thrift for RPC mechanism (and sometimes FFI)
  366.   - would be useful to be able to do tricks like that for hashmap and sets perhaps
  367.   - some kind of tool for consuming a C++ header file to automatically produce an interface in Rust
  368. - complication in some environments: multiple allocators
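
The question above about catching Rust panics at the boundary can be sketched with `std::panic::catch_unwind`, which only works when building with `panic=unwind`. The status codes and the functions `risky` and `ffi_double` below are hypothetical; a real export would also need an unmangled symbol name, omitted here to keep the sketch edition-independent.

```rust
use std::panic;

// Hypothetical status codes returned to the C caller.
pub const STATUS_OK: i32 = 0;
pub const STATUS_NULL: i32 = 1;
pub const STATUS_PANIC: i32 = 2;

/// Ordinary Rust code that may panic.
fn risky(value: i32) -> i32 {
    if value < 0 {
        panic!("negative input");
    }
    value * 2
}

/// C-callable wrapper: never lets a panic unwind into the caller.
/// Writes the result through `out` and returns a status code instead.
pub extern "C" fn ffi_double(value: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return STATUS_NULL;
    }
    match panic::catch_unwind(|| risky(value)) {
        Ok(doubled) => {
            // SAFETY: `out` was checked for null; the caller promises it
            // points to writable memory for one `i32`.
            unsafe { *out = doubled };
            STATUS_OK
        }
        // The panic payload is dropped here; a real wrapper might report it.
        Err(_) => STATUS_PANIC,
    }
}
```

A wrapper like this is the mirror image of the C++ side described above, where exceptions are caught in a C shim and surfaced to Rust as a `Result`.
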
  369.  
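
For the COM discussion above ("a trait that is actually a struct + vtable"), here is a hand-written sketch of the shape such bindings take, without any proc macro. All names (`CounterVtbl`, `CounterObj`, `CounterHandle`, `Counter`) are hypothetical, and the object is implemented in Rust only so the example is self-contained; in practice the vtable would come from the foreign side.

```rust
use std::ffi::c_void;

/// Table of function pointers with C layout (COM-style vtable).
#[repr(C)]
pub struct CounterVtbl {
    pub add: unsafe extern "C" fn(this: *mut c_void, n: u32) -> u32,
    pub release: unsafe extern "C" fn(this: *mut c_void),
}

/// Object layout: the vtable pointer comes first, then the state.
#[repr(C)]
pub struct CounterObj {
    vtbl: *const CounterVtbl,
    total: u32,
}

unsafe extern "C" fn counter_add(this: *mut c_void, n: u32) -> u32 {
    // SAFETY: the caller passes a pointer to a live `CounterObj`.
    let obj = unsafe { &mut *(this as *mut CounterObj) };
    obj.total += n;
    obj.total
}

unsafe extern "C" fn counter_release(this: *mut c_void) {
    // SAFETY: the object was allocated by `new_counter` via `Box`.
    drop(unsafe { Box::from_raw(this as *mut CounterObj) });
}

static COUNTER_VTBL: CounterVtbl = CounterVtbl {
    add: counter_add,
    release: counter_release,
};

/// Allocate an object and hand out a raw pointer, as a foreign factory would.
pub fn new_counter() -> *mut CounterObj {
    Box::into_raw(Box::new(CounterObj { vtbl: &COUNTER_VTBL, total: 0 }))
}

/// The Rust-facing interface that callers program against.
pub trait Counter {
    fn add(&mut self, n: u32) -> u32;
}

/// Owning handle that dispatches every call through the vtable.
pub struct CounterHandle(*mut CounterObj);

impl CounterHandle {
    /// # Safety
    /// `ptr` must point to a live object with a valid vtable, and ownership
    /// of that object is transferred to the handle.
    pub unsafe fn from_raw(ptr: *mut CounterObj) -> Self {
        CounterHandle(ptr)
    }
}

impl Counter for CounterHandle {
    fn add(&mut self, n: u32) -> u32 {
        // SAFETY: guaranteed by the contract of `from_raw`.
        unsafe { ((*(*self.0).vtbl).add)(self.0 as *mut c_void, n) }
    }
}

impl Drop for CounterHandle {
    fn drop(&mut self) {
        // SAFETY: guaranteed by the contract of `from_raw`.
        unsafe { ((*(*self.0).vtbl).release)(self.0 as *mut c_void) }
    }
}
```

The unsafe code is confined to the handle, so callers only see the safe `Counter` trait.
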
  370. ## use of unsafe
  371.  
  372. - ms: would like to know how to control use of unsafe in codebase
  373. - google: grep
  374. - servo used the compiler directives to disallow unsafe where possible
  375.   - in some cases, allow unsafe within a specific file
  376.   - integrate with review tool to draw attention
  377. - unsafe is really many things: sometimes simple, sometimes not
  378. - C++ code: all unsafe? not reviewed under the same standard?
  379. - more interesting question is unsafe in dependencies
  380. - auditing in crate graph in general is a problem
  381.   - geometric growth of deps
  382. - how do you audit safe code?
  383. - would be great if there were some central place doing auditing (and getting paid to do it)
  384.   - but we'd also need some mechanism to declare what's been audited etc
  385.   - blessed crates and versions
  386.   - let crates.io metadata include auditing
  387. - presumably want to know also things like 2fa, review policy, etc
  388. - attacks these days are very targeted in other ecosystems -- e.g., replacing specific versions of crates to attack specific targets
  389. - number of deps is in the hundreds, ranging from a few hundred to ~800 depending on project
  390.   - in some cases, can pull in a frozen diff and not update
  391.   - but not all
  392. - auditing of the compiler itself?
  393.   - would prefer to have two implementations maybe
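
The "compiler directives" approach mentioned above presumably refers to rustc's built-in `unsafe_code` lint (an assumption; the notes do not name it). A minimal sketch of denying `unsafe` by default and allowing it in one reviewed place, with hypothetical module and function names:

```rust
// Deny `unsafe` everywhere in this module tree by default. At crate level
// this would usually be written as `#![deny(unsafe_code)]` in the crate root.
#[deny(unsafe_code)]
pub mod app {
    /// Safe code compiles as usual under the lint.
    pub fn sum(values: &[u32]) -> u32 {
        values.iter().sum()
    }

    // The one audited exception. `deny` can be overridden like this;
    // `forbid` could not be, which makes `forbid` the stricter choice.
    #[allow(unsafe_code)]
    pub mod raw {
        /// Returns the first element without a bounds check.
        pub fn first_unchecked(values: &[u32]) -> u32 {
            assert!(!values.is_empty());
            // SAFETY: the slice was just checked to be non-empty.
            unsafe { *values.get_unchecked(0) }
        }
    }
}
```

Each `#[allow(unsafe_code)]` then becomes a greppable marker that a review tool can flag, which matches the "integrate with review tool" point.
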
  394.  
  395. ## "governance"
  396.  
  397. - MS: do we know what's going into the compiler?
  398.   - do we know what changes are going in?
  399. - FB: not been a big concern of ours
  400.   - in some cases, had issues where things got stabilized or bug fixes that broke code
  401.   - would like to be canarying the nightly compiler regularly
  402.   - but having more impl's would increase confidence
  403. - ways to support?
  404.   - contracting
  405.   - full time hires
  406.   - how can we give $$ to rust org?
  407.     - need a foundation
  408.   - money/resources for Rust CI
  409. - participating in crater?
  410.   - working on a way to run crater and send back pass/fail
  411. - ecosystem support
  412.   - filling gaps in ecosystem
  413.   - supporting key crates
  414.   - helping to file GSoC proposals?
  415.  
  416. ## will we do this again? how to continue these conversations?
  417.  
  418. - don't need super frequent updates
  419. - most helpful thing is to identify topics and spin off topics
  420. - try to provide feedback for roadmap
  421. - organize a regular meeting on zulip to talk about issues
  422.   - quarterly maybe
  423. - we might want to consider f2f meetings in other conferences or at least in europe
  424.   - maybe rustfest
  425. - key point:
  426.   - don't want to alienate and separate enterprise from the Rust community at large
  427.   - focusing on working groups and zulip for communication is a win