Sep 15th, 2019
  1. # Rust in Large Organizations
  2.  
  3. **Initially taken by Niko Matsakis and lightly edited by Ryan Levick**
  4.  
  5. ## Agenda
  6.  
  7. - Introductions
  8. - Cargo inside large build systems
  9. - FFI
  10. - Foundations and financial support
  11.  
  12. ## Attending
  13.  
  14. - Joe, Microsoft, Seattle Rust Meetup
  15. - Tom at Mozilla, using Rust for sync
  16. - Lena at Mozilla, sync storage etc
  17. - Jack Moffitt at FB, Libra team
  18. - Brian Anderson at PingCAP
  19. - acrichto
  20. - erickt
  21. - dtolnay, David Tolnay
  22. - Raj Vengalil, Azure IoT
  23. - cuviper, Red Hat
  24. - Rain, FB
  25. - Jeremy F, FB
  26. - Manish
  27. - Ben, Google
  28. - Philip, Qumulo, Rust dev tools + infra
  29. - Remi, Qumulo
  30. - Sebastian, MS, pushing for Rust adoption from sec pov
  31. - Thomas Ekerd, MS, site reliability engineer
  32. - James, MS
  33. - Brandon Williams, FB
  34. - JR, Mozilla backend services
  35. - Phil
  36. - Will, crash ingestion mozilla
  37. - Stjepan, Ferrous Systems
  38.  
  39. ## cargo
  40.  
  41. - FB dev env -- backend services repo -- is mostly C++
  42. and Java. Very polyglot environment. Glued together with Buck,
  43. FB's Bazel.
  44. - Buck: Language agnostic. Supports Rust.
  45. - rustc drops in quite nicely, basically equivalent to C++ compiler.
  46. - wanted to use cargo but it just does too much to fit in
  47. - need to delineate the parts of cargo that are desired from those that conflict with Buck
  48. - ecosys is big advantage for Rust but hard to separate from cargo
  49. - current scheme:
  50. - big cargo.toml including all the things used in internal repo
  51. - cargo builds artifacts that are presented to buck
  52. - buck can link against those
  53. - reasonably successful
  54. - but approaching 700 crates in transitive dep graph, getting very cumbersome to rebuild etc
  55. - plus pinned to a specific version of compiler (prebuilt artifacts)
  56. - works ok but build.rs build scripts are a big complication
  57. - specific cargo pain points:
  58. - build scripts
  59. - "features" feature
  60. - a lot of crates don't use features the way they're intended -- they're used for exclusive A or B choices
  61. - this creates the possibility to break the build
  62. - need some sort of "cfg" feature that represents forks of a crate
  63. - Google does a similar thing for Fuchsia
  64. - cargo builds 3rd party artifacts, normal build consumes those
  65. - problems:
  66. - handful of 3rd party artifacts depend on things built in tree
  67. - want to be able to do partial builds, e.g. w/o a feature, or just for some targets
  68. - developing for a new OS, so we compile some code for host, some for target
  69. - presently do 2 full builds, but it's a pain
  70. - don't have as much control over the flags getting passed to rustc as we'd like
  71. - dep flags + linker flags aren't as specific as we need to distribute deps that are needed for indiv targets
  72. - prototype using "cargo raze" (use "gn" (from Chrome) to generate ninja files)
  73. - based on a modification of cargo raze, which generates bazel build files
  74. - has its own handling of build.rs stuff
  75. - rather than outputting build files, it outputs a json format that could be the basis for the proposed "cargo build plans" feature
  76. - would be good to know what inputs etc are needed, how this would fit for Buck
  77. - can Buck consume internal files?
  78. - gn is aware of the concept of a Rust target
  79. - Qumulo build system:
  80. - doesn't use cargo, invokes rustc directly
  81. - cargo just builds json
  82. - build all deps as shared libraries, whether or not they want that
  83. - `.so` libraries, `.rmeta` files
  84. - hits a lot of problems
  85. - ran into problems, notably lack of support for build.rs -- have to reimpl cargo
  86. - building for 2 different targets
  87. - have own platform
  88. - linux target for procedural macros
  89. - need sometimes to pass flags that are target specific, build a target config map
  90. - would prefer to use cargo
  91. - does cargo raze support build.rs?
  92. - has some builtin support for build.rs?
  93. - not automatic: you declare purpose of build.rs
  94. - things that do rustc version detection?
  95. - sometimes you want to (e.g.) disable build.rs that supply native deps which come from bazel
  96. - why can't you run build.rs as part of the build tool?
  97. - fundamental problems:
  98. - no declared inputs, no declared outputs
  99. - buck/bazel etc has to know what files the build script is consuming, producing etc
  100. - also, they are arbitrary execution, which can be a security concern
  101. - proc macros have some similar concerns.
  102. - e.g., pest which looks at cargo source dir env variable and finds your grammar def'n file
  103. - doesn't fit well
  104. - one thing that was discussed years ago:
  105. - capability system for build.rs that restrict what scripts can do
  106. - e.g., read from this directory, write to that one
  107. - cargo can then audit/sandbox to enforce said rules
  108. - run build script in a sandbox
  109. - e.g. crosvm has an impl of this inside of chrome; all crosvm devices run in their own jail
  110. - nontrivial engineering effort
  111. - could do at a higher level, sandbox
  112. - jeremy: build scripts classified into 3 or 4 distinct types, is this complete?
  113. - doing codegen. read a file, bindgen, etc
  114. - gateway to some other library: use pkg-config or something to find the library, or build it from source
  115. - feature detection on rustc
  116. - "scary ones" -- database reads, timestamps
  117. - plausibly could address those use cases in other ways
  118. - feature detection is an obvious one, e.g. we had an rfc for compiler versions
  119. - version compat is a common thing
  120. - what version of rust are people using?
  121. - stable
  122. - "stableish" -- bootstrap
  123. - nightly
  124. - who here is using toolchains distributed by rust?
  125. - ms (partially), mozilla, libra
  126. - why a custom toolchain?
  127. - config.toml tweaks
  128. - use clang's version of some unwinding code
  129. - custom linker
  130. - panic=abort
  131. - custom targets
  132. - compliance reasons (wanting to build from source for security reasons)
  133. - bootstrapping + compliance
  134. - where to get initial rust version?
  135. - several attempts:
  136. - most successful is using mrustc at version 1.22 and building from there
  137. - ms, google did that
  138. - is there a possibility of long term drift?
  139. - builds are not *quite* reproducible at present, but almost
  140. - was a point where build w/ mrustc + build with toolchain had non-matching hashes
  141. - might have to tweak the paths
  142. - in principle it can be done, should maybe prioritize it
  143. - maybe have an approved "how to bootstrap from C" documentation
  144. - specific reason fb builds from source:
  145. - want to always have the option to apply a local patch
  146. - don't want to get stuck with a "we must have this patch yesterday" scenario and have to figure out how to apply patch then
  147. - in most cases, also building llvm, want to share llvm for cross-lang LTO
  148. - must have a newer LLVM than what rust ships with
  149. - some folks have cross-lang LTO working
  150. - but rustc doesn't want to produce bitcode files
  151. - pass the linker `/bin/echo`
  152. - pgo -- coming soon
  153. - fb uses after the fact binary rewriting
  154. - splitting out linker was a potential change to rustc or cargo that google wants
  155. - would be interesting to know "here is what must be passed to gcc to successfully link"
  156. - another option: give a python script as the linker
  157. - turns out servo does it, too
  158. - show of hands survey:
  159. - "who is interested in a common backend for 'those things'"
  160. - nobody knows what that means
  161. - buck needs a "fully specified dep dag", seems like a common thing for other build systems
  162. - seems like we have to do a few cases to work out the general rules first
  163. - rudimentary cargo build plan support:
  164. - gives a dag of rustc executions
  165. - but it's too low level for buck, also bazel
  166. - pressure: every once in a while people propose "rewriting cargo.toml" into the tree
  167. - so far resisted that
  168. - a possible outcome buck has thought of:
  169. - buck support for cargo.toml
  170. - ton of code that's open source, and people (natch) don't want to build it w/ buck out of tree
  171. - want ability to simultaneously maintain buck/cargo support
  172. - currently done by hand and horrible
  173. - internally even people want this for mac/win builds which buck doesn't support
  174. - google w/ gn does something similar, keeps cargo.toml in order to upstream it
  175. - in some cases can generate a cargo.toml file programmatically
  176. - also imp't for IDE support
  177. - IDE support
  178. - RLS kind of working with buck
  179. - knowing laughter :)
  180. - problematic assumptions: e.g., searching the filesystem for cargo.toml, but it's millions of files
  181. - symptom of a larger thing
  182. - cargo is designed for managing rust code
  183. - assumes source tree is mostly rust code
  184. - but often rust is embedded in a large source tree with tons of non-rust
  185. - so having some "root for all rust code" where you search below is problematic
  186. - top-level directory not gonna work
  187. - always having to create artificial "root" directories
  188. - rust-analyzer avoids this by not baking cargo in as deeply
  189. - but still has this "top level directory" model that contains all the rust code which means a small amount of rust amongst everything else
  190. - generating a cargo.toml for 1 project works well, but when you have multiple targets that interact
  191. - Qumulo has a ton of C and Rust code that must be all combined into one big final artifact
  192. - IDE support that avoids cargo is a must
  193. - current state of the art: ctags
  194. - cramertj: cargo.toml is basically the intermediate repr for specifying deps
  195. - are there other things one might want?
  196. - build system has its own custom language to do that description
  197. - can use that to generate cargo.toml files though for IDE etc
  198. - what changes might one want in a "non-cargo IDE language"?
  199. - maybe cargo would work fine
  200. - manish: does this also cause problems for clippy and rustfmt?
  201. - cargo.toml is also useful for this
  202. - who uses clippy? most folks
  203. - rustfmt? most folks
  204. - fb invokes it on individual files for that
  205. - libra uses cargo to build
  206. - "cacheability" (sccache) has gotten worse over time
  207. - procedural macros aren't getting cached (dylibs)
  208. - are other people doing anything with this?
  209. - ff has a distributed cache in the office
  210. - (buck does caching of everything)
  211. - native deps? also integrated into buck
  212. - assume that if a C dep changes, rust must be rebuilt?
  213. - `-lnative` is not very well-scoped (just to a directory, not specific libs)
  214. - problem: can't cache link steps as a result
  215. - maybe also part of the problem with sccache
  216. - in buck, each lib gets its own directory, sidestepping this problem
  217. - linker want:
  218. - ability to specify a specific mapping from link name to the native library
  219. - option to ignore link directories or transform
  220. - in buck case, if you have a dep on a native library, you get two options (`-lfoo` and full path to foo)
  221. - crate features, misuse thereof:
  222. - people seem to want option to have mutually exclusive features
  223. - want to have impls clone etc for testing but not in a release build
  224. - hacked up something using cargo features but doesn't work all the time
  225. - problems:
  226. - dev dependency `foo` with feature "testing"
  227. - sometimes testing gets turned on semi-randomly (???)
  228. - but you can also accidentally use "testing" in a normal tree
  229. - deps for build scripts leak through to the real graph, perhaps part of the "semi-random" behavior
  230. - designing from the wrong direction, perhaps?
  231. - a lot of requirements coming up that are "above and beyond" existing cargo spec and design
  232. - contra: goal is to have cargo co-exist with buck/bazel/etc, these are the features needed for that?
  233. - do we want to build another tool that is not cargo?
  234. - but everybody already has a tool and wants to use it
  235. - but how can we do minimal work so that integration of cargo + these other tools is smoother
  236. - working with rest of rust ecosys
  237. - de facto standard that crates.io + cargo have created
  238. - defined entirely by impl of cargo
  239. - only access at present is through cargo's impl
  240. - refactoring cargo into indep chunks with better interfaces might be the sol'n (and has been discussed)
  241. - cargo build plans, but they're not there yet
  242. - key thing: version resolution, very much in cargo's domain, would be good to specify
  243. - external dependencies + FFI?
  244. - can we use FFI to talk to rust?
  245. - want module boundary between rust things, using ffi
  246. - today: build scripts in cargo exist, common thing is to build+link to native libraries
  247. - one of the things that cargo raze does: you can describe the purpose of a build.rs (e.g., primarily to produce that 3rd party lib)
  248. - but you can translate that to a dep for that native library in your build system
  249. - summarize + action items?
  250. - cramertj wants to know what
  251. - dtolnay is working on potential design ideas for a successor to build.rs
  252. - cargo metadata description to specify what it is doing, maybe replace build.rs?
  253. - just listing inputs would be a huge improvement
  254. - yes but we want something that's *easier* than build.rs today, to incentivize it
  255. - caching, can we improve it
  256. - some of it may be low-hanging fruit, e.g. on mac `.a` file has timestamps
  257. - but part of it is the growing popularity of procedural macros (`.so` are uncachable by sccache)
  258. - if linker were more predictable, sccache could handle it, but it's not
  259. - might be able to handle by separating out linking
  260. - how to translate cargo.toml etc?
  261. - buck today runs cargo, takes output with dep info + rlib files
  262. - but new tool goal is to determine from cargo metadata
  263. - no way of "definitively connecting" resolved deps with unresolved deps
  264. - cargo vendor tends to be a bit overaggressive
  265. - lots of things people want, seems to vary between groups
  266. - when developing procedural macros, could do better job of noticing token stream output hasn't changed..
  267. - incremental
  268. - sccache sometimes handles that well (e.g. w/ build.rs)
  269. - related topic: distributed builds
  270. - sccache has support for that
  271. - but maybe sends whole dep folder, not always ok
  272. - would need more precise dep information to handle that (passing precise info for *transitive* dependencies)
  273. - `--extern` is precise, but transitive deps are still figured out by rustc
  274. - related: would be nice if, for rustc, could pass all the sources explicitly
  275. - in buck do you list all sources?
  276. - yes but a lot of globs :)
  277. - would be nice to have a tool that handled all the easy cases, with room for "extra" cases here and there
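
The "no declared inputs, no declared outputs" complaint about build.rs can be made concrete with a small sketch. Cargo already lets a build script print `cargo:rerun-if-changed=` and `cargo:rerun-if-env-changed=` directives on stdout; these only tell Cargo when to rerun the script and are not the full input/output declaration that Buck or Bazel would need. The helper name `input_directives`, the file `src/grammar.pest`, and the variable `TARGET_SYSROOT` are assumptions made up for illustration.

```rust
// build.rs sketch: list the script's inputs explicitly instead of letting it
// read arbitrary files. Paths and env var names below are hypothetical.

/// Hypothetical helper: turn declared input files and environment variables
/// into the `cargo:rerun-if-*` directives Cargo reads from stdout.
fn input_directives(files: &[&str], env_vars: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    for f in files {
        out.push(format!("cargo:rerun-if-changed={}", f));
    }
    for v in env_vars {
        out.push(format!("cargo:rerun-if-env-changed={}", v));
    }
    out
}

fn main() {
    // Assumed inputs for this sketch only.
    let files = ["build.rs", "src/grammar.pest"];
    let env_vars = ["TARGET_SYSROOT"];
    for line in input_directives(&files, &env_vars) {
        println!("{}", line);
    }
}
```

An external build tool could in principle scrape the same list, but nothing forces a script to emit it, which is the gap the notes describe.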
  278.  
  279. - alex: interested in solving a lot of these issues and have thoughts
  280. - open to talking later about this stuff
  281. - a lot of small details, bug fixes, etc -- long road, no silver bullet
  282. - some kind of "enterprise cargo" place to hold this discussion(s)
  283. - a lot of needs boil down to:
  284. - quick fix combined with longer re-architecture
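
On the feature misuse raised earlier in this section: Cargo features are meant to be additive and are unified across the dependency graph, so an "A or B" choice encoded as two features can end up with both enabled. One common stopgap is to fail the build early with `compile_error!`. This is a minimal sketch; the feature names `backend-a` and `backend-b` and the function `backend` are hypothetical.

```rust
// Reject the (hypothetical) feature combination at compile time rather than
// letting two exclusive code paths collide later in the build.
#[cfg(all(feature = "backend-a", feature = "backend-b"))]
compile_error!("features `backend-a` and `backend-b` are mutually exclusive");

#[cfg(feature = "backend-a")]
fn backend() -> &'static str {
    "a"
}

#[cfg(all(feature = "backend-b", not(feature = "backend-a")))]
fn backend() -> &'static str {
    "b"
}

// Neither feature enabled: fall back to a default implementation.
#[cfg(not(any(feature = "backend-a", feature = "backend-b")))]
fn backend() -> &'static str {
    "default"
}

fn main() {
    println!("backend: {}", backend());
}
```

This only turns a confusing failure into a clear one. It does not stop two crates in the graph from requesting conflicting features, which is why the notes ask for a separate "cfg"-style mechanism.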
  285.  
  286. ## FFI
  287.  
  288. - two distinct languages invoking one another
  289. - sometimes linked into one process, sometimes cross process (RPC)
  290. - COM requires symbols to be ABI compatible
  291. - inline assembly, direct syscalls
  292. - "C parity"
  293. - FFI with C and C++
  294. - FB is doing C++ interop, as is Google
  295. - FFI beyond C or C++?
  296. - Java
  297. - syscalls
  298. - C# perhaps
  299. - (Ruby, Python)
  300. - Bindings to other languages are often mediated through a C layer
  301. - Increasing number of users -- C and C++ wanting to consume Rust APIs
  302. - Concerns:
  303. - unwinding
  304. - Qumulo: basically spent most of the last year preparing to do bidir FFI between Rust and C
  305. - fairly large codebase in a dialect of C
  306. - rules you can impose on C side which helps sometimes
  307. - in one direction (Rust calling C) we have been able to use bindgen
  308. - but in the other direction (C calling Rust) we wrote a compiler plugin (uh oh) to generate C headers
  309. - Specification questions
  310. - concerned about cross-lang lto revealing a lot of interactions
  311. - Cross-lang thin lto
  312. - Dynamic testing and static testing
  313. - Have aliasing rules proven to be a problem?
  314. - FB: not so much. Mostly mediating rules through bindgen and trying to set things up to get compilation failures
  315. - Google: currently checking for changes
  316. - Google: pursuing ways to annotate C and C++ headers so that safe rust signatures can be generated from them
  317. - might be an interesting thing to standardize on
  318. - bindgen has a cumbersome mechanism for that (do)
  319. - would be nice to include small shim layers e.g. to translate to `Result`
  320. - FB:
  321. - C++ codebase in FB uses exceptions; have wrappers that capture and convert exceptions, which become a `Result` on the Rust side
  322. - manually annotating noexcept functions? basically all of them can
  323. - C headers are manually created with a `try { } catch` block in C++
  324. - the code being interop'd is mostly C++ but have to manually write C APIs for it
  325. - build with panic=abort? no, unwind
  326. - also catching Rust exceptions at boundary?
  327. - C code doesn't call into Rust code that often
  328. - happy to make it abort though
  329. - but mozilla wants to handle panics, though it does it by translating it into a swift/java exception
  330. - usually the purpose is wanting to capture the call stack and report it
  331. - in theory could panic=abort if could capture java stack
  332. - FB sets a custom panic handler to report errors, then exits (could use panic=abort)
  333. - For COM FFI case? how handling virtual dispatch
  334. - manual adaptation with vtables and things
  335. - on Rust side, does that "look like" a trait?
  336. - active area of investigation
  337. - believe that (with proc macro support) can expose a trait that is actually a struct + vtable
  338. - similar to what GNOME projects are doing for glib bindings
  339. - mozilla does it for XPCOM, which is basically same thing
  340. - various bits of existing crates, but it's mostly nasty
  341. - Jeremy: one thing I've been thinking about:
  342. - standard set of library functions corresponding to C++ types
  343. - e.g. some way to use std-string from within rust code
  344. - good to have for templated types (unique-ptr, shared-ptr, and so on)
  345. - all types that can be directly used from Rust in some way
  346. - quite clunky today to have a C++ function that returns something Rust can use
  347. - on C side, it'd use the plain C++ types
  348. - but on Rust side, it'd invoke and do the right things
  349. - one of the pieces needed for C++ interop
  350. - instantiate the vec/string/other impls
  351. - should this be part of bindgen?
  352. - missing part: manually instantiating separate things for each specialization
  353. - major topics of FFI
  354. - being able to "use header files" and get a "reasonably safe" FFI in Rust
  355. - what are building blocks we'd need to move things to user space?
  356. - template instantiation list is one building block -- somebody has to write the tool, nothing needed from rustc
  357. - expectation is that there is always some work to manually bind
  358. - but what is minimal work we can do to make it easy to translate
  359. - annotations might be company specific -- fb vs google?
  360. - maybe? but can we collaborate?
  361. - different C++ dialects and patterns in use
  362. - what about from other languages, esp. around C++?
  363. - closest inspiration might come from Swift
  364. - rich bindings from Rust to C++ for hashmaps etc
  365. - because FB uses thrift for RPC mechanism (and sometimes FFI)
  366. - would be useful to be able to do tricks like that for hashmap and sets perhaps
  367. - some kind of tool for consuming a C++ header file to automatically produce an interface in Rust
  368. - complication in some environments: multiple allocators
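
The discussion above about catching Rust panics at the boundary, and about mapping errors across the FFI, can be sketched with `std::panic::catch_unwind`. This is a minimal sketch and not any attendee's actual wrapper: the function names and the status codes (0, 1, 2) are assumptions for illustration, and it relies on the default `panic=unwind` setting.

```rust
use std::panic;

/// Hypothetical Rust function we want to expose to C callers.
fn parse_positive(input: i32) -> Result<i32, String> {
    if input < 0 {
        // Stands in for an unexpected bug that panics.
        panic!("negative input");
    }
    if input == 0 {
        return Err("zero is not allowed".to_string());
    }
    Ok(input * 2)
}

/// C-ABI wrapper. Assumed status codes: 0 = ok, 1 = error, 2 = panic caught.
/// A real export would also carry a `no_mangle` attribute; it is left out
/// here to keep the sketch independent of the Rust edition.
pub extern "C" fn parse_positive_ffi(input: i32, out: *mut i32) -> i32 {
    // Stop the panic here so it never unwinds into C frames.
    let result = panic::catch_unwind(|| parse_positive(input));
    match result {
        Ok(Ok(value)) => {
            if out.is_null() {
                return 1;
            }
            // SAFETY: `out` is non-null; the caller promises it is valid.
            unsafe { *out = value };
            0
        }
        Ok(Err(_)) => 1,
        Err(_) => 2,
    }
}

fn main() {
    let mut out = 0;
    let status = parse_positive_ffi(21, &mut out);
    println!("status={} out={}", status, out);
}
```

With `panic=abort` the `catch_unwind` call never observes a panic, which matches the point in the notes that a custom panic handler that reports and exits is an alternative.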
  369.  
  370. ## use of unsafe
  371.  
  372. - ms: would like to know how to control use of unsafe in codebase
  373. - google: grep
  374. - servo used the compiler directives to disallow unsafe where possible
  375. - in some cases, allow unsafe within a specific file
  376. - integrate with review tool to draw attention
  377. - unsafe is really many things: sometimes simple, sometimes not
  378. - C++ code: all unsafe? not reviewed under the same standard?
  379. - more interesting question is unsafe in dependencies
  380. - auditing in crate graph in general is a problem
  381. - geometric growth of deps
  382. - how do you audit safe code?
  383. - would be great if there were some central place doing auditing (and getting paid to do it)
  384. - but we'd also need some mechanism to declare what's been audited etc
  385. - blessed crates and versions
  386. - let crates.io metadata include auditing
  387. - presumably want to know also things like 2fa, review policy, etc
  388. - attacks these days are very targeted in other ecosystems -- e.g., replacing specific versions of crates to attack specific targets
  389. - number of deps are in the hundreds, ranging from a few hundred to ~800 depending on project
  390. - in some cases, can pull in a frozen diff and not update
  391. - but not all
  392. - auditing of the compiler itself?
  393. - would prefer to have two implementations maybe
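
The "compiler directives" approach mentioned above can be sketched with the built-in `unsafe_code` lint: deny it broadly, then allow it in the one place that needs it so review can focus there. Module and function names are hypothetical. In a real crate the deny would usually be `#![deny(unsafe_code)]` at the crate root; an outer attribute on a module is used here only to keep the sketch in one file.

```rust
// `deny` can be overridden further down with `allow`; `forbid` could not be.
#[deny(unsafe_code)]
mod app {
    /// The single (hypothetical) module where unsafe is permitted.
    #[allow(unsafe_code)]
    pub mod raw {
        pub fn read_first(bytes: &[u8]) -> Option<u8> {
            if bytes.is_empty() {
                return None;
            }
            // SAFETY: the slice is non-empty, so its pointer is valid to read.
            Some(unsafe { *bytes.as_ptr() })
        }
    }

    /// Safe code: adding an `unsafe` block here would fail to compile.
    pub fn first_or_zero(bytes: &[u8]) -> u8 {
        raw::read_first(bytes).unwrap_or(0)
    }
}

fn main() {
    println!("{}", app::first_or_zero(b"hi"));
}
```

This covers only first-party code. Unsafe in dependencies, which the notes call the more interesting question, is not affected by a crate's own lint settings.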
  394.  
  395. ## "governance"
  396.  
  397. - MS: do we know what's going into the compiler?
  398. - do we know what changes are going in?
  399. - FB: not been a big concern of ours
  400. - in some cases, had issues where things got stabilized or bug fixes that broke code
  401. - would like to be canarying the nightly compiler regularly
  402. - but having more impl's would increase confidence
  403. - ways to support?
  404. - contracting
  405. - full time hires
  406. - how can we give $$ to rust org?
  407. - need a foundation
  408. - money/resources for Rust CI
  409. - participating in crater?
  410. - working on a way to run crater and send back pass/fail
  411. - ecosystem support
  412. - filling gaps in ecosystem
  413. - supporting key crates
  414. - helping to file GSoC proposals?
  415.  
  416. ## will we do this again? how to continue these conversations?
  417.  
  418. - don't need super frequent updates
  419. - most helpful thing is to identify topics and spin off topics
  420. - try to provide feedback for roadmap
  421. - organize a regular meeting on zulip to talk about issues
  422. - quarterly maybe
  423. - we might want to consider f2f meetings in other conferences or at least in europe
  424. - maybe rustfest
  425. - key point:
  426. - don't want to alienate and separate enterprise from the Rust community at large
  427. - focusing on working groups and zulip for communication is a win