Here are some of my notes from the Programming Rust O’Reilly book. I’m posting them here and in a gist on github so they’re searchable to me and anyone else who needs them.
Chapter 1
(omitted for brevity)
Chapter 2: Tour of Rust
Typical for function’s return value to “fall off the end of the function”
Tests marked with #[test] attribute
Trait is a collection of methods that types can implement
Iterators common in rust
Rust doesn’t have exceptions; handle using Result or panic.
By convention, a module named prelude means its exports are intended to be imported together and provide what you typically need.
_ tells Rust that a variable is intentionally unused, so it doesn’t complain.
r#"…"# is Rust’s “raw string” syntax. can use any number of hash marks, as long as they match on both ends.
cargo run does everything: fetches crates, compiles, builds, links, and starts
if your program compiles, it’s free of data races
Result and Option are the two maybe-ish data types that rust mostly uses
/// to start a documentation comment
common to init a struct’s fields with variables of the same name. similar to typescript.
“[fallible functions in Rust should return a Result, which is Ok(x) on success, or Err(e) on failure]”
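? operator checks a Result: on Ok it unwraps the value, on Err it returns the error to the caller (it doesn’t panic). only works in functions that return Result, so main has to be declared to return a Result to use it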
|thing| { … } is a Rust closure expression - value that can be called as if it were a function
use the move keyword in front of closures to let the closure take ownership of the variables it uses
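A small sketch of the Result and ? notes above. add_strs is a made-up helper for illustration, not something from the book. The point is that ? hands the error back to the caller instead of panicking.

```rust
use std::num::ParseIntError;

/// Parses two decimal strings and adds them (hypothetical helper).
/// `?` returns early with the Err if either parse fails.
fn add_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x: i32 = a.trim().parse()?;
    let y: i32 = b.trim().parse()?;
    Ok(x + y) // last expression "falls off the end" as the return value
}

fn main() {
    match add_strs("40", "2") {
        Ok(sum) => println!("sum = {}", sum),
        Err(e) => println!("bad input: {}", e),
    }
}
```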
Chapter 3: Basic Types
Rust pushes everything it can to ahead-of-time compilation for safety
Can use generics, and can infer a lot of things for you so you don’t have to spell everything out
can use underscores in long digits for legibility: 1_000_000
can convert integer types between each other using the as operator
method calls have higher precedence than unary prefix operators
“rust performs no numeric conversions implicitly”
tuple is a fixed number of values of assorted types - like python
tuples accessed via index: t.0 or t.8 etc
rust often uses tuples when returning multiple values from a function
zero tuple is () and called the “unit type”
three pointer types: reference, boxes, and unsafe pointers
“reference is a pointer to any value anywhere”
references are never null
&T is an immutable reference
&mut T is a mutable reference
boxes allocate a new value on the heap
when boxes go out of scope the memory is freed immediately, unless they get moved
raw pointers are unsafe, and they might be null so be careful
use raw pointers only in unsafe block where safety is up to you
rust has three types for representing sequences of values in memory: arrays, vectors, and slices
array: constant size determined at compile time; can’t append or shrink
vector: dynamically allocated, growable
slice: series of elements that are a part of some other value, like an array or vector
slices are considered shared or mutable: not both.
rust has no notation for an uninitialized array.
create vectors with the vec! macro
vectors are made of: pointer to heap allocated buffer, capacity, and length
slices are a region of an array or vector, made of a fat pointer (two word value: pointer to first element, number of elements)
strings are stored as UTF-8: Vec<u8>
&str is really just a ref to some utf-8 text owned by another thing
String is different from &str. String is basically a Vec<u8> that is guaranteed to hold valid UTF-8
when String goes out of scope the buffer is freed, unless it was moved and is still owned, which brings us to….
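A quick sketch of arrays, vectors, slices, and String vs &str from the notes above. The sum helper and the sample values are made up for illustration.

```rust
/// Sums any contiguous run of i32s: arrays and vectors both borrow as a slice.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let arr: [i32; 4] = [1, 2, 3, 4]; // fixed size, known at compile time
    let mut v: Vec<i32> = vec![10, 20]; // heap-allocated, growable
    v.push(30);

    println!("{}", sum(&arr)); // whole array as a slice
    println!("{}", sum(&v[1..])); // part of the vector as a slice

    let owned: String = String::from("héllo"); // owns its UTF-8 buffer
    let borrowed: &str = &owned[0..1]; // borrows part of that buffer
    println!("{} bytes, {} chars, {}", owned.len(), owned.chars().count(), borrowed);

    let big = 1_000_000_i64;
    let t = ("pair", big as i32); // explicit conversion with `as`
    println!("{} {}", t.0, t.1);
}
```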
Chapter 4: Ownership
every single value has an owner that determines lifetime
no garbage collection, or just throwing values around
when value is freed it is “dropped” -> allows us to look at code to see lifetime instead of guessing or inspecting comments, etc.
“[variables own their values, structs own their fields, tuples, arrays and vectors own their elements]”
basically trees: value’s owner is parent, values owned are children
so when dropping values, Rust is removing it from tree to free it
you can “move” values from one owner to another
no multiple owners, with Rc and Arc being exceptions
you can “borrow” references though; references are non-owning pointers with limited lifetimes
in Rust, most of the time assigning, passing, and returning don’t copy the value, they move it
compiler will complain about moving it more than once, unless it was moved back (returned) in between
“[price you pay is that you have to explicitly ask for copies if you need/want them]”
if you have a mutable variable however, rust drops prior value on reassignment
in general: “[passing arguments to functions moves ownership to the functions parameters; returning a value from a function moves ownership to the caller]”
same for building tuples, etc.
things to remember about moves: 1) moves apply to value proper, NOT heap storage, 2) Rust’s compiler sees through moves, so they’re pretty efficient
in general: if you move something into a function, move it back if you want it back
moving index content gets rejected: vectors and arrays for example. what you usually want is a reference
ownership applies to scope too; for loops for example take ownership
“[if you need to move value out of owners that compilers can’t track, you should probably change the owners type to something that can dynamically track whether it has a value or not]”
while most types are moved, some values are Copy types; source of assignment stays the same
passing Copy types to functions and constructors behaves similarly: source stays the same
“Only types for which a simple bit-for-bit copy suffices can be Copy”
in general: any type that needs something special to be done when the value is dropped can’t be Copy.
you can derive from Copy if all the fields of a struct are Copy
Copy types are more flexible, but they’re also strict about which types they can contain: only Copy-able
important! -> if you use Copy, but need to change it later, it’ll be difficult to re-write the code to do non-copy stuff.
for shared ownership there is: Rc: reference counted, and Arc for atomic reference counted. both let you share ownership; only Arc is safe to share across threads.
cloning an Rc doesn’t copy the value, it just creates a new pointer to it and bumps the reference count
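A sketch of moves, Copy, and Rc as described above. push_and_return and Point are made-up names for illustration.

```rust
use std::rc::Rc;

/// Takes ownership of the vector and hands it back, so the caller can keep using it.
fn push_and_return(mut names: Vec<String>, name: &str) -> Vec<String> {
    names.push(name.to_string());
    names
}

// All fields are Copy, so the struct can derive Copy too.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let names = vec!["ann".to_string()];
    let names = push_and_return(names, "bob"); // moved in, moved back out
    let copy = names.clone(); // explicit deep copy
    println!("{:?} {:?}", names, copy);

    let p = Point { x: 1, y: 2 };
    let q = p; // bit-for-bit copy; p is still usable
    println!("{:?} {} {}", p, q.x, q.y);

    let shared = Rc::new(String::from("shared"));
    let other = Rc::clone(&shared); // new pointer to the same heap value
    println!("{} owners of {}", Rc::strong_count(&shared), other);
}
```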
Chapter 5: References
references are non-owning pointers that have no effect on their referents’ lifetimes
a reference to a value is borrowing the value, and you must eventually return it to its owner
shared reference: read but not modify. eg: &T, think: multiple readers at compile time
mutable reference: both read and modify. eg: &mut T, think: single writer at compile time
moving a value into a function = pass by value
passing a reference to a function = pass by reference
assigning to a reference makes it point at a new value
“[the . follows as many references as it takes to find its target]”
references are never null (no NPE!)
another kind of fat pointer is a trait object: a reference to a value that implements a certain trait
“[Rust tries to assign each reference type in your program a lifetime that meets the constraints imposed by how it is used]”
basically; a variable’s lifetime must contain that of the reference borrowed from it. so start by understanding the constraints from references, then find lifetimes that satisfy those constraints
static = rust’s global variable, created when the program starts, lasts until termination
lifetime parameters = specify lifetimes in generics and traits using fn f<'a>(p: &'a i32) {...} which lets Rust track lifetimes explicitly. only need to do in definitions, not in usage.
A function’s signature always exposes the body’s behavior
Lifetimes in functions signatures are so Rust’s compiler can ensure safety
when a reference type appears inside another type’s definition, you need to write out the lifetime
“[for a lifetime, a shared reference makes its referent read-only; you can’t assign the referent or move its value]”
a mutable reference borrow, and a shared reference borrow cannot have overlapping lifetimes (can’t borrow a mutable reference to a read-only value)
shared access is read-only access, mutable access is exclusive access
all of this ensures that a concurrent Rust program is free of data races by construction (at compile time)
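A sketch of lifetime parameters and the shared vs. mutable borrow rule from the notes above. longest and Excerpt are illustrative names, not taken from the book.

```rust
/// Returns the longer of two string slices.
/// 'a ties the result's lifetime to both inputs.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

/// A struct holding a reference must name the reference's lifetime.
struct Excerpt<'t> {
    text: &'t str,
}

fn main() {
    let owner = String::from("borrowed text");
    let e = Excerpt { text: &owner[..8] };
    println!("{}", longest(e.text, "short"));

    let mut count = 0;
    {
        let r = &mut count; // exclusive borrow: no other access while r is live
        *r += 1;
    }
    let (a, b) = (&count, &count); // any number of shared borrows
    println!("{} {}", a, b);
}
```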
Chapter 6: Expressions
rust is an expression language: for example, if and match can produce values
blocks can also produce values, and can be an explicit way of marking lifetimes inside another function or block
variables can be redeclared, but probably don’t do it.
item declarations: any declaration that could appear globally, like fn
match expressions are like switches but better because you can do patterns, and they return things
in match a value is checked against each pattern in order; must match one pattern
four types of loops: while, while let, loop, for
for loops use an iterable Range, eg: 0..9
can use break but it only works in loops
can use continue to advance to the next iteration of loop
can label loops with a lifetime-style name (eg: 'outer:) so break and continue can target an outer loop
expressions that don’t finish have special return type ! but it’s rarely used - divergent functions
“[ the . operator automatically dereferences or borrows as many references as needed ]”
static methods: Vec::new or if it’s typed, use “turbo fish”: Vec::<Thing>::new
fields accessed via dot name, elements in tuples accessed via index
lvalues = access array or slice by index like thing[i]
using range .. operator allows open on either end: 0..4 or ..4 or 0.. or ..
* operator is used to access the value pointed to by a reference
dividing an integer by zero triggers a panic. you can do it safely by other means, like checked_div
assignment not as common in rust since stuff is immutable
if a value has a non-Copy type, then assignment moves it to the destination.
Rust doesn’t have increment and decrement operators. Good.
numbers may be cast to/from any of the built-in numeric types for the most part
closures are lightweight function like values, sort of like anonymous functions
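A sketch of a few of the expression notes above: match as a value, a labeled break, checked_div, and turbofish. describe and find_pair are made-up helpers.

```rust
/// Classifies a number; `match` is an expression, so its value is returned.
fn describe(n: i32) -> &'static str {
    match n {
        0 => "zero",
        x if x < 0 => "negative",
        1..=9 => "small",
        _ => "large",
    }
}

/// Finds the first pair (i, j) with i * j == target, using a labeled break.
fn find_pair(target: u32) -> Option<(u32, u32)> {
    let mut found = None;
    'outer: for i in 1..10 {
        for j in i..10 {
            if i * j == target {
                found = Some((i, j));
                break 'outer; // leaves both loops at once
            }
        }
    }
    found
}

fn main() {
    let label = if describe(7) == "small" { "ok" } else { "unexpected" };
    println!("{}", label);
    println!("{:?}", find_pair(12));
    println!("{:?}", 10_i32.checked_div(0)); // None instead of a panic
    let v = Vec::<u8>::with_capacity(4); // turbofish names the element type
    println!("{}", v.capacity() >= 4);
}
```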
Chapter 7: Error Handling
most ordinary errors are handled by Result which is either Ok(T) or Err(E)
panics are another type of error: for things that should never happen, or at least represent something so bad that your program should just be done
panics can be handled though. by unwinding: dropping values in the reverse order of creation, all the way up the stack, finally exiting the thread
it’s not like panics are undefined. panics are defined behavior, they just straight up shouldn’t happen. but they are safe.
panics happen per thread
if, in the process of an unwind, rust panics on a drop, the whole unwinding process is aborted; you messed up while trying to clean up your mess, and you’re done, go home.
Result<T, E> is best understood with matches, but has easier ways of defaulting, getting values and references, panicking, and so on
errors are printable: they have a description and cause (if cause is provided)
Rust has ? operator to get a value from a Result, and propagate the error up the stack if any occurs
can’t use ? in main unless main is declared to return a Result
can use .unwrap() to deal with errors that “can’t happen” :)
“[.unwrap() is for a condition so severe or bizarre that you don’t know what to do.]”
use let _ = ... to silence the unused-Result warning from the compiler
on the whole, Rust forces you to make decisions about what to do with errors, rather than just throwing them. since they’re a part of the return type with Result you have to do something with them
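A sketch of a printable error type, ? propagation, and defaulting with unwrap_or. ConfigError and parse_port are hypothetical names invented for this example.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
enum ConfigError {
    Missing(&'static str),
    Invalid(String),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Missing(key) => write!(f, "missing key: {}", key),
            ConfigError::Invalid(raw) => write!(f, "invalid value: {}", raw),
        }
    }
}

impl std::error::Error for ConfigError {}

/// Parses an optional port setting (hypothetical config value).
fn parse_port(raw: Option<&str>) -> Result<u16, ConfigError> {
    let raw = raw.ok_or(ConfigError::Missing("port"))?;
    raw.parse::<u16>().map_err(|_| ConfigError::Invalid(raw.to_string()))
}

fn main() -> Result<(), ConfigError> {
    let port = parse_port(Some("8080"))?; // propagate on failure
    let fallback = parse_port(None).unwrap_or(80); // default instead of failing
    println!("{} {}", port, fallback);
    Ok(())
}
```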
Chapter 8: Crates and Modules
“[A crate is a Rust project: all source code for a single library or executable, plus any associated tests, examples, tools, configs]”
extern crate xyz is like an import statement, but for crates that aren’t a part of this project
Cargo.toml is the project file: cargo command runs on this, basically doing anything you need to do with a crate
crates are bin or lib
cargo build builds, cargo build --release builds for release, cargo test tests. those are probably the only ones you need
modules are namespaces; containers for functions, constants, and so on. used for organization within a project. can nest them
mark things as pub or not. anything not marked pub is private
files are modules with their file name. sub-directories are modules of their dir name
use mod.rs to control sub-directory module
when you build a rust crate you are compiling all its modules
:: operator accesses items inside a module
use thing::whatever to bring an item from a module into scope
each module starts blank, need to import everything it uses
super is alias to parent module, self is for this module
items: building blocks of rust. are 1) functions, 2) types, 3) type aliases, 4) impl blocks, 5) constants, 6) modules, 7) imports
attributes: Rust’s catch-all syntax for writing miscellaneous instructions for the compiler. sort of like decorators or annotations in other languages.
eg: conditional compilation with #[cfg]
can use #! to start attribute to tell Rust to attach it to enclosing item, like module. usually used at the beginning of a file
tests use attributes like #[test] to tell cargo which functions are tests so they’re conditionally compiled
tests usually use macros like assert_eq!(expected, actual)
“[by convention keep tests inside same file they’re testing, until they get big, then split them into tests.rs and mark the whole file with #[cfg(test)] so they don’t get compiled]”
integration tests usually live in a tests directory alongside the /src directory of your project: for testing surface area of your code, as a user would
cargo doc creates html documentation from the pub features of your code with any doc comments
can use doc comments like /// or you can put them as attributes with #[doc = "comment here"]
can use markdown for comments, and they’ll get translated to html
can also put in fenced code blocks
“doc-tests” are the code blocks in doc comments; cargo test compiles and runs them to ensure that they work
goal is to get you to write the best possible documentation
can mark code snippets as no_run to compile them without running them, or ignore if they’re not complete
dependencies in Cargo.toml can use the crates.io name, a path to a local crate, or a git link to something like GitHub
uses semantic versioning
Cargo.lock keeps versions the same across installation so deps change only when you want them to
cargo update updates versions for crates
can also do workspaces, which is just a collection of crates, each with their own Cargo.toml
Chapter 9: Structs
three: named-field, tuple-like, unit like
named-field structs: camel case names, snake case fields. can do shorthand {x, y} to populate fields on construction
fields on structs are private by default
creating a struct value requires all fields to be initialized
can use something like a “spread” syntax, { x: 0, ..other }
tuple-like structs: camel case name, fields are accessed by index
tuple elements private by default, but can be marked public
“[good for new-types; structs with a single thing that you use to get stricter type checking]”
eg: struct Ascii(Vec<u8>)
unit-like structs: struct type with no elements. basically just struct MyThing;
methods on a struct appear in a different block: impl
impl block is just a collection of fn definitions that become methods on the struct type
if they need it, methods are passed self as the first argument, or &self, or &mut self if needed. use only what you need.
calling a method borrows the struct implicitly (shared or mutable, whatever the method’s self asks for)
conventional to have a constructor called new
if you want you can attach your own methods to other types
methods are separated to: 1) make it easy to find data members, 2) allow single syntax for all struct types, 3) allow implementation of traits
rust has generics :)
similar to other generic languages: use <T> for type parameters
can use Self instead of always spelling out the type
for static method calls, use turbo-fish ::<> to spell out the type parameters
if a struct type contains references you must name the references’ lifetimes
eg:
struct Position<'p> {
    x: &'p i32,
    y: &'p i32,
}
which means “for lifetime 'p, you can have a Position that holds references with that lifetime”
this lets rust do lifetime constraint checking
traits are implemented in an impl block, or defined by themselves
think of it as an easier way to do Java abstract classes
use #[derive(Clone)] to derive traits without having to write them
for most derivable traits: as long as all members of a struct impl trait X, you can derive trait X
interior mutability is when a struct is immutable, but it has fields that need to be mutable
interior mutability is usually done with Cell<T> and RefCell<T>
Cell<T>: struct that contains a single value of T. can set and get T even if you only have a shared (non-mut) reference to the Cell itself
Cell<T> doesn’t let you call mut methods on the shared value
RefCell<T> is like Cell but you can borrow T
RefCell does runtime checks, but not compile time checks, so if you break rules it panics
neither is thread safe
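A sketch pulling together the struct notes above: a newtype, shorthand init, the .. update syntax, and interior mutability with Cell and RefCell. Meters, Point, and Cache are invented for illustration.

```rust
use std::cell::{Cell, RefCell};

/// Newtype: a tuple-like struct wrapping one value for stricter type checking.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    /// Conventional constructor, using field init shorthand.
    fn new(x: i32, y: i32) -> Self {
        Point { x, y }
    }

    /// Builds a new Point, taking every field not listed from a copy of self.
    fn shifted(&self, dx: i32) -> Self {
        Point { x: self.x + dx, ..self.clone() }
    }
}

/// Tracks lookups even through a shared reference (interior mutability).
struct Cache {
    hits: Cell<u32>,
    log: RefCell<Vec<String>>,
}

impl Cache {
    fn lookup(&self, key: &str) {
        self.hits.set(self.hits.get() + 1);
        self.log.borrow_mut().push(key.to_string()); // borrow is checked at run time
    }
}

fn main() {
    let p = Point::new(1, 2).shifted(10);
    let d = Meters(3.5);
    let cache = Cache { hits: Cell::new(0), log: RefCell::new(Vec::new()) };
    cache.lookup("a");
    cache.lookup("b");
    println!("{:?} {} {} {:?}", p, d.0, cache.hits.get(), cache.log.borrow());
}
```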
Chapter 10: Enums and Patterns
rust enums can contain data, and data of varying types
rust enums offer type safety (something that is really hard if you’re doing some types of polymorphic stuff in java, for example)
drawback is that you only access them using pattern matching
can do numbered enums, or enums that contain fields like structs
enums can have methods like structs, just use impl
enums can also have struct variants, which have named fields
enums can be primitive, tuple-like, or struct-like, or all three
classic case for enums is doing polymorphism, or tree-like structures
enums can also be generic, type params used on enum cases, etc
Rust won’t let you access data stored in enums unless you check with
match
something like
match x {
    Ok(v) => println!("{}", v),
    Unknown(err) => println!("{}", err),
    _ => panic!("uh oh"),
}
patterns consume values, expressions produce values
patterns are to the left of =>, expressions are on the right of =>
match must be exhaustive. you need to do something for any given match
_ is a wildcard pattern and matches everything
you can create variables in patterns, or use literals, but you can’t use existing variables
you can use tuples in patterns, you can use structs in patterns
ref patterns borrow parts of a matched value
& patterns match references
“[patterns and expressions are opposites. eg: (x, y) as a pattern consumes tuple - pulling values, (x, y) as expression creates a tuple]”
pattern guards are boolean evaluations added to patterns. but only when you’re not moving values
@ pattern matches the pattern, but it moves or copies the entire value into the produced variable
irrefutable patterns are patterns that always match
“[patterns are a tool designed to get data into the right shape]”
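A sketch of the three enum variant styles plus a few pattern forms from the notes above (tuple patterns, a guard, an @ binding, the wildcard). Shape and classify are made up for illustration.

```rust
#[derive(Debug)]
enum Shape {
    Empty,                            // unit-like variant
    Circle(f64),                      // tuple-like variant
    Rect { width: f64, height: f64 }, // struct-like variant
}

impl Shape {
    /// The data inside each variant is only reachable through a pattern.
    fn area(&self) -> f64 {
        match self {
            Shape::Empty => 0.0,
            Shape::Circle(r) => std::f64::consts::PI * r * r,
            Shape::Rect { width, height } => width * height,
        }
    }
}

/// Describes a point using tuple patterns, a guard, and an @ binding.
fn classify(point: (i32, i32)) -> String {
    match point {
        (0, 0) => "origin".to_string(),
        (x, 0) | (0, x) => format!("on an axis at {}", x),
        (x, y) if x == y => format!("on the diagonal at {}", x),
        (x @ 1..=9, _) => format!("small x = {}", x),
        _ => "somewhere else".to_string(),
    }
}

fn main() {
    let shapes = vec![
        Shape::Empty,
        Shape::Circle(1.0),
        Shape::Rect { width: 2.0, height: 3.0 },
    ];
    for s in &shapes {
        println!("{:?} has area {:.2}", s, s.area());
    }
    println!("{}", classify((4, 4)));
}
```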
Chapter 11: Traits and Generics
traits are Rust’s way of doing interfaces or abstract base classes
declare traits like an interface:
trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize>;
}
traits and generics are related: generic code uses traits as bounds on its type params
bound is a way of declaring the trait requirements of type params in generics
“[trait represents a capability: something a type can do]”
(similar to Go’s “if it can do that, you can use it here”?)
in order for trait to be used, the trait itself must be in scope
“[two ways to do polymorphic code: traits and generics]”
trait object: reference to a trait type
combine trait bounds using + sign like fn thing<T: Debug + Hash + Eq>() {...}
can also use where clause so the type param doesn’t get too unreadable
also define lifetimes in generic type defs using 'a
(side note that lifetimes have no impact on machine code - just tell rust how to check when compiling)
individual functions can be generic, even when the type they’re defined on is not
“[use trait objects when you need a collection of values of mixed types all together]”
generics have the advantage of speed: don’t need dynamic dispatch
another advantage of generics is not every trait can work on trait objects
defining traits is basically defining an interface: just types
extension traits: adding methods to existing types, similar to java’s extending other classes, but without rename
can use Self in types as shorthand
can define traits that extend other traits, like this:
trait Thing: Other {
...
}
traits can have static methods and constructors
fully qualified method calls can be called right off of the type: str::to_string("Hello")
use these when: 1) two methods have the same name, 2) when the type of the self-arg can’t be inferred, 3) calling traits in macros
associated types: sort of like scoped types, or related types. used in iterators. similar to java’s “T extends E” in an abstract class
useful when placing bounds in a where clause
“[associated types perfect for cases where implementation has one specific related type]”
buddy traits: traits designed to work together
overall: generics and traits stop you from breaking other code. as long as types are the same, implementation can change
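A sketch contrasting the two kinds of polymorphism above (generics with bounds vs. trait objects), plus a default method and an extension trait. All names here (Shape, Square, Circle, largest, Shout) are invented for the example.

```rust
trait Shape {
    fn area(&self) -> f64;

    /// Default method: implementors get this for free.
    fn describe(&self) -> String {
        format!("area {:.1}", self.area())
    }
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 { self.0 * self.0 }
}

impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.0 * self.0 }
}

/// Generic with bounds: compiled separately per type, no dynamic dispatch.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}

/// Trait objects: one collection holding values of mixed types.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// Extension trait: adds a method to an existing type.
trait Shout {
    fn shout(&self) -> String;
}

impl Shout for str {
    fn shout(&self) -> String { self.to_uppercase() + "!" }
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Circle(1.0))];
    for s in &shapes {
        println!("{}", s.describe());
    }
    println!("{:.2} {:?} {}", total_area(&shapes), largest(&[3, 9, 4]), "hi".shout());
}
```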
Chapter 12: Operator Overloading
(side note: this isn’t in the book, but operator overloading is hard, a little confusing, and probably something you should read a whole book about before you ever consider doing it. get some domain specific knowledge, and a solid use case, and maybe then you should do it.)
useful for comparing complex structs using equality and ordered comparisons: PartialEq or Eq, and PartialOrd or Ord (which return an Ordering)
can specify how index operations like a[i] work using Index and IndexMut
be careful: overloading can be difficult to debug
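A small sketch of overloading +, indexing, and comparisons for a made-up Vec2 type. The derived PartialOrd compares fields in declaration order, which may or may not be the ordering you actually want.

```rust
use std::ops::{Add, Index};

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Vec2 {
    x: f64,
    y: f64,
}

impl Add for Vec2 {
    type Output = Vec2;
    fn add(self, rhs: Vec2) -> Vec2 {
        Vec2 { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

impl Index<usize> for Vec2 {
    type Output = f64;
    fn index(&self, i: usize) -> &f64 {
        match i {
            0 => &self.x,
            1 => &self.y,
            _ => panic!("index {} out of range", i),
        }
    }
}

fn main() {
    let sum = Vec2 { x: 1.0, y: 2.0 } + Vec2 { x: 3.0, y: 4.0 };
    println!("{:?} {} {}", sum, sum[0], sum > Vec2 { x: 0.0, y: 0.0 });
}
```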
Chapter 13: Utility Traits
big ones are Drop, Sized, Clone, Copy, Deref, DerefMut, AsRef, AsMut, Borrow, BorrowMut, From, Into, ToOwned
traits can let you break/bend rust’s rules. they can also let you use them properly, depending on how you implement them or derive them
eg: customize how rust drops values of your type by implementing Drop
Copy: can only do if you’re doing shallow copy. no OS handles or anything exotic.
(other individual ones reviewed are good examples, but not note-worthy here. look up the docs.)
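A sketch of two of the utility traits above: From (which gives you Into for free) and Drop. Celsius, Fahrenheit, and Noisy are invented types.

```rust
use std::cell::RefCell;

struct Celsius(f64);
struct Fahrenheit(f64);

impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

/// Records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Creates two values in a block and returns the order they were dropped in.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _first = Noisy { name: "first", log: &log };
        let _second = Noisy { name: "second", log: &log };
    } // locals are dropped in reverse order of declaration
    log.into_inner()
}

fn main() {
    let f: Fahrenheit = Celsius(100.0).into(); // Into comes with the From impl
    println!("{} {:?}", f.0, drop_order());
}
```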
Chapter 14: Closure
closure: anonymous function expression
capture: use data that belongs to the enclosing function from inside the closure, without passing it in as an argument. eg: |x| println!("{}", x)
borrow, or steal: borrow automatically by reference, steal by moving with move keyword
eg: move |x| println!("{}", x)
can do same things with closures that you do with other expressions, but they don’t have the same types as functions
“[all closures have their own type because they may contain data… so code that works with closures usually needs to be generic]”
Fn: plain closure, call multiple times without restriction
FnOnce: can be called only once, because calling it may consume the values it captured
FnMut: contains mutable data or mut references. not safe to call from several threads at once.
callback: function provided by the user. in Rust these are usually done as closures, usually with lifetimes
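A sketch of the three closure traits and move. The helper names (apply_twice, call_n, consume, make_adder) are made up for illustration.

```rust
/// Accepts any callable that can be invoked repeatedly without mutation.
fn apply_twice<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(f(x))
}

/// Accepts a closure that mutates its captured state.
fn call_n<F: FnMut()>(mut f: F, n: usize) {
    for _ in 0..n {
        f();
    }
}

/// Accepts a closure that may consume what it captured, so it runs once.
fn consume<F: FnOnce() -> String>(f: F) -> String {
    f()
}

/// Returns a closure that owns `step` thanks to `move`.
fn make_adder(step: i32) -> impl Fn(i32) -> i32 {
    move |x| x + step
}

fn main() {
    let add3 = make_adder(3);
    println!("{}", apply_twice(&add3, 1));

    let mut count = 0;
    call_n(|| count += 1, 5); // captures count by mutable reference
    println!("{}", count);

    let name = String::from("moved");
    println!("{}", consume(move || name)); // name is moved into the closure
}
```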
Chapter 15: Iterators
iterator: value that produces a sequence of values
Rust ones are “flexible, expressive, efficient”
any value that implements the std::iter::Iterator trait is an iterator; any type that implements IntoIterator is an iterable
code that receives the items is a consumer
most collections types provide methods to produce iterators over: 1) shared references, 2) mutable references, 3) values
most have drain: takes mutable reference, returns iterator that passes ownership of each element to consumer
adapters: consume one iterator, producing another (eg: map and filter)
adapters are zero-overhead abstraction - cost nothing
(too many examples to list here)
many collections implement the Extend trait, allowing one iterable to be combined with another
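A sketch of adapters, a hand-written iterator, drain, and extend. even_square_sum and Countdown are illustrative names, not from the book.

```rust
/// Sum of squares of the even numbers, built from lazy adapters.
fn even_square_sum(values: &[i32]) -> i32 {
    values.iter().filter(|v| **v % 2 == 0).map(|v| v * v).sum()
}

/// A hand-written iterator: counts down from n to 1.
struct Countdown(u32);

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            return None;
        }
        let current = self.0;
        self.0 -= 1;
        Some(current)
    }
}

fn main() {
    println!("{}", even_square_sum(&[1, 2, 3, 4]));

    let mut all: Vec<u32> = Countdown(3).collect();
    let mut extra = vec![10, 20];
    all.extend(extra.drain(..)); // drain hands each element over by value
    println!("{:?} {:?}", all, extra);
}
```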
Chapter 16: Collections
collections: generic types for storing data in memory
mostly use moves to avoid deep-copying values
don’t have invalidation errors: can’t change collection while operating on it because of Rust’s borrowing mechanisms
access by reference, or access by copy
Rust lets you borrow mutable references to two or more parts of an array, slice, or vector (eg: with split_at_mut). safe because Rust divides them into non-overlapping regions.
two slices are equal if they’re the same length, and their corresponding elements are equal
notable: Vec<T>, VecDeque<T>, LinkedList<T>, BinaryHeap<T>, HashMap<K, V>, BTreeMap<K, V>, HashSet<T>, BTreeSet<T>
Hash and Hasher work as buddy traits; “pluggable hashing”
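A sketch of two collection notes above: counting with a HashMap, and taking two non-overlapping mutable borrows of one slice via split_at_mut. word_counts and swap_half_heads are made-up helpers.

```rust
use std::collections::{BTreeMap, HashMap};

/// Counts word occurrences; the entry API avoids a second lookup.
fn word_counts(text: &str) -> HashMap<&str, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

/// Swaps the first elements of the two halves through disjoint mutable borrows.
fn swap_half_heads(values: &mut [i32]) {
    let mid = values.len() / 2;
    let (left, right) = values.split_at_mut(mid);
    if let (Some(a), Some(b)) = (left.first_mut(), right.first_mut()) {
        std::mem::swap(a, b);
    }
}

fn main() {
    let counts = word_counts("a b a c a");
    let sorted: BTreeMap<_, _> = counts.iter().collect(); // BTreeMap iterates in key order
    println!("{:?}", sorted);

    let mut v = vec![1, 2, 3, 4];
    swap_half_heads(&mut v);
    println!("{:?}", v);
}
```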
Chapter 17: Strings and Text
String and str are well-formed UTF-8 sequences
ASCII is a subset of UTF-8
char is 32-bit value holding a Unicode code point
String is just a wrapper around Vec<u8>
String supports operator overloading with the Add and AddAssign traits
can use patterns to search, manipulate text
converting other types to human-readable formats through Display trait, then format! macro can format without instruction
(side note: use Cow when you may or may not need to modify text that is borrowed)
string templates must be constant (or be created through macros) in order to be type checked at compile time
(see text formatting online for more details about all formats, arguments, etc.)
debug: {:?}, pretty-print: {:#?}
rust’s regex crate avoids backtracking blow-ups (search time is bounded by pattern size times text size), so it’s safe for untrusted expressions/text
regular expression compilation is expensive; do once, keep out of loops
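A sketch of Display vs. Debug formatting, String concatenation with +, and Cow for text that may or may not need changing. Point and expand_tabs are invented for the example.

```rust
use std::borrow::Cow;
use std::fmt;

#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

/// Replaces tabs with four spaces, allocating only when a tab is present.
fn expand_tabs(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        Cow::Owned(input.replace('\t', "    "))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{}", p); // Display
    println!("{:?}", p); // Debug
    println!("{:#?}", p); // pretty-printed Debug

    let joined = String::from("héllo") + " wörld"; // Add on String
    println!("{} has {} bytes, {} chars", joined, joined.len(), joined.chars().count());
    println!("{}", expand_tabs("a\tb"));
}
```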
Chapter 18: Input and Output
I/O organized around Read, BufRead, Write
Reader -> values that you can read bytes from
Writer -> values you can write bytes to.
UTF-8 is the de facto standard in most Rust code
BufRead basically same as Read but buffered: uses chunks, more configurable
readers and writers are closed automatically when dropped - hold them if you want them open
File API uses builder-like pattern to open, create reader, and read
other reader + writer types: io::stdin, io::stdout, io::stderr
(a lot of Path, OS, File, Dir stuff. none of it particularly unique to rust. lookup docs.)
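A sketch of code written against the BufRead and Write traits rather than a concrete file type. grep here is a made-up helper; an in-memory Cursor stands in for a real file so the example runs anywhere.

```rust
use std::io::{self, BufRead, Cursor, Write};

/// Copies lines containing `needle` from any buffered reader to any writer.
/// Returns how many lines matched.
fn grep<R: BufRead, W: Write>(reader: R, mut out: W, needle: &str) -> io::Result<usize> {
    let mut matches = 0;
    for line in reader.lines() {
        let line = line?; // each line is an io::Result<String>
        if line.contains(needle) {
            writeln!(out, "{}", line)?;
            matches += 1;
        }
    }
    Ok(matches)
}

fn main() -> io::Result<()> {
    let input = Cursor::new("alpha\nbeta\ngamma\n"); // in-memory reader
    let n = grep(input, io::stdout(), "a")?;
    eprintln!("{} matching lines", n);
    Ok(())
}
```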
Chapter 19: Concurrency
“[Rust offers a better concurrency model, by not forcing all programs to adopt single style. Unwritten rules are written down, enforced by compiler.]”
fork-join parallelism: simple, avoids bottlenecks, straightforward, easy to reason about
std::thread::spawn: new OS thread, just like other languages
move key, or workload via move in spawn: spawn(move || do_process(unit))
moves are cheap; ok to do.
join back together: join() returns a std::thread::Result that has err if child panicked
since panics are per-thread; this is ok.
rust checks closure of spawn: “[has no way of knowing how long child will run, assumes forever. requires 'static lifetime.]”
can have child thread access across thread with Arc<T> (“Atomic Ref. Counter”)
other libraries use scoped threads, worker pools, work-stealing to push out more efficiency
channel: one-way conduit for sending values from one thread to another. sorta like Go.
channels are thread-safe queues
use .send() and .recv(): fail if the other end of channel has been dropped
use move to pass channel to threads. channels are thread safe because they’re for threads
usually loop on the receiver; exit when sender drops
std::sync::mpsc for multi-producer, single-consumer
channel() gives you a Sender and Receiver
std::sync::mpsc::sync_channel -> lets you do back pressure so the producers don’t overwhelm consumers
types that implement Send are safe to pass by value to another thread
types that implement Sync are safe to pass by non-mut reference to another thread
struct or enum is Send if its fields are Send, same for Sync
mutex -> lock forces threads to take turns; one thread has access at a time
“support programming w/ invariants: rules that protect data by construction”
commonly Arc for sharing things across threads, Mutex for mutable data shared across threads
mut and Mutex: “mut means exclusive access, non-mut is shared access”
“[Mutex provides exclusive (mut) access, even though some threads might have shared (non-mut) access.]”
Rust can’t protect you from being dumb; can’t stop deadlock
if a thread panics while holding mutex, it’s marked as “poisoned”, attempts to lock will get error
RwLock like mutex but two kinds of lock; one for reading, one for writing: “one writer, or many readers, not both”
std::sync::Condvar condition variables for waiting, notifying, when threads need something
std::sync::atomic atomic types for lock-free concurrent programming
when it comes to global state: don’t do it: “[tends to make parts of a program more tightly coupled]”
can use static keyword
can use lazy_static macro, usually with mutex
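A sketch of fork-join with spawn and join, plus a channel and an Arc<Mutex<..>> counter. parallel_sum and count_messages are made-up helpers; real code would likely reach for a thread pool instead of one thread per chunk.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Fork-join: split the work, spawn a thread per chunk, join the results.
fn parallel_sum(values: Vec<u64>, workers: usize) -> u64 {
    let workers = workers.max(1);
    let chunk = ((values.len() + workers - 1) / workers).max(1);
    let data = Arc::new(values); // shared, read-only input
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let data = Arc::clone(&data);
            thread::spawn(move || {
                let start = (w * chunk).min(data.len());
                let end = (start + chunk).min(data.len());
                data[start..end].iter().sum::<u64>()
            })
        })
        .collect();
    // join() returns Err if that child panicked
    handles.into_iter().map(|h| h.join().expect("worker panicked")).sum()
}

/// Producers send their id over a channel and bump a shared counter.
/// Returns (messages received, counter value).
fn count_messages(producers: usize) -> (usize, usize) {
    let (tx, rx) = mpsc::channel();
    let total = Arc::new(Mutex::new(0usize));
    for id in 0..producers {
        let tx = tx.clone();
        let total = Arc::clone(&total);
        thread::spawn(move || {
            *total.lock().unwrap() += 1;
            tx.send(id).expect("receiver dropped");
        });
    }
    drop(tx); // drop the original sender so the receive loop can end
    let received = rx.iter().count(); // ends once every sender is gone
    let counted = *total.lock().unwrap();
    (received, counted)
}

fn main() {
    println!("{}", parallel_sum((1..=100).collect(), 4));
    println!("{:?}", count_messages(3));
}
```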
Chapter 20: Macros
let you extend the language via pseudo-code-gen
each macro call is “expanded” - replace with Rust code
matching patterns to templates
usually use macro_rules! but you could use other ways, or other macros
in Rust compilation process, it expands macros before looking at rest of program
macro patterns are basically regular expressions; use tokens
hard to debug because of expansion, but use rustc to look at code after expansion, use log_syntax! to print macro args
rust has hygienic macros; renames variables during expansion so you don’t need to worry about collisions, naming.
any identifiers you need inside a macro should be passed in as parameters
macros always visible to child modules
macros visible to parents through use of “#[macro_export]” to export, or “#[macro_use]” on module import
exported macros shouldn’t rely on anything in scope; macros should use absolute path names to anything they use
syntax errors in macros are fatal, only happen when trying to match fragments. avoid by putting more specific rules first
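A sketch of macro_rules! pattern matching, repetition, recursion, and hygiene. strings! and max_of! are invented macros for illustration.

```rust
/// Builds a Vec<String> from any number of displayable expressions.
macro_rules! strings {
    () => { Vec::<String>::new() };
    ($($item:expr),+ $(,)?) => { vec![$($item.to_string()),+] };
}

/// Returns the largest of one or more expressions, recursing on the tail.
macro_rules! max_of {
    ($x:expr) => { $x };
    ($x:expr, $($rest:expr),+) => {{
        let a = $x; // hygienic: does not collide with a caller variable named `a`
        let b = max_of!($($rest),+);
        if a > b { a } else { b }
    }};
}

fn main() {
    let a = 10;
    println!("{:?} {}", strings![1, "two", 3.5], max_of!(a, 3, 7));
}
```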
Chapter 21: Unsafe Code
unsafe code lets you tell Rust to trust you on this one.
use unsafe as keyword, use on fn or block
can use raw pointers and methods to allow unconstrained access to memory, access mutable static variables, use foreign function interface
1) bugs that occur before unsafe block can break contracts, 2) consequences might occur outside unsafe block
good set of rules for Rust programs:
- must not read uninitialized memory
- must not create invalid primitive values
- no references outlive referents
- shared access is read only, mutable access is exclusive
- must not dereference null, or dangling pointers
- must not use pointers to access memory outside the allocation they are associated with
- free of data races
- must not unwind across a call made from another language
- comply with contracts from std library functions
raw pointer: unconstrained pointer - Rust can’t tell if you’re using them safely. deref only in unsafe block
two types of raw pointers: *mut T, *const T
. operator doesn’t auto-deref. have to be explicit with (*thing).field
comparison operators use addresses; only equal if they share same address location
“[complete, exact contract for raw pointers is not easily stated, and might change]”
null raw pointer is a zero address
foreign function interface -> lets rust call functions written in C, or C++
extern block: declares functions or variables defined in some other library that Rust is linked with
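A sketch of raw pointers and unsafe blocks. raw_swap and read_or are made-up helpers; this only touches the raw pointer notes above and leaves out the foreign function interface.

```rust
/// Swaps two values through raw pointers.
/// Safe to call because the references guarantee valid, distinct, aligned memory.
fn raw_swap(a: &mut i32, b: &mut i32) {
    let pa: *mut i32 = a;
    let pb: *mut i32 = b;
    // SAFETY: pa and pb come from live, non-overlapping &mut references.
    unsafe {
        let tmp = *pa;
        *pa = *pb;
        *pb = tmp;
    }
}

/// Reads through a raw pointer, or returns `default` when it is null.
///
/// # Safety
/// A non-null `p` must point to a valid, initialized i32.
unsafe fn read_or(p: *const i32, default: i32) -> i32 {
    if p.is_null() {
        default
    } else {
        unsafe { *p }
    }
}

fn main() {
    let (mut x, mut y) = (1, 2);
    raw_swap(&mut x, &mut y);
    println!("{} {}", x, y);

    let null: *const i32 = std::ptr::null(); // creating a null raw pointer is safe
    let q: *const i32 = &x;
    // SAFETY: null is handled by read_or, and q points at the live local x.
    println!("{} {}", unsafe { read_or(null, 7) }, unsafe { read_or(q, 0) });
    println!("{}", q == &x as *const i32); // raw pointers compare by address
}
```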