Notes from 'Programming Rust' book

Here are some of my notes from the Programming Rust O’Reilly book. I’m posting them here and in a gist on GitHub so they’re searchable to me and anyone else who needs them.

Chapter 1

(omitted for brevity)

Chapter 2: Tour of Rust

Typical for function’s return value to “fall off the end of the function”
Tests marked with #[test] attribute
Trait is a collection of methods that types can implement
Iterators common in rust
Rust doesn’t have exceptions; handle using Result or panic.
By convention, a module named prelude exports things that are intended to be imported together, covering what you commonly need.
_ tells rust that a variable will be unused, so it doesn’t complain.
r#"…"# is Rust’s “raw string” syntax. can use any number of hash marks, as long as the opening and closing counts match.
cargo run does everything: fetches crates, compiles, builds, links, and starts
if your program compiles, it’s free of data races
Result and Option are the two maybe-ish data types that rust mostly uses
/// to start a documentation comment
common to init a structs fields with variables of same name. similar to typescript.
“[fallible functions in Rust should return a Result, which is Ok(x) on success, or Err(e) on failure]”
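? operator checks a Result: unwraps Ok, or returns the Err to the caller (it doesn’t panic). only works in functions that return Result (or Option), so in main only if main returns a Result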
|thing| { … } is a Rust closure expression - value that can be called as if it were a function
use move keyword in front of closures to let closure take ownership of variables it uses
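
A minimal sketch tying a few of these together (implicit return value, Result with ?, and a move closure). The names parse_and_add and make_greeter are made up for illustration, not from the book:

```rust
use std::num::ParseIntError;

// Hypothetical helper: parses two integers and adds them.
fn parse_and_add(a: &str, b: &str) -> Result<i32, ParseIntError> {
    // `?` unwraps Ok, or returns the Err to the caller right away (no panic).
    let x: i32 = a.trim().parse()?;
    let y: i32 = b.trim().parse()?;
    // No `return` needed: the last expression is the function's value.
    Ok(x + y)
}

// Hypothetical helper: returns a closure that owns `prefix`.
// `move` transfers ownership of the captured variable into the closure.
fn make_greeter(prefix: String) -> impl Fn(&str) -> String {
    move |name| format!("{}, {}!", prefix, name)
}
```

Calling parse_and_add("2", "x") gives back an Err instead of crashing, which is the whole point of returning Result.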

Chapter 3: Basic Types

Rust pushes everything it can to ahead-of-time compilation for safety
Can use generics, and can infer a lot of things for you so you don’t have to spell everything out
can use underscores in long digits for legibility: 1_000_000
can convert integer types between each other using the as operator
method calls have higher precedence than unary prefix operators
“rust performs no numeric conversions implicitly”
tuple is number of values of assorted types - like python
tuples accessed via index: t.0 or t.8 etc
rust often uses tuples when returning multiple values from a function
zero tuple is () and called a “unit type”
three pointer types: reference, boxes, and unsafe pointers
“reference is a pointer to any value anywhere”
references are never null
&T is an immutable reference
&mut T is a mutable reference
boxes allocate a new value to the heap
when boxes go out of scope the memory is freed immediately, unless they get moved
raw pointers are unsafe, and they might be null so be careful
use raw pointers only in unsafe block where safety is up to you
rust has three types for representing sequences of values in memory: arrays, vectors, and slices
array: constant size determined at compile time; can’t append or shrink
vector: dynamically allocated, growable
slice: series of elements that are a part of some other value, like an array or vector
slices are considered shared or mutable: not both.
rust has no notation for an uninitialized array.
create vectors with the vec! macro
vectors are made of: pointer to heap allocated buffer, capacity, and length
slices are a region of array or vector, made of a fat-pointer (two word value: pointer to first element, number of elements)
strings are stored as UTF-8: Vec<u8>
&str is really just a ref of some utf-8 text owned by another thing
String is different from &str. String is basically a Vec<u8> that is guaranteed to hold valid UTF-8
when String goes out of scope buffer is freed, unless it was moved and is still owned, which brings us to….
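
A small sketch of arrays, vectors, slices, tuples, and the as operator, assuming nothing beyond the standard library. The function names are hypothetical:

```rust
// Hypothetical helper: returns (sum, max) of a slice as a tuple.
fn sum_and_max(values: &[i32]) -> (i32, i32) {
    let mut sum = 0;
    let mut max = i32::MIN;
    for &v in values {
        sum += v;
        if v > max {
            max = v;
        }
    }
    (sum, max)
}

fn sequences_demo() -> (i32, i32, usize) {
    let array: [i32; 4] = [3, 1_000, 7, 9]; // fixed size, known at compile time
    let mut vector: Vec<i32> = vec![1, 2, 3]; // heap-allocated, growable
    vector.push(4);

    let middle: &[i32] = &array[1..3]; // slice: borrows elements 1 and 2
    let (middle_sum, _) = sum_and_max(middle);
    let whole = sum_and_max(&vector); // a &Vec<i32> coerces to &[i32]

    (middle_sum, whole.1, vector.len()) // tuple fields are accessed by index
}

// `as` converts between integer types; narrowing casts truncate.
fn truncate_to_u8(n: i32) -> u8 {
    n as u8
}
```

The same sum_and_max works for the array slice and the vector because both are passed as &[i32].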

Chapter 4: Ownership

every single value has an owner that determines lifetime
no garbage collection, or just throwing values around
when value is freed it is “dropped” -> allows us to look at code to see lifetime instead of guessing or inspecting comments, etc.
“[variables own their values, structs own their fields, tuples, arrays and vectors own their elements]”
basically trees: value’s owner is parent, values owned are children
so when dropping values, Rust is removing it from tree to free it
you can “move” values from one owner to another
no multiple owners, with Rc and Arc being exceptions
you can “borrow” references though; references are non-owning pointers with limited lifetimes
in Rust; most of the time assigning, passing, and returning don’t copy the value, they move it
compiler will complain about moving it more than once unless you returned it
“[price you pay is that you have to explicitly ask for copies if you need/want them]”
if you have a mutable variable however, rust drops prior value on reassignment
in general: “[passing arguments to functions moves ownership to the functions parameters; returning a value from a function moves ownership to the caller]”
same for building tuples, etc.
things to remember about moves: 1) moves apply to value proper, NOT heap storage, 2) Rust’s compiler sees through moves, so they’re pretty efficient
in general: if you move something into a function, move it back if you want it back
moving index content gets rejected: vectors and arrays for example. what you usually want is a reference
ownership applies to scope too; for loops for example take ownership
“[if you need to move value out of owners that compilers can’t track, you should probably change the owners type to something that can dynamically track whether it has a value or not]”
while most types are moved, some values are Copy types; source of assignment stays the same,
passing Copy types to constructors behaves similarly: source stays the same
“Only types for which a simple bit-for-bit copy can be Copy”
in general: any type that needs something special to be done when the value is dropped can’t be Copy.
you can derive Copy if all the fields of a struct are Copy
Copy types are more flexible, but they’re also strict about which types they can contain: only Copy-able
important! -> if you use Copy, but need to change it later, it’ll be difficult to re-write the code to do non-copy stuff.
for shared ownership there is: Rc (reference counted) and Arc (atomic reference counted). only Arc is safe to share across threads; Rc is single-threaded.
cloning an Rc doesn’t copy the value, it just creates another pointer to it and bumps the reference count
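
A sketch of moves, explicit clones, Copy types, and Rc in one place. Point, total_len, and ownership_demo are invented names for illustration:

```rust
use std::rc::Rc;

// All fields are Copy, so the struct can derive Copy too.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Takes ownership of `words`, then hands it back along with the total length.
fn total_len(words: Vec<String>) -> (Vec<String>, usize) {
    let n: usize = words.iter().map(|w| w.len()).sum();
    (words, n)
}

fn ownership_demo() -> (usize, usize, Point) {
    let words = vec![String::from("move"), String::from("me")];
    let backup = words.clone(); // explicit deep copy
    let (words, n) = total_len(words); // moved in, then moved back out

    let p = Point { x: 1, y: 2 };
    let q = p; // Copy type: `p` is still usable afterwards

    let shared = Rc::new(words);
    let also_shared = Rc::clone(&shared); // new pointer, same heap value
    let owners = Rc::strong_count(&also_shared);

    (n + backup.len(), owners, Point { x: p.x + q.x, y: p.y + q.y })
}
```

If total_len did not return words, the caller could not use the vector again after the call.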

Chapter 5: References

references are non-owning pointers that have no effect on their referents lifetimes
a reference to a value is borrowing the value, and you must eventually return it to its owner
shared reference: read but not modify. eg: &T, think: multiple readers at compile time
mutable reference: both read and modify: eg: &mut T, think: single writer at compile time
passing a value (moving ownership) = pass by value
passing a reference = pass by reference
assigning to a reference makes it point at a new value
“[the . follows as many references as it takes to find its target]”
references are never null (no NPE!)
another kind of fat pointer is a trait object: a reference to a value that implements a certain trait
“[Rust tries to assign each reference type in your program a lifetime that meets the constraints imposed by how it is used]”
basically; a variable’s lifetime must contain that of the reference borrowed from it. so start by understanding the constraints from references, then find lifetimes that satisfy those constraints
static = rust’s global variable, created when the program starts, lasts until termination
lifetime parameters = specify lifetimes in generics and traits using fn f<'a>(p: &'a i32) { ... } which lets Rust track lifetimes explicitly. only need to do in definitions, not in usage.
A function’s signature always exposes the body’s behavior
Lifetimes in functions signatures are so Rust’s compiler can ensure safety
when a reference type appears inside another type’s definition, you need to write out the lifetime
“[for a lifetime, a shared reference makes its referent read-only; you can’t assign the referent or move its value]”
a mutable reference borrow, and a shared reference borrow cannot have overlapping lifetimes (can’t borrow a mutable reference to a read-only value)
shared access is read-only access, mutable access is exclusive access
all of this ensures that a concurrent Rust program is free of data races by construction (at compile time)
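
A sketch of shared vs mutable references and explicit lifetime parameters. The names longer, Highlight, and append_twice are hypothetical:

```rust
// 'a ties the returned reference to both inputs: the result must not
// outlive either of them.
fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() {
        a
    } else {
        b
    }
}

// A struct holding a reference has to name that reference's lifetime.
struct Highlight<'t> {
    text: &'t str,
}

impl<'t> Highlight<'t> {
    fn first_word(&self) -> &'t str {
        self.text.split_whitespace().next().unwrap_or("")
    }
}

// &mut gives exclusive access: only one writer at a time.
fn append_twice(buf: &mut Vec<i32>, v: i32) {
    buf.push(v);
    buf.push(v);
}
```

While append_twice holds the &mut borrow, the compiler rejects any other use of the vector, which is what rules out data races.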

Chapter 6: Expressions

rust is an expression language: for example, if and match can produce values
blocks can also produce values, and can be an explicit way of marking lifetimes inside another function or block
variables can be redeclared, but probably don’t do it.
item declarations: any declaration that could appear globally, like fn
match expressions are like switches but better because you can do patterns, they return things
in match a value is checked against each pattern in order; must match one pattern
four types of loops: while, while let, loop, for
for loops use iterable Range: eg: 0..9
can use break but it only works in loops
can use continue to advance to the next iteration of loop
can label loops with lifetime-style labels (eg: 'outer:) so break and continue can target an outer loop
expressions that don’t finish have special return type ! but it’s rarely used - divergent functions
“[ the . operator automatically dereferences or borrows as many references as needed ]”
static methods: Vec::new() or, if you need to spell out the type, use “turbo fish”: Vec::<Thing>::new()
fields accessed via dot name, elements in tuples accessed via index
lvalues = access array or slice by index like thing[i]
using range .. operator allows open on either end: 0..4 or ..4 or 0.. or ..
* operator is used to access value pointed to by a reference
dividing by zero triggers a panic. you can do it safely by other means
assignment not as common in rust since stuff is immutable
if a value has a non-Copy type, then assignment moves it to the destination.
Rust doesn’t have increment and decrement operators. Good.
numbers may be cast to/from any of the built-in numeric types for the most part
closures are lightweight function like values, sort of like anonymous functions
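
A sketch of match as an expression, a labeled loop, and dividing without a panic. Function names are made up for illustration:

```rust
// `match` produces a value, so it can be the whole function body.
fn classify(n: i32) -> &'static str {
    match n {
        0 => "zero",
        x if x < 0 => "negative",
        _ => "positive",
    }
}

// Finds the first pair of indices whose values add up to `target`.
fn first_pair_with_sum(values: &[i32], target: i32) -> Option<(usize, usize)> {
    let mut found = None;
    'outer: for i in 0..values.len() {
        for j in (i + 1)..values.len() {
            if values[i] + values[j] == target {
                found = Some((i, j));
                break 'outer; // leaves both loops at once
            }
        }
    }
    found
}

// checked_div returns None instead of panicking on division by zero.
fn safe_divide(a: i32, b: i32) -> Option<i32> {
    a.checked_div(b)
}
```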

Chapter 7: Error Handling

most ordinary errors are handled by Result<T, E>, which is either Ok(T) or Err(E)
panics are another type of error that should never happen or at least represent something so bad that your program should just be done
panics can be handled though. by unwinding: dropping values in the reverse order of creation, all the way up the stack, finally exiting the thread
it’s not like panics are undefined. panics are defined behavior, they just straight up shouldn’t happen. but they are safe.
panics happen per thread
if, in the process of an unwind, rust panics on a drop, the whole unwinding process is aborted; you messed up while trying to clean up your mess, and you’re done, go home.
Result<T, E> is best understood with matches, but has easier ways of defaulting, getting values and references, panicking, and so on
errors are printable: they have a description and cause (if cause is provided)
Rust has ? operator to get a value from a Result, and propagate the error up the stack if any occurs
can only use ? in main if main returns a Result
can use .unwrap() to deal with errors that “can’t happen” :)
“[.unwrap() is for a condition so severe or bizarre that you don’t know what to do.]”
use let _ = ... to silence unused var warnings from compiler
on the whole; Rust forces you to make decisions about what to do with errors, rather than just throwing them. since they’re a part of the return type with Result you have to do something with them
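
A sketch of a custom error type with ? doing the propagation. ConfigError, parse_port, and the 8080 default are assumptions for the example, not anything from the book:

```rust
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type for this example.
#[derive(Debug, PartialEq)]
enum ConfigError {
    Missing,
    BadNumber(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Missing => write!(f, "no port given"),
            ConfigError::BadNumber(e) => write!(f, "bad port: {}", e),
        }
    }
}

impl std::error::Error for ConfigError {}

// Lets `?` convert a ParseIntError into a ConfigError automatically.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e)
    }
}

fn parse_port(input: Option<&str>) -> Result<u16, ConfigError> {
    let text = input.ok_or(ConfigError::Missing)?;
    let port = text.trim().parse::<u16>()?; // Err is converted via From
    Ok(port)
}

// Defaulting is one of the "easier ways" of dealing with a Result.
fn port_or_default(input: Option<&str>) -> u16 {
    parse_port(input).unwrap_or(8080)
}
```

The caller has to decide what to do with the Err: match on it, default it, or pass it further up with ?.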

Chapter 8: Crates and Modules

“[A crate is a Rust project: all source code for a single library or executable, plus any associated tests, examples, tools, configs]”
extern crate xyz is like an import statement, but for crates that aren’t a part of this project
Cargo.toml is the project file: cargo command runs on this, basically doing anything you need to do with a crate
crates are bin or lib
cargo build builds, cargo build --release builds for release, cargo test tests. those are probably the only ones you need
modules are namespaces; containers for functions, constants, and so on. used for organization within a project. can nest them
mark things as pub or not. anything not marked pub is private
files are modules with their file name. sub-directories are modules of their dir name
use mod.rs to control sub-directory module
when you build a rust crate you are compiling all its modules
:: operator accesses items inside a module
use thing::whatever to use a module
each module starts blank, need to import everything it uses
super is alias to parent module, self is for this module
items: building blocks of rust. are 1) functions, 2) types, 3) type aliases, 4) impl blocks, 5) constants, 6) modules, 7) imports
attributes: Rust’s catch-all syntax for writing miscellaneous instructions for the compiler. sort of like decorators or annotations in other languages.
eg: conditional compilation with #[cfg]
can use #! to start attribute to tell Rust to attach it to enclosing item, like module. usually used at the beginning of a file
tests use attributes like #[test] to tell cargo which functions are tests so they’re conditionally compiled
tests usually use macros like assert_eq!(expected, actual)
“[by convention keep tests inside same file they’re testing, until they get big, then split them into tests.rs and mark the whole file with #[cfg(test)] so they don’t get compiled]”
integration tests usually live in a tests directory alongside the src directory of your project: for testing surface area of your code, as a user would
cargo doc creates html documentation from your code from the pub features of your code with any doc comments
can use doc comments like /// or you can put them as annotations with #[doc = "comment here"]
can use markdown for comments, and they’ll get translated to html
can also put in fenced code blocks
“doc-tests” are checked by rust when generating docs, to ensure that they compile, and run
goal is to get you to write the best possible documentation
can mark code snippets as no_run if they should compile but not be run, or ignore if they’re not complete
dependencies in Cargo.toml can use the crates.io name, a path to a local crate, or a git link to something like GitHub
uses semantic versioning
Cargo.lock keeps versions the same across installation so deps change only when you want them to
cargo update updates versions for crates
can also do workspaces, which is just a collection of crates, each with their own .toml
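
A single-file sketch of nested modules, pub vs private, super, doc comments, and a #[cfg(test)] test module. The module and function names are invented for the example:

```rust
pub mod shapes {
    /// Area of a rectangle. `///` comments end up in `cargo doc` output.
    pub fn area(width: u32, height: u32) -> u32 {
        clamp(width) * clamp(height)
    }

    // No `pub`: private to `shapes` (child modules can still reach it).
    fn clamp(v: u32) -> u32 {
        v.min(1_000)
    }

    pub mod units {
        /// Area in square centimetres for sides given in metres.
        pub fn area_in_cm2(width_m: u32, height_m: u32) -> u32 {
            // `super` refers to the parent module, `shapes`.
            super::area(width_m, height_m) * 10_000
        }
    }
}

// Only compiled when running `cargo test`.
#[cfg(test)]
mod shapes_tests {
    use super::shapes;

    #[test]
    fn area_of_rectangle() {
        assert_eq!(shapes::area(2, 3), 6);
    }
}
```

In a real crate each mod would more likely live in its own file, with the same paths.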

Chapter 9: Structs

three: named-field, tuple-like, unit like
named-field structs: camel case names, snake case fields. can do shorthand {x, y} to populate fields on construction
fields on structs are private by default
creating a struct value requires all fields to be initialized
can use something like a “spread” syntax: { x: 0, ..other }
tuple-like structs: camel case name, fields are accessed by index
tuple elements private by default, but can be marked public
“[good for new-types; structs with a single thing that you use to get stricter type checking]”
eg: struct Ascii(Vec<u8>)
unit-like structs: struct type with no elements. basically just type MyThing
methods on a struct appear in a different block: impl
impl block is just a collection of fn definitions that become methods on the struct type
if they need it, methods are passed self as the first argument, or &self, or &mut self if needed. use only what you need.
calling a method borrows the struct implicitly: thing.method() takes &thing or &mut thing as the method’s self requires
conventional to have a constructor called new
if you want you can attach your own methods to other types
methods are separated to: 1) make it easy to find data members, 2) allow single syntax for all struct types, 3) allow implementation of traits
rust has generics :)
similar to other generic languages: use <T> for type parameters
can use Self instead of always spelling out the type
for static method calls, use turbo-fish using ::<>
if a struct type contains references you must name the references’ lifetimes

eg:

struct Position<'p> {
 x: &'p i32,
 y: &'p i32,
}

which means “for lifetime ‘p, you can have a Position that holds references to that lifetime”
this lets rust do lifetime constraint checking
traits are implemented on an impl block, or defined by themselves
think of it as an easier way to do Java abstract classes
use #[derive(Clone)] to derive traits without having to write them
for most traits, you can depend on trait inheritance -> as long as all members of a struct impl trait X, you can derive trait X
interior mutability is when a struct is immutable, but it has fields that need to be mutable
interior mutability is usually done with Cell<T> and RefCell<T> (Box<T> alone doesn’t provide it)
Cell<T> struct that contains a single value of T. can set and get T even if you don’t have mut access to the Cell itself
Cell<T> doesn’t let you call mut methods on the shared value
RefCell<T> is like Cell but you can borrow T
RefCell does runtime checks, but not compile time checks, so if you break rules it panics
neither is thread safe
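
A sketch of a named-field struct with an impl block, plus interior mutability through Cell and RefCell. Point and Counter are hypothetical types for illustration:

```rust
use std::cell::{Cell, RefCell};

#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    // Conventional constructor; `Point { x, y }` is the field shorthand.
    fn new(x: i32, y: i32) -> Self {
        Point { x, y }
    }

    // Struct update syntax: take `x` from the argument, the rest from a clone.
    fn with_x(&self, x: i32) -> Self {
        Point { x, ..self.clone() }
    }

    // Needs `&mut self` because it modifies the value.
    fn shift(&mut self, dx: i32) {
        self.x += dx;
    }
}

struct Counter {
    hits: Cell<u32>,
    log: RefCell<Vec<String>>,
}

impl Counter {
    fn new() -> Self {
        Counter { hits: Cell::new(0), log: RefCell::new(Vec::new()) }
    }

    // Note `&self`, not `&mut self`: the mutation goes through Cell/RefCell.
    fn record(&self, what: &str) {
        self.hits.set(self.hits.get() + 1);
        // borrow_mut() is checked at runtime and panics on a conflicting borrow.
        self.log.borrow_mut().push(what.to_string());
    }
}
```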

Chapter 10: Enums and Patterns

rust enums can contain data, and data of varying types
rust enums offer type safety (something that is really hard if you’re doing some types of polymorphic stuff in java, for example)
drawback is that you only access them using pattern matching
can do numbered enums, or enums that contain fields like structs
enums can have methods like structs, just use impl
enums can also have struct variants, which have named fields
enums can be primitive, tuple-like, or struct-like, or all three
classic case for enums is doing polymorphism, or tree-like structures
enums can also be generic, type params used on enum cases, etc
Rust won’t let you access data stored in enums unless you check with match

something like

match x {
 Ok(v) => println!("{}", v),
 Unknown(err) => println!("{}", err),
 _ => panic!("uh oh"),
}

patterns consume values, expressions produce values
patterns are to the left of =>, expressions are on the right of =>
match must be exhaustive. you need to do something for any given match
_ is a wildcard pattern and matches everything
you can create variables in patterns, or use literals, but you can’t match against existing variables (use a guard for that)
you can use tuples in patterns, you can use structs in patterns
ref patterns borrow parts of a matched value
& patterns match references
“[patterns and expressions are opposites. eg: (x, y) as a pattern consumes tuple - pulling values, (x, y) as expression creates a tuple]”
pattern guards are boolean evaluations added to patterns. but only when you’re not moving values

@ pattern matches the pattern, but it moves or copies the entire value into the produced variable

irrefutable patterns are patterns that always match
“[patterns are a tool designed to get data into the right shape]”
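
A sketch of an enum with all three variant styles, matched exhaustively, plus a guard and an @ pattern. Shape and describe are invented for the example:

```rust
#[derive(Debug, Clone, PartialEq)]
enum Shape {
    Circle { radius: f64 }, // struct-like variant
    Rect(f64, f64),         // tuple-like variant
    Empty,                  // plain variant
}

impl Shape {
    fn area(&self) -> f64 {
        // Exhaustive: leaving out a variant is a compile error.
        match self {
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
            Shape::Rect(w, h) => w * h,
            Shape::Empty => 0.0,
        }
    }
}

fn describe(n: i32) -> String {
    match n {
        0 => "zero".to_string(),                         // literal pattern
        x if x < 0 => format!("negative {}", x),         // guard
        small @ 1..=9 => format!("digit {}", small),     // @ binds the matched value
        _ => "large".to_string(),                        // wildcard
    }
}
```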

Chapter 11: Traits and Generics

traits are Rust’s way of doing interfaces or abstract base classes
declare traits like an interface:

trait Write {
 fn write(&mut self, buf: &[u8]) -> Result<usize>;
}

traits and generics are closely related: generics use traits as bounds
bound is a way of declaring the trait requirements of type params in generics
“[trait represents a capability: something a type can do]”
(similar to Go’s “if it can do that, you can use it here”?)
in order for trait to be used, the trait itself must be in scope
“[two ways to do polymorphic code: traits and generics]”
trait object: reference to a trait type
combine trait bounds using + sign like fn thing<T: Debug + Hash + Eq>() { ... }
can also use where clause so the type param doesn’t get too unreadable
also define lifetimes in generic type defs using 'a
(side note that lifetimes have no impact on machine code - just tell rust how to check when compiling)
individual functions can be generic, even when the type they’re defined on is not
“[use trait objects when you need a collection of values of mixed types all together]”
generics have the advantage of speed: don’t need dynamic dispatch
another advantage of generics is not every trait can work on trait objects
defining traits is basically defining an interface: just types
extension traits: adding methods to existing types, similar to java’s extending other classes, but without rename
can use Self in types as shorthand
can define traits that extend other traits, like this:

trait Thing: Other {
 ...
}

traits can have static methods and constructors
fully qualified method calls can be called right off of the type: str::to_string("Hello")
use these when: 1) two methods have the same name, 2) when the type of the self-arg can’t be inferred, 3) calling traits in macros
associated types: sort of like scoped types, or related types. used by iterators (eg: Iterator::Item). similar to java’s “T extends E” in an abstract class
useful when placing bounds in a where clause
“[associated types perfect for cases where implementation has one specific related type]”
buddy traits: traits designed to work together
overall: generics and traits stop you from breaking other code. as long as types are the same, implementation can change
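
A sketch contrasting a generic function (static dispatch) with trait objects (dynamic dispatch, mixed types), plus a where clause. The Shape trait and its implementors are hypothetical:

```rust
trait Shape {
    fn area(&self) -> f64;

    // Default method: implementors get it for free.
    fn describe(&self) -> String {
        format!("shape with area {}", self.area())
    }
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Generic: one concrete T per call, no dynamic dispatch.
fn total_area<T: Shape>(items: &[T]) -> f64 {
    items.iter().map(|s| s.area()).sum()
}

// Trait objects: a collection of mixed types behind `dyn Shape`.
fn total_area_dyn(items: &[Box<dyn Shape>]) -> f64 {
    items.iter().map(|s| s.area()).sum()
}

// Same idea with the bound moved into a `where` clause.
fn largest<T>(items: &[T]) -> Option<&T>
where
    T: PartialOrd,
{
    let mut best = items.first()?;
    for item in items {
        if item > best {
            best = item;
        }
    }
    Some(best)
}
```

total_area cannot take a slice mixing Square and Rect, while total_area_dyn can; that is the trade-off the notes describe.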

Chapter 12: Operator Overloading

(side note: this isn’t in the book, but operator overloading is hard, a little confusing, and probably something you should read a whole book about before you ever consider doing it. get some domain specific knowledge, and a solid use case, and maybe then you should do it.)
useful for comparing complex structs using ordered comparisons
PartialEq and Eq for equality; PartialOrd and Ord for ordering (they return an Ordering)
can specify how index operations like a[i] work using Index and IndexMut
be careful: overloading can be difficult to debug
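
If you do have a solid use case, a sketch of what it looks like: Add for +, Index for v[i], and derived comparisons. Vec2 is a made-up type:

```rust
use std::ops::{Add, Index};

// Derived PartialOrd compares field by field, in declaration order.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Vec2 {
    x: i32,
    y: i32,
}

// Enables `a + b` for Vec2 values.
impl Add for Vec2 {
    type Output = Vec2;

    fn add(self, other: Vec2) -> Vec2 {
        Vec2 { x: self.x + other.x, y: self.y + other.y }
    }
}

// Enables read access with `v[0]` and `v[1]`.
impl Index<usize> for Vec2 {
    type Output = i32;

    fn index(&self, i: usize) -> &i32 {
        match i {
            0 => &self.x,
            1 => &self.y,
            _ => panic!("Vec2 index out of range: {}", i),
        }
    }
}
```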

Chapter 13: Utility Traits

big ones are Drop, Sized, Clone, Copy, Deref, DerefMut, AsRef, AsMut, Borrow, BorrowMut, From , Into, ToOwned
traits can let you break/bend rust’s rules. they can also let you use them properly, depending on how you implement them or derive them
eg: customize how rust drops values of your type by implementing Drop
Copy: can only do if you’re doing shallow copy. no OS handles or anything exotic .
(other individual ones reviewed are good example, but not note-worthy here. look up the docs.)
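
A sketch of two of these, Drop and From/Into. Guard, Celsius, and Fahrenheit are invented types; the shared log exists only so the drop order can be observed:

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

// Runs automatically when a Guard goes out of scope.
impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn drop_order() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Guard { name: "a", log: Rc::clone(&log) };
        let _b = Guard { name: "b", log: Rc::clone(&log) };
    } // locals are dropped in reverse declaration order: b, then a
    let result = log.borrow().clone();
    result
}

struct Celsius(f64);
struct Fahrenheit(f64);

// Implementing From gives the matching Into for free.
impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}
```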

Chapter 14: Closures

closure: anonymous function expression
capture: use data that belongs to the enclosing function without passing it in. eg: |x| println!("{} {}", label, x) captures label
borrow, or steal: borrow automatically by reference, steal by moving with move keyword
eg: move |x| println!("{} {}", label, x)
can do same things with closures that you do with other expressions, but they don’t have the same types as functions
“[all closures have their own type because they may contain data… so code that works with closures usually needs to be generic]”
Fn plain closure: call multiple times without restriction
FnOnce can be called at most once (calling it may consume captured values)
FnMut contains mutable data or mut references. not safe across threads.
callback: function provided by the user. in Rust these are usually done as closures, usually with lifetimes
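
A sketch of the three closure traits as generic bounds. All the function names here are hypothetical:

```rust
// Fn: can be called any number of times, only reads its captures.
fn apply_twice<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(f(x))
}

// FnMut: may modify what it captured, so the callee needs it as `mut`.
fn call_n_times<F: FnMut()>(mut f: F, n: usize) {
    for _ in 0..n {
        f();
    }
}

// FnOnce: calling it consumes the closure.
fn consume<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn closure_demo() -> (i32, u32, String) {
    let offset = 10;
    let add_offset = |x: i32| x + offset; // borrows `offset`

    let mut count = 0u32;
    call_n_times(|| count += 1, 3); // mutably borrows `count`

    let owned = String::from("moved");
    let give = move || owned; // takes ownership, then gives the String away

    (apply_twice(add_offset, 1), count, consume(give))
}
```

Each closure has its own anonymous type, which is why the helpers are generic over F.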

Chapter 15: Iterators

iterator: value that produces a sequence of values
Rust ones are “flexible, expressive, efficient”
any value that implements the std::iter::Iterator trait, and/or IntoIterator
code that receives the items is a consumer
most collections types provide methods to produce: 1) shared reference, 2) mutable reference, 3) value
most have drain: takes mutable reference, returns iterator that passes ownership of each element to consumer
adapters: consume one iterator, producing another (eg: map and filter)
adapters are zero-overhead abstraction - cost nothing
(too many examples to list here)
many collections implement the Extend trait, allowing the items of one iterable to be added to another
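
A sketch of a hand-written iterator, a couple of adapters, and drain feeding extend. Countdown and the helper names are made up:

```rust
// A custom iterator: counts down from the starting value to 1.
struct Countdown(u32);

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            let current = self.0;
            self.0 -= 1;
            Some(current)
        }
    }
}

// Adapters (filter, map) build a new iterator; collect is the consumer.
fn even_squares(limit: u32) -> Vec<u32> {
    (1..=limit).filter(|n| n % 2 == 0).map(|n| n * n).collect()
}

// drain hands ownership of each element to the consumer, here extend.
fn drain_into(source: &mut Vec<u32>, target: &mut Vec<u32>) {
    target.extend(source.drain(..));
}
```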

Chapter 16: Collections

collections: generic types for storing data in memory
mostly use moves to avoid deep-copying values
don’t have invalidation errors: can’t change collection while operating on it because of Rust’s borrowing mechanisms
access by reference, or access by copy
Rust lets you borrow mutable references to two or more parts of an array, slice, or vector (eg: split_at_mut). safe because Rust divides them into non-overlapping regions.
two slices are equal if they’re the same length, and their corresponding elements are equal
notable: Vec<T>, VecDeque<T>, LinkedList<T>, BinaryHeap<T>, HashMap<K, V>, BTreeMap<K, V>, HashSet<T> , BTreeSet<T>,
Hash, and Hasher work via buddy-trait; “pluggable hashing”
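
A sketch of a HashMap with the entry API, and two mutable borrows of one slice through split_at_mut. The helper names are invented:

```rust
use std::collections::HashMap;

// Counts how often each whitespace-separated word appears.
// Keys borrow from `text`, so nothing is deep-copied.
fn word_counts(text: &str) -> HashMap<&str, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

// Swaps the first half of a slice with the second half, pairwise.
fn swap_halves(values: &mut [i32]) {
    let mid = values.len() / 2;
    // Two non-overlapping mutable borrows of the same slice.
    let (left, right) = values.split_at_mut(mid);
    for (a, b) in left.iter_mut().zip(right.iter_mut()) {
        std::mem::swap(a, b);
    }
}
```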

Chapter 17: Strings and Text

String and str are well-formed UTF-8 sequences
ASCII subset of UTF-8
char is 32-bit value holding a Unicode code point
String is just a wrapper around Vec<u8>
String supports operator overloading with the Add and AddAssign traits
can use patterns to search, manipulate text
converting other types to human-readable formats through Display trait, then format! macro can format without instruction
(side note: use Cow when you may or may not need to modify text that is borrowed)
string templates must be constant (or be created through macros) in order to be type checked at compile time
(see text formatting online for more details about all formats, arguments, etc.)
debug: {:?}, pretty-print: {:#?}
rust’s regex uses matches and patterns, so it’s safe for untrusted expressions/text
regular expression compilation is expensive; do once, keep out of loops
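
A sketch of byte length vs char count, Display with format!, String concatenation with + and +=, and the two debug formats. Temperature and the helpers are hypothetical:

```rust
use std::fmt;

struct Temperature(f64);

// Display is what `{}` uses.
impl fmt::Display for Temperature {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:.1} C", self.0)
    }
}

// len() counts UTF-8 bytes; chars() walks Unicode code points.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn join_with_bang(a: &str, b: &str) -> String {
    let mut s = String::from(a) + b; // Add
    s += "!"; // AddAssign
    s
}

// Returns the compact and the pretty-printed debug form.
fn debug_forms(v: &[i32]) -> (String, String) {
    (format!("{:?}", v), format!("{:#?}", v))
}
```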

Chapter 18: Input and Output

I/O organized around Read, BufRead, Write
Reader -> values that you can read bytes from
Writer -> values you can write bytes to.
UTF-8 is the de facto standard in most Rust code
BufRead basically same as Read but uses chunks, more configurable
readers and writers are closed automatically when dropped - hold them if you want them open
File API uses builder-like pattern to open, create reader, and read
other reader + writer types: io::stdin, io::stdout, io::stderr
(a lot of Path, OS, File, Dir stuff. none of it particularly unique to rust. lookup docs.)
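
A sketch of writing against the BufRead and Write traits instead of concrete files, so the same function works on files, stdin, or in-memory buffers. number_lines is a made-up name:

```rust
use std::io::{self, BufRead, Write};

// Copies lines from `reader` to `writer`, prefixing each with its number.
// Returns how many lines were copied.
fn number_lines<R: BufRead, W: Write>(reader: R, mut writer: W) -> io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        let line = line?; // each line is an io::Result<String>
        count += 1;
        writeln!(writer, "{}: {}", count, line)?;
    }
    Ok(count)
}
```

A byte slice implements BufRead and a Vec<u8> implements Write, which makes this easy to try without touching the filesystem; for a real file you would wrap it in std::io::BufReader.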

Chapter 19: Concurrency

“[Rust offers a better concurrency model, by not forcing all programs to adopt single style. Unwritten rules are written down, enforced by compiler.]”
fork-join parallelism: simple, avoids bottlenecks, straightforward, easy to reason about
std::thread::spawn new OS thread, just like other languages
move key, or workload via move in spawn: spawn(move || do_process(unit))
moves are cheap; ok to do.
join back together with .join(): returns a std::thread::Result that has err if child panicked
since panics are per-thread; this is ok.
rust checks closure of spawn: “[has no way of knowing how long child will run, assumes forever. requires lifetime.]”
can have child thread access across thread with Arc<T> (“Atomic Ref. Counter”)
other libraries use scoped threads, worker pools, work-stealing to push out more efficiency
channel: one-way conduit for sending values from one thread to another. sorta like Go.
channels are thread-safe queues
use .send() and .recv(): fail if the other end of channel has been dropped
use move to pass channel to threads. channels are thread safe because they’re for threads
usually loop on the receiver; exit when sender drops
std::sync::mpsc for multi-producer, single-consumer
channel() gives you a Sender and a Receiver
std::sync::mpsc::sync_channel -> lets you do back pressure so the producers don’t overwhelm consumers
types that implement Send are safe to pass by value to another thread
types that implement Sync are safe to pass by non-mut reference to another thread
struct or enum is Send if its fields are Send, same for Sync
mutex -> lock forces threads to take turns; one thread has access at a time
“support programming w/ invariants: rules that protect data by construction”
commonly Arc for sharing things across threads, Mutex for mutable data shared across threads
mut and Mutex: “mut means exclusive access, non-mut is shared access”
“[Mutex provides exclusive (mut) access, even though some threads might have shared (non-mut) access.]”
Rust can’t protect you from being dumb; can’t stop deadlock
if a thread panics while holding mutex, it’s marked as “poisoned”, attempts to lock will get error
RwLock like mutex but two kinds of locking; one for reading, one for writing: “one writer, or many readers, not both”
std::sync::Condvar condition variables for waiting, notifying, when threads need something
std::sync::atomic atomic types for lock-free concurrent programming
when it comes to global state: don’t do it: “[tends to make parts of a program more tightly coupled]”
can use static keyword
can use lazy_static macro, usually with mutex
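
A sketch of the three basic patterns from this chapter using only the standard library: fork-join, a channel, and Arc<Mutex<T>>. The function names are hypothetical:

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Fork-join: one thread per chunk, then join and combine the results.
fn parallel_sum(chunks: Vec<Vec<u64>>) -> u64 {
    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| thread::spawn(move || chunk.iter().sum::<u64>()))
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .sum()
}

// Channel: a producer thread sends squares, the receiver loops until
// the sender is dropped.
fn channel_squares(inputs: Vec<u64>) -> Vec<u64> {
    let (tx, rx) = mpsc::channel();
    let producer = thread::spawn(move || {
        for n in inputs {
            if tx.send(n * n).is_err() {
                break; // receiver is gone
            }
        }
        // `tx` is dropped here, which ends the receiver's loop
    });
    let received: Vec<u64> = rx.iter().collect();
    producer.join().expect("producer panicked");
    received
}

// Arc shares the value across threads, Mutex makes the mutation exclusive.
fn shared_counter(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```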

Chapter 20: Macros

let you extend the language via pseudo-code-gen
each macro call is “expanded” - replace with Rust code
matching patterns to templates
usually use macro_rules! but you could use other ways, or other macros
in Rust compilation process, it expands macros before looking at rest of program
macro patterns are basically regular expressions; use tokens
hard to debug because of expansion, but use rustc to look at code after expansion, use log_syntax! to print macro args
rust has hygienic macros; auto-names variables during expansion so you don’t need to worry about collisions, naming.
any identifiers you need inside a macro should be passed in as parameters
macros always visible to child modules
macros visible to parents through use of “#[macro_export]” to export, or “#[macro_use]” on module import
exported macros shouldn’t rely on anything in scope; macros should use absolute path names to anything they use
syntax errors in macros are fatal, only happen when trying to match fragments. avoid by putting more specific rules first
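
A sketch of macro_rules! with a recursive pattern and with an item-generating pattern. max_of and square_fn are invented macros for illustration:

```rust
// Expands to the largest of one or more expressions.
macro_rules! max_of {
    // Base case: a single expression.
    ($x:expr) => { $x };
    // Recursive case: compare the first against the max of the rest.
    ($x:expr, $($rest:expr),+) => {{
        let a = $x;
        let b = max_of!($($rest),+);
        // Hygiene keeps `a` and `b` from clashing across nested expansions.
        if a > b { a } else { b }
    }};
}

// Generates a squaring function; the function name is passed in as a
// parameter because the caller needs to refer to it.
macro_rules! square_fn {
    ($name:ident, $t:ty) => {
        fn $name(v: $t) -> $t {
            v * v
        }
    };
}

square_fn!(square_i32, i32);
square_fn!(square_f64, f64);

fn largest_of_three(a: i32, b: i32, c: i32) -> i32 {
    max_of!(a, b, c)
}
```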

Chapter 21: Unsafe Code

unsafe code lets you tell Rust to trust you on this one.
use unsafe as keyword, use on fn or block
can use raw pointers and methods to allow unconstrained access to memory, access mutable static variables, use foreign function interface
1) bugs that occur before unsafe block can break contracts, 2) consequences might occur outside unsafe block
good set of rules for Rust programs:
- must not read uninitialized memory
- must not create invalid primitive values
- no references outlive referents
- shared access is read only, mutable access is exclusive
- must not dereference null, or dangling pointers
- must not use pointers to access memory outside the allocation they are associated with
- free of data races
- must not unwind across a call made from another language
- comply with contracts from std library functions
raw pointer: unconstrained pointer - Rust can’t tell if you’re using them safely. deref only in unsafe block
two types of raw pointers: *mut T, *const T
. operator doesn’t auto-deref. have to be explicit with (*thing).field
comparison operators use addresses; only equal if they share same address location
“[complete, exact contract for raw pointers is not easily stated, and might change]”
null raw pointer is a zero address
foreign function interface -> lets rust call functions written in C, or C++
extern block: declares functions or variables defined in some other library that Rust is linked with
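
A sketch of raw pointers: creating them is safe, dereferencing needs unsafe, and comparison is by address. Pair and the function names are hypothetical:

```rust
struct Pair {
    left: i32,
    right: i32,
}

// Swaps two integers through raw pointers. Sound because the pointers
// come from two distinct `&mut` references.
fn swap_via_raw(a: &mut i32, b: &mut i32) {
    let pa: *mut i32 = a;
    let pb: *mut i32 = b;
    unsafe {
        let tmp = *pa;
        *pa = *pb;
        *pb = tmp;
    }
}

/// # Safety
/// `ptr` must be null or point to a valid, initialized `Pair`.
unsafe fn sum_or_zero(ptr: *const Pair) -> i32 {
    if ptr.is_null() {
        0
    } else {
        // No auto-deref on raw pointers: write (*ptr).field explicitly.
        unsafe { (*ptr).left + (*ptr).right }
    }
}

fn raw_pointer_demo() -> (i32, i32, bool) {
    let pair = Pair { left: 2, right: 40 };
    let p: *const Pair = &pair;
    let same: *const Pair = &pair;
    let total = unsafe { sum_or_zero(p) };
    let null_total = unsafe { sum_or_zero(std::ptr::null()) };
    (total, null_total, p == same) // == compares addresses
}
```

Marking sum_or_zero as unsafe fn pushes the contract onto the caller, who has to promise that the pointer is valid.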