
Rust101


Ownership

  1. Each value has exactly one owner: the single variable bound to it.
  2. When the owner goes out of scope, the value is dropped, i.e. its memory is deallocated.
  3. Ownership is moved by default on assignment; types that implement Copy are copied instead.

Moving and Copying

  1. Primitive values like i32 and bool implement Copy, so they are copied by default instead of moved.
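
A minimal sketch of move vs. copy (variable names are illustrative):

fn main() {
    // i32 implements Copy: `a` is still usable after the assignment.
    let a = 5;
    let b = a;
    println!("{a} {b}");

    // String does not implement Copy: ownership moves to `t`.
    let s = String::from("hello");
    let t = s;
    // println!("{s}"); // compile error: value used after move
    println!("{t}");
}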

Borrowing

  1. Borrowing allows temporary references (&T or &mut T) without transferring ownership.
  2. At any given time you can have either one mutable reference or any number of immutable references, but not both.
  3. A reference must always be valid.

!(Mut + UnMut)

  1. Cannot have a mutable and an immutable reference to the same value at the same time.
  2. Cannot have multiple mutable references; this prevents two places from racing to mutate the same value and corrupting data.
  3. Can have any number of immutable references.
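
A small sketch of these borrowing rules (identifiers are made up):

fn main() {
    let mut s = String::from("hello");

    // Any number of immutable references may coexist.
    let r1 = &s;
    let r2 = &s;
    println!("{r1} {r2}");
    // r1 and r2 are not used after this point, so a mutable borrow is now allowed.

    let m = &mut s;
    m.push_str(" world");
    // let r3 = &s; // compile error: cannot borrow `s` as immutable while `m` is live
    println!("{m}");
}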

Invalid reference

The following fails because “a” goes out of scope at the end of the dangling fn and is dropped, so the returned reference would point at a value that no longer exists, i.e. you cannot borrow from a dropped value.

fn main() {
    let aref = dangling();
}

fn dangling() -> &String {
    // does not compile: `a` is dropped at the end of this function,
    // so the returned reference would dangle
    let a = String::from("Hello");

    &a
}
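
One way to make this compile, as a sketch, is to return the owned String instead of a reference; ownership moves out to the caller and nothing is dropped early:

fn main() {
    let a = no_dangle();
    println!("{a}");
}

fn no_dangle() -> String {
    let a = String::from("Hello");
    a // ownership of the String moves out to the caller
}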

Slice

fn main() {
    let s = String::from("Hello world");
    let fw = first_word(&s);
    println!("{}", fw);
}

fn first_word(s: &String) -> &str {
    for (i, &item) in s.as_bytes().iter().enumerate() {
        if item == b' ' {
            return &s[..i];
        }
    }

    return &s[..];
}

Why &s[..] instead of s[..]?

  • s[..] means “slice from 0 to end”

  • s[..] dereferences s: &String to String, and indexing the String with .. yields the underlying str (via Deref<Target = str>)

  • So s[..] gives a str (unsized)

  • Wrapping with &: &s[..] is a &str (sized, usable)

  • String owns heap memory

  • &String borrows the whole struct

  • &str borrows just the actual text (slice of bytes)

  • You need &s[..] to convert a &String to a &str

| Expression | Type    | Notes                         |
|------------|---------|-------------------------------|
| s          | &String | Reference to the whole String |
| *s         | String  | Dereferenced String           |
| (*s)[..]   | str     | Underlying string slice       |
| &s[..]     | &str    | Reference to string slice ✅  |

Modules

Scenario 1: Separate modul.rs file

  • DO NOT wrap it in mod modul { ... }
  • Instead, declare it in main.rs using mod modul;
  • Then use it with use modul::… or modul::…

Layout

With this example layout

src/
├── main.rs
└── modul.rs
// src/main.rs
mod modul; // loads src/modul.rs

use modul::Breakfast;

fn main() {
    let summer_bf = Breakfast::summer_menu();
    println!("{:?}", summer_bf);
}
// src/modul.rs
#[derive(Debug)]
pub struct Breakfast {
    pub food: String,
    pub drink: String,
}

impl Breakfast {
    pub fn summer_menu() -> Breakfast {
        Breakfast {
            food: "Toast".into(),
            drink: "Orange juice".into(),
        }
    }
}

mod modul; tells the compiler to load the external file, and the file must not declare mod modul again — that would be redundant and cause a nested path like modul::modul::Breakfast.

Scenario 2: You define the module inline inside main.rs

  • DO use mod modul { ... } directly
  • No need for mod modul; or separate file
  • You can still use modul::… if you like
mod modul {
    #[derive(Debug)]
    pub struct Breakfast {
        pub food: String,
        pub drink: String,
    }

    impl Breakfast {
        pub fn summer_menu() -> Breakfast {
            Breakfast {
                food: "Toast".into(),
                drink: "Orange juice".into(),
            }
        }
    }
}

use modul::Breakfast;

fn main() {
    let summer_bf = Breakfast::summer_menu();
    println!("{:?}", summer_bf);
}

Separate folder submodules

src/
├── outermodul/
│   └── innermodul.rs
├── outermodul.rs   // declares `mod innermodul;`
└── main.rs         // declares `mod outermodul;`

If there is an inner module, its file must live in a folder with the same name as the outer module. The outer module's file declares it with mod innermodul;, and it is then accessed through the outer module.

If adding a bunch of utils to the same folder

Rust expects a mod.rs file in that directory.

src/
└── utils/
    ├── mod.rs
    ├── math.rs
    ├── stringconcat.rs
    └── extracttoken.rs

Every submodule file that we want to use from other files must be declared (brought into scope) in the mod.rs file.

pub mod math;
pub mod stringconcat;
pub mod extracttoken;

In main.rs

mod utils;  // loads utils/mod.rs and submodules

fn main() {
    // To access math and concat functions:
    let sum = utils::math::add(1, 2);
    let combined = utils::stringconcat::concat_str("foo", "bar");
}
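
For completeness, a sketch of what the submodule files might contain; add and concat_str are assumed helper functions, not from any library:

// src/utils/math.rs
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

// src/utils/stringconcat.rs
pub fn concat_str(a: &str, b: &str) -> String {
    format!("{a}{b}")
}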
| Scenario                         | How to declare submodules in mod.rs     | How to declare in main.rs        | How to use modules           |
|----------------------------------|-----------------------------------------|----------------------------------|------------------------------|
| Folder with mod.rs               | pub mod math; pub mod concat; in mod.rs | mod utils; in main.rs            | utils::math::func()          |
| Flat files, no mod.rs, no folder | N/A                                     | mod math; mod concat; in main.rs | math::func(), concat::func() |

Copy and move

In the following example, both of these will work:

  1. println!("{}", &v[0]);
  2. println!("{}", v[0]);
pub fn collections() {
    let mut v: Vec<i32> = Vec::new();
    let v2 = vec![1, 2, 3, 4];

    v.push(1);

    println!("{}", &v[0]);
}

When using v[0] or &v[0]

  1. i32 by default “copies” to another variable when it is assigned.

  2. String and structs when assign to another var need to be passed as ref, else they will be “moved” on default. Can be cloned with .clone(). Compile err when trying to access the same after move, if not passed as ref initially.

  3. println accepts ref of variable. println!("{}", x); will auto-borrow x if it implements Display by reference. E.g. String, Vec, custom structs, if v[0] is used. If not, it will accept a clone (internally), If String is passed, will auto deref to &String which implements Display interface.

  4. println!("{}", v[0]); will work even if vector had a string at index 0, because by default [] indexing fn returns the reference not the owner and the auto deref will automatically use the &String from String.

  5. But let s = vStr[0]; will not compile if idx 0 is string

    1. We get a reference like before, but String type cant copy by default
    2. But moving from a vector isnt allowerd.
  6. This is why let a = vStr[0]; doesnt work but both println!("{}", &v[0]) and println!("{}", v[0]) work.

From ChatGPT:

| Code                         | Borrowed? | Moved? | Why it works                            |
|------------------------------|-----------|--------|-----------------------------------------|
| println!("{}", v[0])         | ✅ yes    | ❌ no  | v[0] returns a reference; auto-borrowed |
| println!("{}", &v[0])        | ✅ yes    | ❌ no  | Explicitly borrowed                     |
| println!("{}", v[0].clone()) | ❌ no     | ✅ yes | Clones and moves the cloned value       |
| let x = v[0];                | ❌ no     | ❌ no  | ❌ Fails for non-Copy types like String |
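
A minimal sketch of the String case (the vector and variable names are illustrative):

fn main() {
    let v_str = vec![String::from("alpha"), String::from("beta")];

    println!("{}", v_str[0]);  // ok: auto-borrowed inside println!
    println!("{}", &v_str[0]); // ok: explicit borrow

    let s_ref = &v_str[0];          // ok: borrow, no move
    let s_owned = v_str[0].clone(); // ok: clone gives an owned copy
    // let s = v_str[0];            // compile error: cannot move out of index of Vec<String>

    println!("{s_ref} {s_owned}");
}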

Generics With more Ownership understandings

Functions accessible based on their traits

Below is a generic example where, depending on the concrete type (and the traits it implements), the object gets access to different methods.

#[derive(Debug)]
struct Point<T,U>  {
    x: T,
    y: U,
}

impl<T: Copy, U> Point<T,U> { // get_x is accessible only for variables with copy trait
    fn get_x(&self) -> T {
        self.x
    }
}

impl<T> Point<T,f64> {
    fn get_floaty(&self) -> f64 {
        self.y
    }
}

fn enumgenr() {
    let float_pt = Point { x: 5, y: 3.5 };
    float_pt.get_floaty();
    float_pt.get_x();
    let int_pt = Point { x: 5, y: 3 };
    // int_pt.get_floaty(); // won't work: y is not f64
    int_pt.get_x();
}

Complex generic use case

We try to combine two different generics and create an output combining their types.

Calling function to give context

  1. All points contain x and y.
  2. There are two examples here: one with a method bounded by the Copy trait and another bounded by the Clone trait. (Treat them as alternatives; in a single program the two impl blocks would conflict, since every Copy type is also Clone.)
  3. Both examples create two points each and combine x from pt1 with y from pt2.
fn mixup() {
    // Copy trait
    let a = Point {x: 1, y: 3.0};
    let b = Point {x: "Hello", y: 'c'};

    let c = a.mix(b);
    println!("{:?}", c);
    println!("{:?}", a);

    // Clone trait
    let d = Point {x: String::from("tasd"), y: 3.0};
    let e = Point {x: String::from("Hello"), y: 'c'};

    let f = d.mix(e);
    println!("{:?}", f);
    println!("{:?}", d);
}

Copy Trait

  1. Supports T with the Copy trait, e.g. i32, char, &'static str (fixed size, no heap ownership).
  2. &self is a reference, so x is always copied out of it. If self were used instead of &self, the struct “a” would be moved into the method.
  3. That means “a” would no longer be usable after the call; not because of the copy itself but simply because ownership is transferred whenever a value is passed instead of a reference.
impl <T: Copy, U> Point<T, U> {
    fn mix<V, W>(&self, pt: Point<V, W>) -> Point<T, W> {
        Point {x: self.x, y: pt.y }
    }
}

Clone Trait

  1. Supports T with the Clone trait, e.g. String (heap-allocated, can grow).
  2. Here the value must be explicitly cloned, since it is not allowed to be moved out from behind a reference.
  3. &self is a shared reference to self, so we only have borrowed access.
impl <T: Clone, U> Point<T, U> {
    fn mix<V, W>(&self, pt: Point<V, W>) -> Point<T, W> {
        Point {x: self.x.clone(), y: pt.y }
    }
}

Clone trait without &self

  1. Supports T with the Clone trait, e.g. String (heap-allocated, not Copy).
  2. This uses self instead of &self, meaning the method takes ownership of the whole struct.
  3. Because self is owned (not a reference), we are allowed to move x out of it directly.
  4. No need to clone here, even though T isn't Copy, because the method owns the value and the value is simply moved.
  5. Any code that tries to use the struct after it has been passed to this method will cause a compile error.
impl <T: Clone, U> Point<T, U> {
    fn mix<V, W>(self, pt: Point<V, W>) -> Point<T, W> {
        Point {x: self.x, y: pt.y }
    }
}

&x.y always means “take a reference to x.y”, regardless of whether x is already a reference or not.

Lifetimes

Manual generic lifetime annotation is necessary only if the Rust compiler couldn't fit our scenario into one of the elision rules below.

  1. The first rule is that the compiler assigns a different lifetime parameter to each input reference: fn foo<'a, 'b>(x: &'a i32, y: &'b i32). (Note this signature has no return type; if it returned a reference, there would be an ambiguity requiring manual annotation.)

  2. The second rule is that, if there is exactly one input lifetime parameter, that lifetime is assigned to all output lifetime parameters: fn foo<'a>(x: &'a i32) -> &'a i32

  3. The third rule is that, if there are multiple input lifetime parameters, but one of them is &self or &mut self because this is a method, the lifetime of self is assigned to all output lifetime parameters.
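
A small sketch of the third rule; the Wrapper type and its method are made up for illustration, and the elided output lifetime is taken from &self:

struct Wrapper {
    text: String,
}

impl Wrapper {
    // Elided form; the compiler reads it as
    // fn head<'a, 'b>(&'a self, _other: &'b str) -> &'a str
    fn head(&self, _other: &str) -> &str {
        self.text.split_whitespace().next().unwrap_or("")
    }
}

The longest function below is the classic case where none of these rules apply (two input references, no &self), so a manual annotation is required.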

pub fn main() {
    let x = String::from("aaaa");
    let y = String::from("aa");
    longest(&x, &y);
}

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
  1. Both inputs to longest are references.

  2. Rust does not allow dangling references, i.e. references whose value may already have been destroyed or gone out of scope.

  3. In this example the function always returns either x or y, so a valid reference is returned; but the compiler can't tell whether the returned reference has the lifetime of x or of y, and that's why explicit lifetime annotation is required.

  4. So we tell it to use a generic lifetime ('a).

  5. Generic lifetime annotations define the relationship between the lifetimes of multiple references, i.e. how they relate to each other.

  6. They don't change any lifetime; they just let the compiler know how all the references in scope relate.

  7. By adding the same lifetime 'a to both references and the return type, we create a relationship: fn longest<'a>(x: &'a str, y: &'a str) -> &'a str.

  8. The relationship is that both inputs and the return value share the same lifetime parameter.

  9. This doesn't change any actual lifetime; 'a is resolved to the smallest of the argument lifetimes, and the returned reference is only valid within it.

The following will fail to compile

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let string1 = String::from("long string is long");
    let result;
    {
        let string2 = String::from("xyz");
        result = longest(&string1, &string2);
    }
    println!("The longest string is '{result}'");
}
  1. The lifetime shared by the input references and the return value is the shortest of the input lifetimes.
  2. string2 has the shortest lifetime.
  3. Trying to use result after string2's lifetime has ended makes the compiler error: string2 does not live long enough.

Same rules apply to Structs holding references

struct Book<'a> {
    author: &'a str,
    title: &'a str,
}

pub fn main() {
    let book = Book {
        author: "George Orwell",
        title: "1984",
    };

    println!("{} by {}", book.title, book.author);
}

All string literals have the 'static lifetime, since they are stored directly in the program's binary.

Smart Pointers

  • Smart pointers are not simply references to values in memory; they can be data structures with additional metadata and capabilities.
  • In many cases smart pointers own the data they point to, e.g. String and Vec.
  • They are structs that implement the Deref and Drop traits.
    • Deref trait: code that runs when the pointer is dereferenced; it lets the struct be used like a reference. Example: Box<T>, Rc<T>, and String all implement Deref to behave like pointers/references to their inner data (T or str).
    • Drop trait: like a destructor; it specifies what happens when the smart pointer goes out of scope.
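
A minimal sketch of a custom smart pointer implementing both traits; MyBox is a made-up type for illustration:

use std::ops::Deref;

struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> Drop for MyBox<T> {
    fn drop(&mut self) {
        println!("Dropping MyBox");
    }
}

fn main() {
    let b = MyBox(String::from("hi"));
    println!("{}", b.len()); // Deref chain: MyBox<String> -> String -> str, so len() resolves
} // Drop::drop runs here, printing "Dropping MyBox"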

Box<T>

  • When the size of a type isn't known at compile time, e.g. trees and linked lists are recursive data structures whose actual size can't be known at compile time (see the sketch below).
  • When we need to transfer ownership of a large amount of data without copying it.
  • When we want to hold any value that implements a specific trait without caring about its concrete type, e.g. Box<dyn Error> holds a value of whatever type implements the Error trait.
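
A small sketch of the recursive-type case, using a made-up cons-style List; without Box the compiler could not compute the size of List:

#[derive(Debug)]
enum List {
    Cons(i32, Box<List>), // Box gives the recursive field a known, fixed size (a pointer)
    Nil,
}

use List::{Cons, Nil};

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Nil))));
    println!("{:?}", list);
}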

RefCell

  • RefCell<T> enables interior mutability: modifying data even when the outer variable is immutable.
  • Borrowing rules (exclusive mutable or multiple immutable borrows) are checked at runtime instead of compile time.

  • We wrap an Rc around a RefCell because Rc on its own doesn't let us take a mutable reference to its contents.
  • Once we deref the Rc, RefCell gives us a runtime-checked mutable borrow.
  • Otherwise Rc would force us to either
    • clone the inner value, which would no longer mutate the shared value,
    • or move the value out, in which case we would lose the “value” handle we currently hold.
use std::{cell::RefCell, rc::Rc};

// Assumed list enum for this snippet (not shown in the original notes):
#[derive(Debug)]
enum List {
    Cons(i32, Rc<List>),
    ConsRef(Rc<RefCell<i32>>, Rc<List>),
    Nil,
}
use List::{Cons, ConsRef, Nil};

let value = Rc::new(RefCell::new(123));
let aa = Rc::new(ConsRef(Rc::clone(&value), Rc::new(Nil)));

let bb = Cons(111, Rc::clone(&aa));
let cc = Cons(222, Rc::clone(&aa));

println!("Count of ref to aa: {}", Rc::strong_count(&aa));

println!("{:#?}", bb);
*value.borrow_mut() = 25; // bb and cc both see 25 through aa
println!("{:#?}", bb);

Weak pointers

  1. Rc - shared references.
  2. RefCell - runtime-checked mutability.
  3. The Node struct (below) can have a parent and a Vec of children.
  4. A ::new method creates default values for parent and children on Node creation.

In main

  1. Create two nodes, parent and child.
  2. Push the child into the parent's children vector through a runtime RefCell borrow.
  3. Set a weak reference from child to parent through the same kind of RefCell borrow.

Implications

  1. The child can't keep the parent alive, i.e. once the parent goes out of scope it is dropped; the weak reference (created with Rc::downgrade) neither prevents this nor lingers on the heap as an orphan.
  2. The parent holds strong references, hence it owns the children.
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>
}

impl Node {
    fn new(name: &str) -> Node {
        Node {
            name: name.to_string(),
            parent: RefCell::new(Weak::new()),
            children: RefCell::new(vec![])
        }
    }
}

fn main() {
    let parent = Rc::new(Node::new("new"));
    let child = Rc::new(Node::new("child"));

    parent.children.borrow_mut().push(Rc::clone(&child));
    println!("After adding child");
    println!("Parent Weak: {} Strong: {}", Rc::weak_count(&parent),  Rc::strong_count(&parent));
    println!("Child Weak: {} Strong: {}", Rc::weak_count(&child),  Rc::strong_count(&child));

    *child.parent.borrow_mut() = Rc::downgrade(&parent);
    println!("After adding weak ref");
    println!("Parent Weak: {} Strong: {}", Rc::weak_count(&parent),  Rc::strong_count(&parent));
    println!("Child Weak: {} Strong: {}", Rc::weak_count(&child),  Rc::strong_count(&child));

}

Concurrency

Concurrent: different parts of a program make progress independently, but not necessarily at the same time. Tasks can overlap in time (e.g., interleaved execution on a single CPU core).

Example:

  1. A web server handling multiple requests “simultaneously” by switching between them rapidly (even on a single core).
  2. Async programming in Rust (tokio, async/await) where tasks yield control while waiting for I/O.

Parallel: different parts of a program execute simultaneously (at the exact same time). Requires multiple CPU cores/threads.

Example:

  1. Running matrix multiplication on 4 CPU cores.

Threads

  1. The OS runs an app in a process.
  2. That process can spawn multiple threads.
  3. Each thread can run simultaneously, doing different work.
  4. Execution order and task scheduling are non-deterministic.
  5. Can lead to
    • Race conditions - two or more threads access/modify the same data in an unpredictable order.
    • Deadlocks - two or more threads each wait on the other to complete, so both wait indefinitely.
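
A minimal sketch of spawning a thread (the output interleaving is non-deterministic):

use std::thread;

fn main() {
    let handle = thread::spawn(|| {
        for i in 0..3 {
            println!("spawned thread: {i}");
        }
    });

    for i in 0..3 {
        println!("main thread: {i}");
    }

    handle.join().unwrap(); // wait for the spawned thread to finish
}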

Message passing

  • Instead of communicating by sharing memory, share memory by communicating.
  • Use channel to send messages between threads.
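
A minimal sketch using the standard library's mpsc channel (the message text is arbitrary):

use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        // The sender is moved into the thread; data is "shared" by sending it.
        tx.send(String::from("hello from worker")).unwrap();
    });

    let msg = rx.recv().unwrap(); // blocks until a message arrives
    println!("{msg}");
}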

Shared state Concurrency

We can either

  • Pass the state from one thread to another by moving the whole value.
  • Concurrently access the same state without moving it into any single thread.

Mutex

  • Mutual exclusion - only one thread can access the data/state at any point in time.
  • Get access to the state by acquiring the mutex lock.
  • Once the lock (the MutexGuard) goes out of scope, the lock is released and other threads can acquire it.

An example of a mutex where we lock inside an inner scope; printing the mutex afterwards shows Mutex { data: 6, poisoned: false, .. }.

  • Locking done inside a scope. Creates a MutexGuard.
  • Once that guard is out of scope, the guard is dropped.
  • So the lock can then be acquired by another thread.
use std::sync::Mutex;

fn mutx() {
    let m = Mutex::new(5);
    {
        let mut num: std::sync::MutexGuard<'_, i32> = m.lock().unwrap();
        *num = 6;
    } // guard dropped here, lock released
    println!("{:?}", m);
}

An example without the inner scope, accessing the mutex in the same scope without releasing the lock; printing shows Mutex { data: <locked>, poisoned: false, .. }.

  • The output shows the data as locked because the guard is still held; the same mutex cannot be locked again, even by the same thread.
  • Since the mutex is still locked when Debug tries to read it, it cannot show the value and falls back to printing the locked status.
use std::sync::Mutex;

fn mutx() {
    let m = Mutex::new(5);
    let mut num: std::sync::MutexGuard<'_, i32> = m.lock().unwrap();
    *num = 6;
    println!("{:?}", m); // `num` (the guard) is still alive here
}

Arc

Arc<T> (Atomic Reference Counting) is a thread-safe version of Rc<T>. Cloning the Arc handle gives each thread shared ownership of the same data.

Without Arc

let m = Mutex::new(0);
spawn(move || { /* m moved here */ }); 
spawn(move || { }); // Error: Can't move `m` into multiple threads.
  • Each thread's closure moves the captured value

    • causing ownership of the mutex to move into that thread,
    • so the next thread can't use the same mutex/shared state.
  • Wrap the mutex in an Arc for shared ownership: let m = Arc::new(Mutex::new(0));

  • Then each thread can clone the Arc (not the mutex): let m_arc = Arc::clone(&m);

  • Threads use the clone to reach the shared mutex without moving the actual mutex.

  • The Mutex still ensures exclusive access during mutation: let mut g = m_arc.lock().unwrap();

Spawn 10 threads that each increment the mutex value by 1.

use std::sync::{Arc, Mutex};
use std::thread::spawn;

fn mutx_arc() {
    let m = Arc::new(Mutex::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let m_arc = Arc::clone(&m);
        let handle = spawn(move || {
            let mut g = m_arc.lock().unwrap();
            *g += 1;
        });

        handles.push(handle);
    }

    for h in handles {
        h.join().unwrap();
    }

    println!("{:#?}", m);
}

Async Await

  • An async method returns a type that implements the Future trait.
  • A Future is a state machine that can be polled.
    • It tracks its state internally (e.g., “not started”, “waiting on I/O”, “completed”).
  • The Future trait has two components
    • an associated Output type
    • a poll method
  • Each poll returns an enum with two variants
    • Pending - not ready yet; the future stores the Waker (from the Context) and calls it later to tell the executor to poll again.
    • Ready - the data is available and returned as the Output value.
  • Futures are lazy, so unless awaited or handed to an executor nothing will happen. (A minimal hand-written future is sketched below.)
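
A minimal hand-written future, as a sketch, showing the Output type and a poll that returns Ready; ReadyString is made up, and real futures are normally generated by async fn:

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

struct ReadyString(Option<String>);

impl Future for ReadyString {
    type Output = String;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        // A future that isn't done would return Poll::Pending and arrange for
        // the waker to be called later; this one is ready immediately.
        Poll::Ready(self.0.take().expect("polled after completion"))
    }
}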

Tokio

  • When calling an async fn x that might contain multiple async/await calls inside it,

    • That x function will have a state machine that has multiple states within it.
    • Once each inner await is completed and their respective outputs are obtained, the states change from pending to complete
    • Then as a whole the output is obtained on the fn that polls or awaits it.
    • Compilers state machine below
    enum XFuture {
        State0,                // Initial state
        State1(DBFuture),      // Waiting for first `call_db()`
        State2(String),        // Ready to print first result
        State3(DBFuture),      // Waiting for second `call_db()`
        State4(String),        // Ready to print second result
        Completed,             // Final state
    }
    • Each .await is a state transition (e.g., State0 → State1 when first call_db() starts).
    • The executor polls this future repeatedly, advancing the state machine.
  • We need an async runtime to execute async/await; without one we cannot even mark the main fn as async.

  • So we use an async runtime like Tokio to provide it.

#[tokio::main]
async fn main() {
    x().await;
}

async fn x() {
    let v = call_db().await;  // State 1: Waiting for `call_db()`
    println!("Db call 1 {}", v);  // State 2: Print first result

    let v2 = call_db().await;  // State 3: Waiting for `call_db()` again
    println!("Db call 2 {}", v2);  // State 4: Print second result
}

async fn call_db() -> String {
    "database value".to_owned()
}

Tokio task

  • “Green thread” refers to a lightweight, user-space thread managed by the Tokio runtime (not the OS).
  • Tasks let top-level futures, e.g. fn x, execute concurrently (and in parallel on a multi-threaded runtime).
  • With OS threads you can have more threads than CPU cores, but that means switching between them; too many threads leads to resource starvation.
| Feature         | Tokio Task (Green Thread)          | OS Thread                  |
|-----------------|------------------------------------|----------------------------|
| Managed by      | Tokio runtime (user-space)         | OS kernel                  |
| Memory overhead | ~1-2 KB per task                   | ~1-10 MB per thread        |
| Creation cost   | Extremely cheap (nanoseconds)      | Expensive (microseconds)   |
| Scheduling      | Cooperative (non-preemptive)       | Preemptive (OS-controlled) |
| Parallelism     | Runs on thread pool (configurable) | 1:1 with kernel threads    |

Task execution

#[tokio::main]
async fn main() {
    // call_api and call_other_service are placeholder async fns, analogous to call_db above
    let task1 = tokio::spawn(async {
        call_db().await; // (A)
        call_api().await; // (B)
    });

    let task2 = tokio::spawn(async {
        call_other_service().await; // (C)
    });

    tokio::join!(task1, task2);
}
  • Single-Threaded Runtime: (Concurrency)

    1. Time 0-1ms: Task1 (call_db) → Pending (yields)
    2. Time 1-2ms: Task2 (call_other_service) → Ready
    3. Time 2-3ms: Task1 (call_api) → Ready
  • Multi-Threaded Runtime (2 cores): (Parallelism)

    1. Core 1: Task1 (call_db) ──────→ Task1 (call_api)
    2. Core 2: Task2 (call_other_service) ──→ Done
  • Within a task: .await enforces serial execution.

  • Across tasks:

    • Single-core → Concurrent (time-sliced).
    • Multi-core → Parallel + Concurrent.
  • Tokio optimizes I/O: Thousands of tasks can “wait” without blocking threads.

Send and Sync traits - TO DO
