Asteroid Dodger: Concurrency

Tags: advanced, Rust, concurrency, rayon, game dev, optimization

This tutorial picks up where the large map and minimap tutorial left off. With a 4000x4000 world and hundreds of asteroids, the update loop does real work each frame. Here you’ll use Rust’s concurrency model to parallelize asteroid updates and collision pair generation with rayon, a library that makes data parallelism nearly effortless.

What you’ll add

Milestones overview

Milestone 1: Parallel Iteration (rayon, par_iter_mut, Send and Sync)
Milestone 2: Parallel Collision (grid pair generation, when not to parallelize)

Prerequisites


Milestone 1 — Parallel iteration with rayon


Step 1 — Add rayon and parallelize asteroid updates

Add rayon to the project:

cargo add rayon

This adds a rayon dependency (a 1.x release) to Cargo.toml. Rayon provides a work-stealing thread pool and drop-in parallel iterators. By default the pool has one worker thread per logical CPU core, and rayon distributes work across them automatically. You don’t manage threads, locks, or channels: you swap .iter() for .par_iter() and rayon handles the rest.

In src/game.rs, add the rayon prelude:

use rayon::prelude::*;

Find the asteroid update loop in Game::update. It currently looks like this:

for asteroid in &mut self.asteroids {
    asteroid.update(dt);
}

Replace it with:

self.asteroids.par_iter_mut().for_each(|asteroid| {
    asteroid.update(dt);
});

par_iter_mut() splits the slice across rayon’s thread pool. Each asteroid’s update method only reads and writes its own pos and vel fields — no shared state, no aliasing. The borrow checker already enforces this: &mut Asteroid gives exclusive access to one asteroid, so two threads can never touch the same one.
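To see what that split amounts to, here is a rough hand-written equivalent built on the standard library's scoped threads. It is only a sketch, not rayon's implementation: rayon uses a work-stealing pool rather than fixed chunks, and the `Vec2`, `Asteroid`, and `update_all` names below are simplified stand-ins assumed for illustration.

```rust
use std::thread;

// Simplified stand-ins for the game's types (assumed, not the real definitions).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec2 {
    pub x: f32,
    pub y: f32,
}

#[derive(Clone, Debug, PartialEq)]
pub struct Asteroid {
    pub pos: Vec2,
    pub vel: Vec2,
}

impl Asteroid {
    // Touches only this asteroid's own fields.
    pub fn update(&mut self, dt: f32) {
        self.pos.x += self.vel.x * dt;
        self.pos.y += self.vel.y * dt;
    }
}

// Update every asteroid, giving each worker thread its own disjoint chunk.
pub fn update_all(asteroids: &mut [Asteroid], dt: f32, workers: usize) {
    if asteroids.is_empty() {
        return;
    }
    let workers = workers.max(1);
    // Ceiling division so that at most `workers` chunks are produced.
    let chunk_len = (asteroids.len() + workers - 1) / workers;
    thread::scope(|scope| {
        // chunks_mut hands out non-overlapping `&mut [Asteroid]` slices,
        // so no two threads can ever reach the same asteroid.
        for chunk in asteroids.chunks_mut(chunk_len) {
            scope.spawn(move || {
                for asteroid in chunk {
                    asteroid.update(dt);
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
}
```

Spawning threads every frame like this is far more expensive than reusing rayon's pool, which is one reason the one-line par_iter_mut() version is the better choice in the game itself.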

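Rayon requires the items to be Send, meaning they can safely move to another thread, and the closure passed to for_each to be Send + Sync, because every worker thread shares that one closure. &mut Asteroid is Send because Asteroid is Send: it contains only Vec2, f32, Vec<Vec2>, and u8. These are all plain data with no interior mutability or reference counting. The closure captures only dt, a plain number, so it qualifies too. The compiler checks this at build time; if it’s not safe, it won’t compile.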

Step 2 — Send and Sync

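Two marker traits govern Rust’s concurrency safety:

Send: a value of the type can be moved to another thread.
Sync: a value of the type can be shared between threads by reference. T is Sync exactly when &T is Send.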

Most Rust types implement both automatically. The compiler derives them unless a type contains something inherently thread-unsafe. Here’s what would break:

use std::rc::Rc;

pub struct Asteroid {
    pub pos: Vec2,
    pub vel: Vec2,
    pub radius: f32,
    pub vertices: Vec<Vec2>,
    pub generation: u8,
    pub label: Rc<String>,  // Rc is NOT Send
}

With that Rc field, Asteroid is no longer Send. Calling par_iter_mut() on a Vec<Asteroid> would fail with a compile error like:

error[E0277]: `Rc<String>` cannot be sent between threads safely

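This is “fearless concurrency”: the compiler rejects data races at compile time instead of leaving them to surface at runtime. In safe Rust you never get silent memory corruption from unsynchronized access. Compiling does not rule out every concurrency bug (deadlocks and logic-level race conditions are still possible), but data races are off the table.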
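A common way to check these bounds yourself is a generic function that does nothing at runtime and only compiles when the bound holds. The sketch below assumes two hypothetical structs, `PlainAsteroid` and `LabeledAsteroid`, standing in for the Asteroid variants discussed above; the helper names are also illustrative.

```rust
use std::rc::Rc;
use std::sync::Arc;

// These helpers do nothing at runtime. They only compile when the
// type argument satisfies the trait bound.
pub fn assert_send<T: Send>() {}
pub fn assert_sync<T: Sync>() {}

// Hypothetical plain-data struct, similar in spirit to the game's Asteroid.
pub struct PlainAsteroid {
    pub radius: f32,
    pub vertices: Vec<(f32, f32)>,
    pub generation: u8,
}

// Hypothetical variant carrying a non-atomic reference count.
pub struct LabeledAsteroid {
    pub radius: f32,
    pub label: Rc<String>,
}

pub fn check_thread_safety() {
    assert_send::<PlainAsteroid>();
    assert_sync::<PlainAsteroid>();
    assert_send::<Arc<String>>();

    // Uncommenting the next line should fail to compile with error E0277,
    // because Rc<String> is not Send:
    // assert_send::<LabeledAsteroid>();
}
```

Dropping a call like assert_send::<Asteroid>() into a test is a cheap way to catch the moment someone adds a thread-unsafe field.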

Rc vs Arc

Rc<T> uses non-atomic reference counting, which is fast but single-threaded only. Arc<T> uses atomic reference counting, which is slightly slower but makes Arc<T> Send + Sync whenever T itself is Send + Sync. If you ever need shared ownership across threads, swap Rc for Arc. In this game, asteroids own their data outright, so neither is needed.
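For reference, shared ownership across threads with Arc might look like the following minimal sketch. The `share_label` function is hypothetical and uses plain std::thread rather than rayon; nothing in the game needs it.

```rust
use std::sync::Arc;
use std::thread;

// Spawn `workers` threads that all read the same heap-allocated label.
// Returns the label length observed by each thread.
pub fn share_label(label: &str, workers: usize) -> Vec<usize> {
    let shared: Arc<String> = Arc::new(label.to_string());
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Arc::clone bumps the atomic reference count; the String is not copied.
            let local = Arc::clone(&shared);
            thread::spawn(move || local.len())
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Replacing Arc with Rc in this function would stop it from compiling, because thread::spawn requires its closure to be Send.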


Milestone 2 — Parallel collision


Step 3 — Parallel pair generation

The spatial grid’s potential_pairs() method iterates cells sequentially, collecting pairs of asteroid indices that share a cell. With a large world and many cells, this is a good candidate for parallelism — each cell can be processed independently.

Add a new method to your Grid implementation in src/collision.rs:

pub fn potential_pairs_parallel(&self) -> Vec<(usize, usize)> {
    use rayon::prelude::*;
    // Each cell is processed independently and returns its own list of pairs.
    let cell_pairs: Vec<Vec<(usize, usize)>> = self.cells.par_iter()
        .map(|cell| {
            let mut pairs = Vec::new();
            for a in 0..cell.len() {
                for b in (a + 1)..cell.len() {
                    // Canonical order: smaller asteroid index first.
                    let pair = if cell[a] < cell[b] {
                        (cell[a], cell[b])
                    } else {
                        (cell[b], cell[a])
                    };
                    pairs.push(pair);
                }
            }
            pairs
        })
        .collect();
    // Merge the per-cell lists, then drop pairs reported by more than one cell.
    let mut all_pairs: Vec<(usize, usize)> = cell_pairs.into_iter().flatten().collect();
    all_pairs.sort_unstable();
    all_pairs.dedup();
    all_pairs
}

The pattern: each cell produces its own Vec of pairs. No shared mutable state — each closure works with an independent local Vec. After all cells are processed, flatten merges the results into a single list.

Asteroids that span multiple cells appear in multiple cell lists, creating duplicate pairs. The canonical ordering (if cell[a] < cell[b]) ensures duplicates are identical, then sort_unstable + dedup removes them. This is faster than checking contains() on every insert: contains() is O(n) per check, so O(n²) across n pairs, while sort + dedup is O(n log n) total.
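The deduplication step is easy to try in isolation. The sketch below is a sequential, standard-library-only version with a hypothetical name, `unique_pairs`, that takes the cell lists directly; it is meant to show the canonical ordering and sort + dedup, not the parallelism.

```rust
// Build candidate pairs from grid cells, where each cell lists the indices
// of the asteroids overlapping it. Sequential on purpose: the point here is
// the deduplication step, not the parallelism.
pub fn unique_pairs(cells: &[Vec<usize>]) -> Vec<(usize, usize)> {
    let mut pairs = Vec::new();
    for cell in cells {
        for (i, &a) in cell.iter().enumerate() {
            for &b in &cell[i + 1..] {
                // Canonical order: smaller index first.
                pairs.push((a.min(b), a.max(b)));
            }
        }
    }
    // Sorting puts identical pairs next to each other so dedup can drop them.
    pairs.sort_unstable();
    pairs.dedup();
    pairs
}
```

With cells [[0, 1], [1, 0, 2]], asteroids 0 and 1 share both cells, so the pair (0, 1) is generated twice; the result is [(0, 1), (0, 2), (1, 2)].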

In Game::update, replace the potential_pairs() call with potential_pairs_parallel():

let pairs = self.grid.potential_pairs_parallel();

Step 4 — When NOT to parallelize

Not everything benefits from threads. Here are the cases in this game where parallelism doesn’t help or actively hurts.

The game loop is sequential. Each frame runs update, then draw, then next_frame().await. These steps depend on each other — draw reads the state that update just wrote. This pipeline stays single-threaded.

Drawing is not thread-safe. macroquad’s draw functions (draw_circle, draw_triangle, draw_line) use global state internally. Calling them from multiple threads would corrupt that state. All rendering and input handling must happen on the main thread.

macroquad’s main thread requirement

macroquad uses thread-local storage for its rendering context. All draw calls and input reads (is_key_down, mouse_position) must happen on the main thread. Only pure computation — physics, collision math, AI, pathfinding — benefits from parallelism.
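To make the thread-local point concrete, here is a small standard-library sketch of how thread-local state behaves in general. It does not reproduce macroquad's internals; `DRAW_CALLS`, `record_draw`, and `compare_threads` are hypothetical names used only for illustration.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Hypothetical stand-in for a per-thread rendering context.
    static DRAW_CALLS: Cell<u32> = Cell::new(0);
}

// Record one draw call on the current thread and return that thread's total.
pub fn record_draw() -> u32 {
    DRAW_CALLS.with(|calls| {
        calls.set(calls.get() + 1);
        calls.get()
    })
}

// Returns (count seen on the calling thread, count seen on a fresh thread).
pub fn compare_threads() -> (u32, u32) {
    record_draw();
    let here = record_draw();
    // A new thread gets its own copy of DRAW_CALLS, starting from 0 again.
    let there = thread::spawn(record_draw)
        .join()
        .expect("worker thread panicked");
    (here, there)
}
```

The fresh thread always reports 1, no matter how many calls the calling thread has made. A worker thread likewise cannot see a context that was set up on the main thread.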

Collision resolution can’t be trivially parallelized. The resolve_collisions function processes pairs sequentially because each collision modifies both asteroids in the pair. Two pairs might share an index — asteroid 5 might collide with both asteroid 3 and asteroid 7 in the same frame. Processing those pairs in parallel would create a data race on asteroid 5’s position and velocity. The compiler would reject this: you can’t have two &mut references to the same asteroid simultaneously.
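The aliasing constraint shows up even in sequential code: safe Rust needs proof that the two asteroids in a pair are distinct before it hands out two mutable references. Below is a minimal sketch using split_at_mut. The `pair_mut` helper and the velocity-swapping `resolve_sequential` are hypothetical and much simpler than the game's resolve_collisions.

```rust
// Borrow two distinct elements of a slice mutably at the same time.
// Panics if i == j or if either index is out of bounds.
pub fn pair_mut<T>(items: &mut [T], i: usize, j: usize) -> (&mut T, &mut T) {
    assert!(i != j, "a pair must name two different elements");
    let (lo, hi) = if i < j { (i, j) } else { (j, i) };
    // split_at_mut yields two non-overlapping halves, which is how safe Rust
    // proves the two references cannot alias.
    let (left, right) = items.split_at_mut(hi);
    let (a, b) = (&mut left[lo], &mut right[0]);
    if i < j { (a, b) } else { (b, a) }
}

// Hypothetical resolution step: each collision swaps the two velocities,
// as in a 1D elastic collision between equal masses.
pub fn resolve_sequential(vel: &mut [f32], pairs: &[(usize, usize)]) {
    for &(i, j) in pairs {
        let (a, b) = pair_mut(vel, i, j);
        std::mem::swap(a, b);
    }
}
```

With velocities [1.0, 2.0, 3.0] and pairs [(0, 1), (1, 2)], both pairs touch index 1 and the result is [2.0, 3.0, 1.0]. Processing the same pairs in the opposite order gives [3.0, 1.0, 2.0], so the outcome depends on ordering that a parallel version would not control.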

Small workloads don’t benefit. Rayon’s thread pool has overhead — distributing work, synchronizing results, and cache effects from data moving between cores. For fewer than ~100 asteroids, the sequential loop finishes faster than rayon can distribute the work. The parallel version pays off at 500+ asteroids with the spatial grid, which is exactly where the large map pushes you.

Measuring the difference. Add timing to see the actual impact:

use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Instant;

// Frame counter used only to throttle the debug output.
static FRAME: AtomicU64 = AtomicU64::new(0);

// In Game::update, around the asteroid update loop:
let t0 = Instant::now();
self.asteroids.par_iter_mut().for_each(|asteroid| {
    asteroid.update(dt);
});
let update_us = t0.elapsed().as_micros();

let t1 = Instant::now();
let pairs = self.grid.potential_pairs_parallel();
let pairs_us = t1.elapsed().as_micros();

// Print once every 60 frames (about once per second at 60 FPS).
if FRAME.fetch_add(1, Ordering::Relaxed) % 60 == 0 {
    println!("asteroid update: {}us, pair gen: {}us, asteroids: {}",
        update_us, pairs_us, self.asteroids.len());
}

Run with 500+ asteroids (increase the spawn cap) and compare against the sequential versions. On a 4+ core machine, you should see pair generation drop to roughly half the sequential time. The asteroid update improvement is smaller because update() does so little work per asteroid — the parallelism overhead is a larger fraction of the total.
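If you want to time the sequential and parallel versions side by side, a small closure-based helper keeps the measurement code out of the way. This is a sketch with a hypothetical name, `time_us`; a single Instant-based sample is noisy, so compare averages over many frames.

```rust
use std::time::Instant;

// Run `work` once and return its result with the elapsed time in microseconds.
pub fn time_us<R>(work: impl FnOnce() -> R) -> (R, u128) {
    let start = Instant::now();
    let result = work();
    (result, start.elapsed().as_micros())
}
```

Inside Game::update this could be used as `let (pairs, pairs_us) = time_us(|| self.grid.potential_pairs_parallel());`, with the same wrapper around potential_pairs() for the sequential baseline.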

Remove the timing code once you’ve confirmed the results. It’s a debugging tool, not a permanent feature.


Next steps

This completes the asteroid dodger series. Across all 14 tutorials, you’ve built:

Some directions to explore next: