Memory Allocation in Systems: A Comprehensive Guide



1. High-Level Overview

What is a Memory Allocator?

  • Manages heap memory for programs.
  • Handles alloc() (give memory) and free() (return memory).
  • Tracks which memory is used/free to avoid overlaps.
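The alloc/free contract above can be sketched directly against Rust's global allocator through std::alloc. This is a minimal illustration rather than how everyday Rust code obtains memory (Box and Vec do this for you); the helper name roundtrip is hypothetical.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Allocates room for `n` u32 values, fills them with 0..n, sums them,
/// and returns the block to the allocator.
/// `roundtrip` is an illustrative helper, not a standard API.
fn roundtrip(n: usize) -> u32 {
    assert!(n > 0, "`alloc` must not be called with a zero-sized layout");
    // The layout records the size and alignment the allocator must honor.
    let layout = Layout::array::<u32>(n).expect("layout overflow");
    unsafe {
        // alloc: ask the allocator for a block (the "give memory" side).
        let ptr = alloc(layout) as *mut u32;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        for i in 0..n {
            ptr.add(i).write(i as u32);
        }
        let sum: u32 = (0..n).map(|i| ptr.add(i).read()).sum();
        // dealloc: return the block, using the same layout (the "free" side).
        dealloc(ptr as *mut u8, layout);
        sum
    }
}

fn main() {
    println!("sum = {}", roundtrip(4));
}
```

The allocator's bookkeeping is what guarantees that two live blocks never overlap; the caller's side of the bargain is to free each block exactly once with the layout it was allocated with.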

Key Concepts

  • Stack vs Heap:
    • Stack: Fast, for values whose size is known at compile time (primitives, local vars); limited total size.
    • Heap: Dynamic, sized at run time, slower to allocate (Box, Vec, String).
  • Fragmentation: Wasted space from small gaps between allocations.
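A small sketch of the stack/heap split, assuming nothing beyond the standard library: a fixed-size array is stored inline, while Box and Vec keep only a small handle inline and place their contents on the heap. The helper name footprint is hypothetical.

```rust
use std::mem::size_of_val;

/// Returns (bytes of an inline array, bytes of a Box handle, heap capacity of a Vec).
/// `footprint` is an illustrative helper, not a standard API.
fn footprint() -> (usize, usize, usize) {
    // Stack: all 16 bytes live inline in this function's frame.
    let on_stack = [0u8; 16];
    // Heap: the Box is a pointer-sized handle; the u64 it owns is heap-allocated.
    let boxed: Box<u64> = Box::new(42);
    // Heap: the Vec handle is small; its buffer of at least 1024 bytes is on the heap.
    let on_heap: Vec<u8> = Vec::with_capacity(1024);
    (size_of_val(&on_stack), size_of_val(&boxed), on_heap.capacity())
}

fn main() {
    let (array_bytes, box_handle_bytes, vec_capacity) = footprint();
    println!("array: {array_bytes} bytes inline");
    println!("Box handle: {box_handle_bytes} bytes inline");
    println!("Vec heap capacity: {vec_capacity} bytes");
}
```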

2. How Allocation Works in Rust

Default Allocator

  • Uses the System allocator, which implements the GlobalAlloc trait and delegates to the platform's allocator.
  • On Linux: Calls malloc/free (from libc).

Example: Vec Allocation

#![allow(unused)]
fn main() {
let v: Vec<u64> = Vec::with_capacity(10); // Asks allocator for room for 10 elements
}

Steps:

  1. Rust → GlobalAlloc::alloc() → libc::malloc().
  2. malloc → brk/mmap syscall → Linux kernel (only when malloc cannot serve the request from memory it already holds).
  3. Kernel assigns virtual memory pages.
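The first step of that chain can be observed from inside a program. The sketch below installs a wrapper around the System allocator that counts calls to alloc; the names CountingAlloc, ALLOC_CALLS, and alloc_calls_during are hypothetical, and the wrapper is an illustration rather than a production profiler.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: forwards every request to the system allocator
/// (malloc/free on Linux) and counts how often `alloc` is called.
struct CountingAlloc;

static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every Box, Vec, and String in this program now goes through CountingAlloc.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and reports how many `alloc` calls happened while it ran.
fn alloc_calls_during<R>(f: impl FnOnce() -> R) -> (usize, R) {
    let before = ALLOC_CALLS.load(Ordering::Relaxed);
    let result = f();
    let after = ALLOC_CALLS.load(Ordering::Relaxed);
    (after - before, result)
}

fn main() {
    let (calls, v) = alloc_calls_during(|| Vec::<u64>::with_capacity(10));
    println!("with_capacity(10): {calls} alloc call(s), capacity {}", v.capacity());
}
```

The counter is process-wide, so allocations made by other threads during the measured closure are included. Whether a given alloc call reaches the kernel is decided inside malloc and is not visible at this level; a tool such as strace shows that part.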

3. OS & Hardware Interaction

Syscalls (Userspace → Kernel)

  • brk: Grows (or shrinks) the heap segment by moving the program break.
  • mmap: Maps new regions of virtual memory (glibc malloc uses it for large allocations, 128 KiB and up by default).

CPU & RAM Electrical Signals

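  • Address Bus: The memory controller sends a row address, then a column address, over a shared set of address pins (far fewer than 64; DDR4's 64-bit width is the data bus).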
  • Command Signals:
    • RAS# (Row Address Strobe).
    • CAS# (Column Address Strobe).
  • Data Transfer:
    • 64-bit data bus + DQS (data strobe) for timing.
    • DDR4: 1.2V signaling, up to ~3.2 GT/s transfer rate (DDR4-3200).
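As a worked example of what those figures imply, the peak theoretical bandwidth of one channel is the transfer rate multiplied by the bus width in bytes. The function name below is hypothetical, and the result is an upper bound that real workloads do not reach.

```rust
/// Peak theoretical bandwidth of one memory channel in MB/s:
/// transfers per second (in millions) times bytes moved per transfer.
/// `peak_bandwidth_mb_s` is an illustrative helper.
fn peak_bandwidth_mb_s(mega_transfers_per_s: u64, bus_width_bits: u64) -> u64 {
    mega_transfers_per_s * (bus_width_bits / 8)
}

fn main() {
    // DDR4-3200: 3200 MT/s (3.2 GT/s) on a 64-bit data bus.
    let mb_s = peak_bandwidth_mb_s(3200, 64);
    println!("{mb_s} MB/s per channel"); // 3200 * 8 = 25600 MB/s, i.e. 25.6 GB/s
}
```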

Key Insight: "Allocation" mostly reserves virtual address space; on Linux, a fresh page is typically backed by physical RAM (and the DRAM signals above are driven) only when the program first touches it.
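One way to see this on Linux is to watch the process's resident page count around a large allocation. The sketch below reads /proc/self/statm, whose second field is the number of resident pages; the helper names and the 64 MiB size are assumptions chosen for illustration, and on other systems the readings are simply None.

```rust
use std::fs;

/// Extracts the resident page count (second field) from the text of /proc/self/statm.
fn resident_pages(statm: &str) -> Option<u64> {
    statm.split_whitespace().nth(1)?.parse().ok()
}

/// Reads this process's resident page count; None if /proc is unavailable.
fn current_resident_pages() -> Option<u64> {
    resident_pages(&fs::read_to_string("/proc/self/statm").ok()?)
}

fn main() {
    const LEN: usize = 64 * 1024 * 1024; // 64 MiB, an arbitrary illustrative size

    let before = current_resident_pages();

    // Reserve capacity only: address space is obtained, but no byte is written.
    let mut buf: Vec<u8> = Vec::with_capacity(LEN);
    let reserved = current_resident_pages();

    // Write every byte: the first write to each page makes the kernel back it with RAM.
    buf.resize(LEN, 1);
    let touched = current_resident_pages();

    println!("resident pages before allocation: {before:?}");
    println!("resident pages after reserving:   {reserved:?}");
    println!("resident pages after touching:    {touched:?}");
    println!("last byte: {}", buf[LEN - 1]); // keep the buffer in use
}
```

On a typical Linux system the second reading stays close to the first and the third is much larger, but the exact numbers depend on the allocator, the kernel, and the page size.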


4. Custom Allocators in Rust

Why?

  • Avoid fragmentation.
  • Reduce latency (e.g., high-frequency trading (HFT), game engines).

Example: Bump Allocator

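#![allow(unused)]
fn main() {
use std::alloc::{GlobalAlloc, Layout};
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};
struct BumpAllocator<const N: usize> {
    arena: UnsafeCell<[u8; N]>, // internal buffer
    next: AtomicUsize,          // offset of the next free byte
}
unsafe impl<const N: usize> Sync for BumpAllocator<N> {} // each alloc claims a disjoint range atomically

unsafe impl<const N: usize> GlobalAlloc for BumpAllocator<N> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Simple pointer bump (no reuse): align the cursor, then advance it.
        let base = self.arena.get() as usize;
        let align_up = |off: usize| (base + off + layout.align() - 1) & !(layout.align() - 1);
        let claimed = self.next.fetch_update(Relaxed, Relaxed, |off| {
            let end = align_up(off).checked_add(layout.size())?;
            (end <= base + N).then_some(end - base)
        });
        let at = |off: usize| self.arena.get().cast::<u8>().wrapping_add(align_up(off) - base);
        claimed.map_or(std::ptr::null_mut(), at) // null means the arena is exhausted
    }
    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {} // frees are ignored
}
}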

Use Cases:

  • Arena allocators (batch free all memory).
  • Slab allocators (fixed-size blocks).
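The slab idea (fixed-size blocks handed out and recycled through a free list) can be sketched in safe Rust as a typed pool. This is an illustrative toy, not the API of any existing crate; the type name Slab and its methods are assumptions made for the example.

```rust
/// A minimal fixed-size-block pool. All slots are allocated once, up front.
struct Slab<T> {
    slots: Vec<Option<T>>, // every slot has the same size
    free: Vec<usize>,      // indices of empty slots, reused last-in first-out
}

impl<T> Slab<T> {
    /// Creates a pool with exactly `cap` slots; it never grows afterwards.
    fn with_capacity(cap: usize) -> Self {
        let mut slots = Vec::with_capacity(cap);
        slots.resize_with(cap, || None);
        // Reversed so that the first insert receives index 0.
        Slab { slots, free: (0..cap).rev().collect() }
    }

    /// Stores `value` in a free slot and returns its index,
    /// or hands the value back if the pool is full.
    fn insert(&mut self, value: T) -> Result<usize, T> {
        match self.free.pop() {
            Some(idx) => {
                self.slots[idx] = Some(value);
                Ok(idx)
            }
            None => Err(value),
        }
    }

    /// Empties a slot and makes it available for reuse.
    fn remove(&mut self, idx: usize) -> Option<T> {
        let value = self.slots.get_mut(idx)?.take()?;
        self.free.push(idx);
        Some(value)
    }

    /// Borrows the value stored at `idx`, if any.
    fn get(&self, idx: usize) -> Option<&T> {
        self.slots.get(idx)?.as_ref()
    }
}

fn main() {
    let mut pool = Slab::with_capacity(2);
    let a = pool.insert("order-a").unwrap();
    pool.insert("order-b").unwrap();
    pool.remove(a);
    // The freed slot is reused without touching the global allocator.
    let c = pool.insert("order-c").unwrap();
    println!("order-c reuses slot {c}: {:?}", pool.get(c));
}
```

Because every block has the same size, a freed slot fits any later request, which is why this layout does not fragment. Insert and remove are constant-time and never call the allocator after construction.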

5. HFT-Specific Optimizations

What Matters for Low Latency?

  1. Cache Awareness

    • Avoid false sharing (pad or align data that different threads write so each piece has its own cache line).
    • Prefer Struct-of-Arrays (SoA) over Array-of-Structs (AoS) when hot loops read only a few fields.
  2. Allocation-Free Hot Paths

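    #![allow(unused)]
    fn main() {
    // Bad: Vec grows on demand, so it reallocates repeatedly inside the loop
    let mut v = Vec::new();
    for i in 0..100_000 { v.push(i); }

    // Good: Pre-allocate once, then fill without further allocation
    let mut v = Vec::with_capacity(100_000);
    for i in 0..100_000 { v.push(i); }
    }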
  3. Measurement Tools

    • perf stat: Cache misses, page faults.
    • strace: Syscall tracing.
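The two cache-awareness points in item 1 can be sketched as follows. The 64-byte cache line is an assumption (common on x86-64, not universal), and all type and function names here are hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// One counter per cache line. The 64-byte alignment is an assumed line size.
#[repr(align(64))]
struct PaddedCounter(AtomicU64);

/// Two threads each bump their own counter. Because the counters sit on
/// separate cache lines, the cores do not keep invalidating each other's line.
fn count_in_parallel(iterations: u64) -> (u64, u64) {
    let counters = [PaddedCounter(AtomicU64::new(0)), PaddedCounter(AtomicU64::new(0))];
    thread::scope(|s| {
        for c in &counters {
            s.spawn(move || {
                for _ in 0..iterations {
                    c.0.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    (counters[0].0.load(Ordering::Relaxed), counters[1].0.load(Ordering::Relaxed))
}

/// Array-of-Structs: all fields of one order are adjacent in memory.
#[allow(dead_code)]
struct Order {
    price: f64,
    quantity: u32,
    side: u8,
}

/// Struct-of-Arrays: each field is stored contiguously across all orders.
#[allow(dead_code)]
struct Orders {
    prices: Vec<f64>,
    quantities: Vec<u32>,
    sides: Vec<u8>,
}

/// AoS scan: every cache line fetched also carries the unused quantity and side.
fn total_price_aos(orders: &[Order]) -> f64 {
    orders.iter().map(|o| o.price).sum()
}

/// SoA scan: only the densely packed prices are read.
fn total_price_soa(orders: &Orders) -> f64 {
    orders.prices.iter().sum()
}

fn main() {
    println!("PaddedCounter size: {} bytes", std::mem::size_of::<PaddedCounter>());
    println!("counts: {:?}", count_in_parallel(1_000_000));

    let aos = vec![Order { price: 1.0, quantity: 1, side: 0 }];
    let soa = Orders { prices: vec![1.0], quantities: vec![1], sides: vec![0] };
    println!("AoS total {}, SoA total {}", total_price_aos(&aos), total_price_soa(&soa));
}
```

Both layouts produce the same totals; the difference is in how many cache lines a scan has to pull in, which is the kind of effect perf stat's cache-miss counters can confirm on a real workload.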

6. Key Takeaways

Layer      Key Idea
Rust       Uses GlobalAlloc → libc → Syscalls
OS         Manages virtual memory via mmap/brk
Hardware   DRAM accessed via RAS/CAS, 1.2V signals
HFT        Pre-allocate, mind caches, avoid syscalls

Further Learning

  • Books: “Systems Performance” (Brendan Gregg).
  • Crates: jemallocator, bumpalo.
  • Linux: man brk, man mmap.
