Memory Allocation in Systems: A Comprehensive Guide
(Compressed from our previous discussions)
1. High-Level Overview
What is a Memory Allocator?
- Manages heap memory for programs.
- Handles `alloc()` (give memory) and `free()` (return memory).
- Tracks which memory is used/free to avoid overlaps.
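To make the `alloc()`/`free()` pair concrete, here is a minimal sketch that talks to Rust's allocator directly through `std::alloc::alloc` and `std::alloc::dealloc`. The helper name `roundtrip` is hypothetical and only for illustration; everyday code would reach for `Box` instead of raw pointers.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Allocates room for one `u64` on the heap, writes `value`,
/// reads it back, and returns the memory to the allocator.
pub fn roundtrip(value: u64) -> u64 {
    // The layout tells the allocator the size and alignment we need.
    let layout = Layout::new::<u64>();
    unsafe {
        let ptr = alloc(layout) as *mut u64; // "give memory"
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        ptr.write(value);
        let out = ptr.read();
        dealloc(ptr as *mut u8, layout); // "return memory"
        out
    }
}
```

The same `Layout` must be passed to both calls, which is one reason the allocator can keep its bookkeeping small.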
Key Concepts
- Stack vs Heap:
  - Stack: Fast, fixed-size (primitives, local vars).
  - Heap: Dynamic, slower (`Box`, `Vec`, `String`).
- Fragmentation: Wasted space from small gaps between allocations.
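Fragmentation is easier to see with numbers. The sketch below is a toy model, not a real allocator: it describes free memory as a list of gaps and applies a first-fit search. The names `Gap`, `total_free`, and `first_fit` are assumptions made up for this illustration.

```rust
/// A free gap in a toy heap: (offset, length) in bytes.
pub type Gap = (usize, usize);

/// Total number of free bytes across all gaps.
pub fn total_free(gaps: &[Gap]) -> usize {
    gaps.iter().map(|&(_, len)| len).sum()
}

/// First-fit search: offset of the first gap large enough for `size`,
/// or `None` if no single gap can hold the request.
pub fn first_fit(gaps: &[Gap], size: usize) -> Option<usize> {
    gaps.iter()
        .find(|&&(_, len)| len >= size)
        .map(|&(offset, _)| offset)
}
```

With three 16-byte gaps there are 48 bytes free in total, yet a 32-byte request fails because no single gap is large enough. That unusable space is the fragmentation described above.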
2. How Allocation Works in Rust
Default Allocator
- Uses the `GlobalAlloc` trait; the default `System` allocator delegates to the platform's allocator.
- On Linux: Calls `malloc`/`free` (from `libc`).
Example: Vec Allocation
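    fn main() {
        // Asks the allocator for room for 10 elements up front.
        // The element type must be stated, since nothing else lets the compiler infer it.
        let v: Vec<i32> = Vec::with_capacity(10);
        assert!(v.capacity() >= 10);
    }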
Steps:
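- Rust → `GlobalAlloc::alloc()` → `libc::malloc()`.
- `malloc` → `brk`/`mmap` syscall → Linux kernel (only when `malloc` needs more memory from the OS; otherwise it reuses memory it already holds).
- Kernel assigns virtual memory pages.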
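The first step of this chain can be observed from inside a program. The sketch below wraps the standard `System` allocator in a counter and registers it with `#[global_allocator]`, following the pattern shown in the `std::alloc` documentation. The names `CountingAlloc`, `ALLOC_CALLS`, and `allocs_during` are hypothetical. It only shows that `Vec` goes through `GlobalAlloc::alloc()`; whether `malloc` then issues a syscall is not visible at this level.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Forwards every request to the system allocator and counts calls.
pub struct CountingAlloc;

pub static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every heap allocation in the program now passes through CountingAlloc.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and reports how many allocation calls were counted meanwhile.
/// Other threads may allocate too, so treat the result as a lower bound
/// on activity rather than an exact figure.
pub fn allocs_during<R>(f: impl FnOnce() -> R) -> (R, usize) {
    let before = ALLOC_CALLS.load(Ordering::Relaxed);
    let result = f();
    let after = ALLOC_CALLS.load(Ordering::Relaxed);
    (result, after - before)
}
```

Calling `allocs_during(|| Vec::<u64>::with_capacity(10))` should report at least one allocation, while `Vec::<u64>::new()` on its own does not need to allocate at all.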
3. OS & Hardware Interaction
Syscalls (Userspace → Kernel)
- `brk`: Grows heap segment.
- `mmap`: Allocates arbitrary memory (used for large allocations).
CPU & RAM Electrical Signals
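- Address Bus: The memory controller sends the address. DDR4 multiplexes row and column addresses over a narrow set of address pins plus bank-select pins; the 64-bit width applies to the data bus, not the address bus.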
- Command Signals:
  - `RAS#` (Row Address Strobe).
  - `CAS#` (Column Address Strobe).
- Data Transfer:
  - 64-bit data bus + `DQS` (data strobe) for timing.
  - DDR4: 1.2V signaling, up to ~3.2 GT/s transfer rate.
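Key Insight: "Allocation" mostly means reserving virtual address space and marking it as usable. For freshly mapped memory the kernel typically assigns physical pages only on first access (through a page fault), so the electrical activity described here starts when the memory is actually touched.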
4. Custom Allocators in Rust
Why?
- Avoid fragmentation.
- Reduce latency (e.g., high-frequency trading (HFT), game engines).
Example: Bump Allocator
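    use std::alloc::{GlobalAlloc, Layout};

    struct BumpAllocator(/* internal buffer */);

    unsafe impl GlobalAlloc for BumpAllocator {
        // `GlobalAlloc` methods must be declared `unsafe fn`.
        unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
            // Simple pointer bump (no reuse)
            todo!("advance an offset by layout.size(), respecting layout.align()")
        }

        // Required by the trait; a bump allocator never frees single blocks.
        unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {}
        // ...
    }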
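One way the elided parts could be filled in is sketched below. This is an illustration under stated assumptions, not the only design: the buffer is a fixed array of `ARENA_SIZE` bytes (a hypothetical capacity), the bump offset is an `AtomicUsize` advanced with a compare-and-swap loop so the type can be shared between threads, and the `used` helper is added only for inspection. The sketch is not registered as the global allocator here; it is called directly.

```rust
use std::alloc::{GlobalAlloc, Layout};
use std::cell::UnsafeCell;
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical fixed capacity of the arena, in bytes.
pub const ARENA_SIZE: usize = 64 * 1024;

/// Page-aligned backing storage so offsets and addresses line up.
#[repr(align(4096))]
struct Arena(UnsafeCell<[u8; ARENA_SIZE]>);

pub struct BumpAllocator {
    arena: Arena,
    next: AtomicUsize, // offset of the first free byte
}

// Safety: byte ranges are reserved atomically and never handed out twice.
unsafe impl Sync for BumpAllocator {}

impl BumpAllocator {
    pub const fn new() -> Self {
        BumpAllocator {
            arena: Arena(UnsafeCell::new([0; ARENA_SIZE])),
            next: AtomicUsize::new(0),
        }
    }

    /// Bytes consumed so far, including alignment padding.
    pub fn used(&self) -> usize {
        self.next.load(Ordering::Relaxed)
    }
}

unsafe impl GlobalAlloc for BumpAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = self.arena.0.get() as *mut u8;
        let mut current = self.next.load(Ordering::Relaxed);
        loop {
            // Round the absolute address up to the requested alignment.
            let addr = base as usize + current;
            let aligned = (addr + layout.align() - 1) & !(layout.align() - 1);
            let start = aligned - base as usize;
            let end = match start.checked_add(layout.size()) {
                Some(end) if end <= ARENA_SIZE => end,
                _ => return ptr::null_mut(), // arena exhausted
            };
            // Claim [start, end) by bumping the offset; retry on contention.
            match self.next.compare_exchange_weak(
                current,
                end,
                Ordering::Relaxed,
                Ordering::Relaxed,
            ) {
                Ok(_) => return unsafe { base.add(start) },
                Err(observed) => current = observed,
            }
        }
    }

    // Individual frees are ignored; memory is only reclaimed all at once.
    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {}
}
```

Because `dealloc` does nothing, this design only suits workloads where everything allocated can be discarded together, which is the trade-off that makes the allocation path so short.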
Use Cases:
- Arena allocators (batch free all memory).
- Slab allocators (fixed-size blocks).
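The slab idea (fixed-size blocks that are reused) can be sketched in safe Rust with a vector of slots and a list of free indices. The type `Slab` and its methods below are hypothetical names for illustration; production code would more likely use an existing crate.

```rust
/// A fixed-capacity pool of equally sized slots.
pub struct Slab<T> {
    slots: Vec<Option<T>>,
    free: Vec<usize>, // indices of empty slots, used as a stack
}

impl<T> Slab<T> {
    /// Allocates all slots up front; no further heap growth happens later.
    pub fn with_capacity(capacity: usize) -> Self {
        Slab {
            slots: (0..capacity).map(|_| None).collect(),
            // Reversed so that slot 0 is handed out first.
            free: (0..capacity).rev().collect(),
        }
    }

    /// Stores `value` in a free slot and returns its index,
    /// or `None` when the slab is full.
    pub fn insert(&mut self, value: T) -> Option<usize> {
        let index = self.free.pop()?;
        self.slots[index] = Some(value);
        Some(index)
    }

    /// Empties a slot and makes it available for reuse.
    pub fn remove(&mut self, index: usize) -> Option<T> {
        let value = self.slots.get_mut(index)?.take()?;
        self.free.push(index);
        Some(value)
    }

    pub fn get(&self, index: usize) -> Option<&T> {
        self.slots.get(index)?.as_ref()
    }
}
```

Since every slot has the same size, a freed slot always fits the next request, so this layout cannot fragment in the way a general-purpose heap can.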
5. HFT-Specific Optimizations
What Matters for Low Latency?
- Cache Awareness
  - Avoid false sharing (pad data written by different threads so each piece sits on its own cache line).
  - Prefer Struct-of-Arrays (SoA) over Array-of-Structs (AoS) when hot loops read only a few fields.
- Allocation-Free Hot Paths

      fn main() {
          // Bad: starts empty, so the Vec reallocates repeatedly as it grows
          let mut v = Vec::new();
          for i in 0..100_000 {
              v.push(i);
          }

          // Good: Pre-allocate once, then fill without reallocating
          let mut v = Vec::with_capacity(100_000);
          for i in 0..100_000 {
              v.push(i);
          }
      }

- Measurement Tools
  - `perf stat`: Cache misses, page faults.
  - `strace`: Syscall tracing.
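The two cache-awareness points can be sketched as follows. The 64-byte cache line is an assumption (common on x86-64, but it should be checked for the actual target), and the names `PaddedCounter` and `OrdersSoa` with their fields are hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// One counter per cache line. `align(64)` also rounds the size up to
/// 64 bytes, so neighbouring counters in an array never share a line.
/// Assumes 64-byte cache lines.
#[repr(align(64))]
pub struct PaddedCounter(pub AtomicU64);

impl PaddedCounter {
    /// Adds one and returns the new value.
    pub fn increment(&self) -> u64 {
        self.0.fetch_add(1, Ordering::Relaxed) + 1
    }
}

/// Struct-of-Arrays: each field lives in its own contiguous vector,
/// so a loop over `qty` pulls only quantities into the cache.
#[derive(Default)]
pub struct OrdersSoa {
    pub price: Vec<f64>,
    pub qty: Vec<u32>,
    pub side: Vec<u8>,
}

impl OrdersSoa {
    pub fn with_capacity(n: usize) -> Self {
        OrdersSoa {
            price: Vec::with_capacity(n),
            qty: Vec::with_capacity(n),
            side: Vec::with_capacity(n),
        }
    }

    pub fn push(&mut self, price: f64, qty: u32, side: u8) {
        self.price.push(price);
        self.qty.push(qty);
        self.side.push(side);
    }

    /// Touches only the `qty` array.
    pub fn total_qty(&self) -> u64 {
        self.qty.iter().map(|&q| u64::from(q)).sum()
    }
}
```

The padding costs memory (56 unused bytes per counter here), so it is usually reserved for data that several threads write to frequently.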
6. Key Takeaways
| Layer | Key Idea |
|---|---|
| Rust | Uses GlobalAlloc → libc → Syscalls |
| OS | Manages virtual memory via mmap/brk |
| Hardware | DRAM accessed via RAS/CAS, 1.2V signals |
| HFT | Pre-allocate, mind caches, avoid syscalls |
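The "pre-allocate" takeaway can be checked without any profiler. The sketch below counts how often a `Vec` changes capacity while it is being filled, which is a proxy for how often it had to ask the allocator for a new buffer. The function name `count_regrowths` is hypothetical.

```rust
/// Pushes `n` values into `v` and counts how many times its capacity
/// changed, i.e. how many times the Vec had to (re)allocate.
pub fn count_regrowths(mut v: Vec<u32>, n: u32) -> usize {
    let mut regrowths = 0;
    let mut capacity = v.capacity();
    for i in 0..n {
        v.push(i);
        if v.capacity() != capacity {
            regrowths += 1;
            capacity = v.capacity();
        }
    }
    regrowths
}
```

Passing `Vec::with_capacity(100_000)` and pushing 100,000 values yields zero regrowths, while starting from `Vec::new()` yields several; the exact number depends on the standard library's growth strategy.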
Further Learning
- Books: “Systems Performance” (Brendan Gregg).
- Crates: `jemallocator`, `bumpalo`.
- Linux: `man brk`, `man mmap`.