This document describes Ruff's memory management, ownership model, and garbage collection strategy.
Last Updated: January 27, 2026
Version: v0.9.0
- Overview
- Value Ownership Model
- Environment Lifetime Management
- Closure Capture Semantics
- Garbage Collection Strategy
- Memory Safety Guarantees
- LeakyFunctionBody Issue
- Memory Patterns
- Performance Characteristics
- Best Practices
Ruff uses Rust's ownership system for memory safety, combined with reference counting for shared ownership and mutex-based locking for thread safety.
| Type | Ownership | Thread-Safety | Cleanup |
|---|---|---|---|
| Primitives (Int, Float, Bool) | Copy | Yes | Stack |
| Strings, Arrays, Dicts | Clone | Single-thread | Arc for sharing |
| Functions (closures) | Arc<Mutex> | Yes | Reference counting |
| Channels | Arc<Mutex<...>> | Yes | Reference counting |
| Promises | Arc<Mutex<...>> | Yes | Reference counting |
Key Principle: All shared mutable state uses Arc<Mutex<T>> for safe concurrent access.
// src/interpreter/value.rs:243-372
#[derive(Clone)]
pub enum Value {
// Primitives (stack-allocated, cheap to copy)
Int(i64),
Float(f64),
Bool(bool),
Null,
// Heap-allocated collections (cloned when needed)
Str(String),
Array(Vec<Value>),
Dict(HashMap<String, Value>),
// Thread-safe shared types
Channel(Arc<Mutex<(Sender<Value>, Receiver<Value>)>>),
Promise {
receiver: Arc<Mutex<Receiver<Result<Value, String>>>>,
// ...
},
// Functions with optional closure environment
Function(
Vec<String>, // Parameters
LeakyFunctionBody, // Body statements
Option<Arc<Mutex<Environment>>>, // Captured environment
),
// ...
}-
Values are cloned when stored:
x := [1, 2, 3] y := x # Array is cloned, not moved x.push(4) # Both x and y now [1, 2, 3, 4]? No - independent copies! -
Values in collections are cloned:
arr := [x, y, z] # Each value is cloned into array -
Function parameters are cloned:
func process(data) { data.push(10) # Modifies local copy, not original } arr := [1, 2, 3] process(arr) print(arr) # Still [1, 2, 3]
| Value Type | Clone Cost | Notes |
|---|---|---|
| Int, Float, Bool | O(1) | Stack copy |
| Null | O(1) | Zero-size type |
| Str | O(n) | Allocates new string |
| Array | O(n*m) | Deep clone of all elements |
| Dict | O(n*m) | Deep clone of all entries |
| Function | O(1) | Arc clone (ref count increment) |
| Channel | O(1) | Arc clone (ref count increment) |
Note: "Deep clone" means recursively cloning nested structures.
The Environment struct manages variable scopes using a stack of hash maps.
// src/interpreter/environment.rs:28-31
#[derive(Clone, Debug)]
pub struct Environment {
pub scopes: Vec<HashMap<String, Value>>,
}┌───────────────────────────────────┐
│ Function Scope (innermost) │ ◄── scopes[2]
├───────────────────────────────────┤
│ Block Scope │ ◄── scopes[1]
├───────────────────────────────────┤
│ Global Scope (outermost) │ ◄── scopes[0]
└───────────────────────────────────┘
Algorithm (src/interpreter/environment.rs:53-61):
pub fn get(&self, name: &str) -> Option<Value> {
// Search from innermost (end) to outermost (start)
for scope in self.scopes.iter().rev() {
if let Some(value) = scope.get(name) {
return Some(value.clone()); // Clone on read
}
}
None // Variable not found
}Example:
x := 10 # Global scope: scopes[0]["x"] = 10
{
x := 20 # Block scope: scopes[1]["x"] = 20 (shadows global)
print(x) # Prints 20 (finds in scopes[1] first)
}
print(x) # Prints 10 (scopes[1] popped, finds in scopes[0])
Function Call:
fn call_function(&mut self, func: Value, args: Vec<Value>) {
// 1. Push new scope
self.env.push_scope();
// 2. Bind parameters
for (param, arg) in params.iter().zip(args.iter()) {
self.env.define(param.clone(), arg.clone());
}
// 3. Execute function body
self.eval_stmts(&body);
// 4. Pop scope (local variables destroyed)
self.env.pop_scope();
}Lifetime:
┌────────────────────┐
│ Before Call │ scopes: [global]
└────────┬───────────┘
│ push_scope()
▼
┌────────────────────┐
│ During Call │ scopes: [global, function]
└────────┬───────────┘
│ pop_scope()
▼
┌────────────────────┐
│ After Call │ scopes: [global]
└────────────────────┘
Closures capture the environment at definition time, allowing access to outer variables.
Ruff Code:
func make_counter() {
count := 0
# This closure captures 'count' from outer scope
func increment() {
count := count + 1
return count
}
return increment
}
counter := make_counter()
print(counter()) # 1
print(counter()) # 2
print(counter()) # 3
Implementation (src/interpreter/mod.rs:2493-2506):
Expr::Function { params, body, .. } => {
// Anonymous function (closure) captures current environment
Value::Function(
params.clone(),
LeakyFunctionBody::new(body.clone()),
Some(Arc::new(Mutex::new(self.env.clone()))), // Captured!
)
}Process (src/interpreter/mod.rs:1048-1101):
Value::Function(params, body, captured_env) => {
if let Some(closure_env_ref) = captured_env {
// 1. Save current environment
let saved_env = self.env.clone();
// 2. Switch to captured environment
self.env = closure_env_ref.lock().unwrap().clone();
self.env.push_scope(); // New scope for parameters
// 3. Bind parameters
for (param, arg) in params.iter().zip(args.iter()) {
self.env.define(param.clone(), arg.clone());
}
// 4. Execute function body
self.eval_stmts(body.get());
// 5. Pop parameter scope
self.env.pop_scope();
// 6. Write back modified environment
*closure_env_ref.lock().unwrap() = self.env.clone();
// 7. Restore original environment
self.env = saved_env;
}
}┌─────────────────────────────────────────────────┐
│ make_counter() │
│ │
│ ┌──────────────────────────┐ │
│ │ Environment │ │
│ │ ┌──────────────────────┐ │ │
│ │ │ count = 0 │ │ │
│ │ └──────────────────────┘ │ │
│ └────────────┬─────────────┘ │
│ │ │
│ │ Arc<Mutex<Environment>> │
│ │ (shared ownership) │
│ │ │
│ ┌────────────▼─────────────┐ │
│ │ increment (closure) │ │
│ │ - params: [] │ │
│ │ - body: [count+=1; ...] │ │
│ │ - captured_env ───────┐ │ │
│ └────────────────────────┘ │ │
│ │ │
│ │ References │
│ │ outer env │
│ └──────────────────┤
└─────────────────────────────────────────────────┘
After make_counter() returns:
- increment closure survives (Arc keeps Environment alive)
- count variable persists in closure's Environment
- Each call to increment() modifies the same count
Current Implementation: All closures capture by reference (via Arc<Mutex<Environment>>).
Named Functions (do not capture):
func outer() {
x := 10
func inner() {
print(x) # Error: x not defined
}
inner()
}
Anonymous Functions (capture environment):
func outer() {
x := 10
f := func() {
print(x) # OK: closure captures x
}
f() # Prints: 10
}
Ruff uses reference counting (Arc) combined with scope-based destruction.
How it works:
Arc::new(value)creates ref count = 1arc.clone()increments ref count- Dropping
Arcdecrements ref count - When ref count reaches 0, value is freed
Example:
let env = Arc::new(Mutex::new(Environment::new())); // ref count = 1
let closure1 = Value::Function(
params,
body,
Some(env.clone()), // ref count = 2
);
let closure2 = closure1.clone(); // ref count = 3
drop(closure1); // ref count = 2
drop(closure2); // ref count = 1
drop(env); // ref count = 0 → Environment freedAutomatic: Values are destroyed when:
- Variable goes out of scope
- Last reference to Arc is dropped
- Collection containing value is destroyed
Example:
{
x := [1, 2, 3] # Array allocated
y := {"key": "val"} # Dict allocated
# Scope ends here
}
# x and y destroyed, memory freed
Problem: Reference cycles prevent ref count from reaching zero.
Example:
# Potential cycle (not possible in current Ruff, but illustrative)
a := {}
b := {}
a["ref"] := b
b["ref"] := a
# a and b reference each other → ref counts never reach 0
Current Status: Ruff doesn't have explicit cycle detection. Most patterns naturally avoid cycles, but long-running programs with complex data structures could leak memory.
Future Work (Roadmap Task #29): Implement cycle detection or switch to tracing GC.
Root Cause: Deep recursion in Drop trait implementation.
// Function bodies are Vec<Stmt>
pub struct FunctionBody(Vec<Stmt>);
// Stmt contains nested Vec<Stmt>
pub enum Stmt {
If { condition: Expr, then_branch: Vec<Stmt>, else_branch: Vec<Stmt> },
While { condition: Expr, body: Vec<Stmt> },
For { ... body: Vec<Stmt> },
// ... many more with nested Vec<Stmt>
}What happens:
drop(FunctionBody)
└─> drop(Vec<Stmt>)
└─> drop(Stmt::If)
└─> drop(then_branch: Vec<Stmt>)
└─> drop(Stmt::While)
└─> drop(body: Vec<Stmt>)
└─> ... (deep recursion)
Result: Stack overflow during program shutdown for deeply nested code.
LeakyFunctionBody (src/interpreter/value.rs:20-42):
#[derive(Clone)]
pub struct LeakyFunctionBody(ManuallyDrop<Arc<Vec<Stmt>>>);
impl LeakyFunctionBody {
pub fn new(body: Vec<Stmt>) -> Self {
LeakyFunctionBody(ManuallyDrop::new(Arc::new(body)))
}
pub fn get(&self) -> &Vec<Stmt> {
&self.0
}
}
// No Drop implementation → memory leaked
// OS reclaims all memory at program shutdown anywayTrade-off:
- ❌ Memory leaked during execution (functions never freed)
- ✅ No stack overflow
- ✅ OS cleans up at shutdown
- ✅ Acceptable for short-lived programs
1. Iterative Drop (Recommended):
impl Drop for FunctionBody {
fn drop(&mut self) {
let mut stack = vec![self.0.clone()];
while let Some(node) = stack.pop() {
// Extract child nodes, push to stack
// Avoid recursive drop calls
}
}
}2. Arena Allocation:
use typed_arena::Arena;
struct Interpreter {
arena: Arena<Stmt>, // Allocate all Stmts here
}
// Drop entire arena at once (no recursion)3. Flatten AST:
// Instead of nested Vec<Stmt>
pub struct Stmt {
kind: StmtKind,
children_indices: Vec<usize>, // Indices into flat Vec
}Roadmap: Task #29 - Fix LeakyFunctionBody with iterative drop.
Ruff inherits Rust's memory safety guarantees:
✅ Guaranteed:
- No null pointer dereferences (Option instead of null)
- No use-after-free (ownership prevents it)
- No data races (Arc<Mutex> enforces locking)
- No buffer overflows (bounds-checked arrays)
❌ Not Guaranteed (language-level):
- Memory leaks (LeakyFunctionBody issue)
- Reference cycles (no cycle detection yet)
- Infinite loops (halting problem)
All shared mutable state uses Mutex for exclusive access:
// Channel: only one thread can send/receive at a time
Arc<Mutex<(Sender<Value>, Receiver<Value>)>>
// Promise: only one thread can poll at a time
Arc<Mutex<Receiver<Result<Value, String>>>>
// Closure environment: synchronized access
Arc<Mutex<Environment>>Rust's type system prevents:
- Data races at compile time
- Sending non-thread-safe types across threads
- Aliasing mutable references
Use Case: Limit variable lifetime.
{
large_data := read_file("huge.csv") # 1 GB allocated
process(large_data)
} # large_data destroyed, 1 GB freed
next_operation() # Memory available
Use Case: Free memory in long-running loop.
for i in range(1000) {
data := expensive_computation()
use(data)
data := null # Release reference (value destroyed)
# Or re-assign to clear
data := 0 # Old value destroyed
}
Bad (many clones):
arr := range(1000000) # 1M element array
func process(data) {
# ... work with data
}
for i in range(100) {
process(arr) # Clones 1M elements each call!
}
Good (pass indices):
arr := range(1000000)
func process(arr, start, end) {
for i in range(start, end) {
# Work with arr[i]
}
}
for chunk in range(0, 100) {
process(arr, chunk * 10000, (chunk + 1) * 10000) # No clone!
}
Bad (many allocations):
for i in range(1000) {
result := [] # New array each iteration
for j in range(100) {
result.push(compute(i, j))
}
process(result)
}
Good (reuse array):
result := []
for i in range(1000) {
result.clear() # Keep allocation, reset length
for j in range(100) {
result.push(compute(i, j))
}
process(result)
}
| Type | Stack Size | Heap Size | Clone Cost |
|---|---|---|---|
| Int | 8 bytes | 0 | O(1) |
| Float | 8 bytes | 0 | O(1) |
| Bool | 1 byte | 0 | O(1) |
| Str | 24 bytes | n bytes | O(n) |
| Array(n) | 24 bytes | n * sizeof(Value) | O(n*m) |
| Dict(n) | 48 bytes | n * (key + value) | O(n*m) |
| Function | 64 bytes | body size | O(1) (Arc) |
| Channel | 16 bytes | small | O(1) (Arc) |
Note: sizeof(Value) ≈ 64-80 bytes (large enum).
Per-Value Overhead: ~64 bytes (Value enum tag + largest variant)
Example:
arr := [1, 2, 3]
# Memory: 24 (Vec) + 3 * 64 (Values) = ~216 bytes
# Actual data: 3 * 8 = 24 bytes integers
# Overhead: 192 bytes (8x)!
Optimization Ideas (future):
- Specialized
IntArrayfor int-only arrays FloatArrayfor float-only arrays- Smaller Value enum (box large variants)
Stack vs Heap:
- Primitives: Stack (no heap allocation)
- Collections: Heap (Vec, HashMap allocate on heap)
- Functions: Heap (LeakyFunctionBody wraps Arc<Vec>)
Allocation Frequency:
for i in range(1000000) {
x := i # Stack-only, no allocation
}
for i in range(1000000) {
arr := [i] # 1M heap allocations!
}
Bad:
func process_many(items) {
for item in items { # Each iteration clones item!
work_with(item)
}
}
large_array := range(10000)
process_many(large_array) # Clones entire array + each element
Good:
# Process in-place or use indices
func process_many(items) {
for i in range(len(items)) {
work_with(items[i]) # No clone (if items not cloned)
}
}
# Bad
for i in range(1000) {
result := do_work(i)
results.push(result)
}
# Good: Pre-allocate
results := Array.with_capacity(1000) # Reserve space upfront
for i in range(1000) {
results.push(do_work(i))
}
func process_file(path) {
content := read_file(path) # 100 MB
result := parse(content)
content := null # Free 100 MB immediately
# Rest of function with just 'result'
return transform(result)
}
Bad (high clone cost):
tree := {
"left": {
"left": {
"left": { ... } # 10 levels deep
}
}
}
copy := tree # O(n) clone of entire tree!
Good (flatten or use IDs):
nodes := [
{ "id": 0, "left": 1, "right": 2 },
{ "id": 1, "left": 3, "right": 4 },
# ...
]
# Clone is O(1) for IDs, not O(n) for structure
Bad (allocates all values):
nums := range(1000000) # 1M Values allocated
for n in nums {
if n > 100 { break } # Wasted 999,900 allocations!
}
Good (lazy evaluation):
gen nums := range(1000000) # No allocation yet
for n in nums {
if n > 100 { break } # Only created 101 values
}
Bad (captures entire environment):
func make_handler() {
huge_data := load_huge_dataset() # 1 GB
counter := 0
func handler() {
counter := counter + 1 # Only needs counter!
return counter
}
# But closure captures BOTH counter and huge_data!
return handler
}
Good (minimize captures):
func make_handler() {
huge_data := load_huge_dataset()
result := process(huge_data)
huge_data := null # Release before creating closure
counter := 0
func handler() {
counter := counter + 1
return counter
}
# Closure only captures counter now
return handler
}
Possible Cause: Excessive cloning.
Diagnosis:
import time
start := time.now()
for i in range(1000) {
copy := large_data # Suspect: clone here?
process(copy)
}
elapsed := time.now() - start
print("Time: ${elapsed}ms") # If high, likely clone overhead
Possible Cause: Memory leak (reference cycle or leaked closures).
Diagnosis:
- Check for circular references
- Verify closures don't capture too much
- Look for LeakyFunctionBody accumulation
Possible Cause: Deep recursion or LeakyFunctionBody issue.
Diagnosis:
# Check recursion depth
func recursive(n, depth) {
if depth > 1000 {
print("ERROR: Too deep!")
return
}
# ... recursive logic
return recursive(n - 1, depth + 1)
}
- Task #29: Fix LeakyFunctionBody with iterative drop
- Task #30: Separate AST from runtime values (no more LeakyFunctionBody!)
- Cycle Detection: Detect and break reference cycles
- Tracing GC: Optional tracing garbage collector for long-running programs
- Memory Profiler: Built-in memory profiling (
ruff profile --memory)
- Specialized Collections: IntArray, FloatArray (no Value overhead)
- Small Value Optimization: Box large variants, keep small variants inline
- Copy-on-Write: Clone only when mutated
- Arena Allocator: Batch allocate related values
- ARCHITECTURE.md - System architecture overview
- CONCURRENCY.md - Thread safety and concurrency
- EXTENDING.md - Adding native functions
- ROADMAP.md - Planned memory improvements
Questions? Open an issue on GitHub or check the documentation.