Commit 1ba26df

docs: update benchmarks/ directory with verified v1.1.0 numbers
Update deprecated benchmarks directory documentation to reflect verified performance:

- SPSC: 558M micro (not 615M), ~35M realistic threaded
- MPSC: Added 15M/8.5M/5.3M for 2/4/8 producers
- Latency: 20ns p50, 31ns p99 (not 30ns p50)
- Burst: 385M (not 300M), 18% variance
- Buffer optimal: 4096 slots (not 2048)

Even though this directory is deprecated and redirects to tests/performance/, the documentation should still show accurate verified numbers to avoid confusion.
1 parent 5d478ff commit 1ba26df
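
The README hunk in this commit lists a latency of ~1.8 ns/op next to the 558M ops/sec micro-benchmark figure (previously ~1.7 ns/op next to 600M+). A minimal sketch of that conversion, assuming the ns/op value is simply the reciprocal of throughput (the commit does not state how it was derived; `ns_per_op` is a hypothetical helper, not part of the benchmark suite):

```python
def ns_per_op(ops_per_sec: float) -> float:
    """Amortized time per operation in nanoseconds (assumed: 1 / throughput)."""
    return 1e9 / ops_per_sec


# Throughput figures quoted in the diff, in ops/sec.
figures = {
    "old peak (600M)": 600e6,
    "new peak (558M)": 558e6,
    "new average (551M)": 551e6,
    "realistic threaded (~35M)": 35e6,
}

for label, rate in figures.items():
    print(f"{label}: {ns_per_op(rate):.2f} ns/op")
```

Under that assumption the old and new table entries are internally consistent: 600M ops/sec rounds to 1.7 ns/op and 558M ops/sec rounds to 1.8 ns/op. This amortized figure is a different quantity from the 20ns p50 reported by `benchmark_latency.nim`.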

2 files changed: +33 -17 lines changed
benchmarks/README.md

Lines changed: 19 additions & 12 deletions
@@ -16,13 +16,14 @@ We've improved our benchmarking approach:
 
 | Benchmark | What It Measures | Reference |
 |-----------|------------------|------------|
-| `benchmark_spsc_simple.nim` | Raw throughput (615M ops/sec) | Go channels |
-| `benchmark_latency.nim` | Latency distribution (30ns p50) | Tokio/Cassandra |
-| `benchmark_burst.nim` | Burst stability (300M ops/sec) | Redis |
-| `benchmark_sizes.nim` | Optimal buffer size (2048 slots) | LMAX Disruptor |
+| `benchmark_spsc_simple.nim` | Raw throughput (558M micro, ~35M realistic) | Go channels |
+| `benchmark_latency.nim` | Latency distribution (20ns p50, 31ns p99) | Tokio/Cassandra |
+| `benchmark_burst.nim` | Burst stability (385M ops/sec, 18% variance) | Redis |
+| `benchmark_sizes.nim` | Optimal buffer size (4096 slots, 557M ops/sec) | LMAX Disruptor |
 | `benchmark_stress.nim` | Maximum load (0% contention) | JMeter/Gatling |
 | `benchmark_sustained.nim` | Long-duration stability | Cassandra/ScyllaDB |
 | `benchmark_concurrent.nim` | Async overhead (512K ops/sec) | Async runtimes |
+| `benchmark_mpsc.nim` | MPSC performance (15M/8.5M/5.3M ops/sec) | JCTools MPSC |
 
 ## Quick Start
 
@@ -49,16 +50,17 @@ Benchmarks run automatically on every commit:
 
 **Latest benchmarks** (automated CI + local verification):
 
-### Simple Single-Threaded Benchmark
+### Simple Single-Threaded Benchmark (SPSC)
 Location: `tests/performance/benchmark_spsc_simple.nim`
 
 | Metric | Result |
 |--------|--------|
-| **Peak Throughput** | 600M+ ops/sec |
-| **Average Throughput** | 593M+ ops/sec |
-| **Latency** | ~1.7 ns/op |
+| **Peak Throughput (micro)** | 558M ops/sec |
+| **Average Throughput (micro)** | 551M ops/sec |
+| **Realistic Threaded** | ~35M ops/sec |
+| **Latency** | ~1.8 ns/op |
 
-**What this measures**: Raw SPSC channel performance without threading or async overhead.
+**What this measures**: Raw SPSC channel performance. Micro-benchmark shows peak potential (tight loop), realistic threaded includes OS scheduling overhead.
 
 ### Concurrent Async Benchmark
 Location: `tests/performance/benchmark_concurrent.nim`
@@ -75,9 +77,14 @@ Location: `tests/performance/benchmark_concurrent.nim`
 
 | Benchmark Type | Throughput | Use Case |
 |----------------|------------|----------|
-| **Simple (trySend/tryReceive)** | 600M+ ops/sec | Maximum performance, tight loops |
-| **Async (send/recv)** | 500K ops/sec | Convenience, async/await code |
-| **Multi-threaded** | 50M-200M ops/sec | Thread coordination overhead |
+| **SPSC micro (trySend/tryReceive)** | 558M ops/sec | Peak potential, tight loops |
+| **SPSC realistic threaded** | ~35M ops/sec | Actual multi-threaded workloads |
+| **MPSC (2 producers)** | 15M ops/sec | Multi-producer concurrent |
+| **MPSC (4 producers)** | 8.5M ops/sec | High concurrency |
+| **MPSC (8 producers)** | 5.3M ops/sec | Memory-bandwidth limited |
+| **Async (send/recv)** | 512K ops/sec | Convenience, async/await code |
+
+**Key insight**: SPSC is 3.5× faster than MPSC in realistic threaded workloads (35M vs 10M ops/sec).
 
 ### 2. Stress Tests
 
benchmarks/REPRODUCING.md

Lines changed: 14 additions & 5 deletions
@@ -6,15 +6,24 @@
 
 ## Latest Results (New Suite)
 
-**Comprehensive Benchmark Suite** - 7 industry-standard tests:
-- **Throughput**: 615M ops/sec peak
-- **Latency**: 30ns p50, 31ns p99
-- **Burst Load**: 300M ops/sec average, 21% variance
-- **Buffer Optimization**: 2048 slots optimal, 559M ops/sec
+**Comprehensive Benchmark Suite** - 8 industry-standard tests (verified in CI):
+
+**SPSC Benchmarks:**
+- **Throughput (micro)**: 558M ops/sec peak, 551M average
+- **Throughput (realistic)**: ~35M ops/sec with thread scheduling
+- **Latency**: 20ns p50, 31ns p99, 50ns p99.9
+- **Burst Load**: 385M ops/sec average, 18% variance
+- **Buffer Optimization**: 4096 slots optimal, 557M ops/sec
 - **Stress Test**: 0% contention at 500K operations
 - **Sustained**: Stable performance over 10 seconds
 - **Async**: 512K ops/sec (shows async overhead)
 
+**MPSC Benchmarks:**
+- **2 producers**: 15M ops/sec (optimal sweet spot)
+- **4 producers**: 8.5M ops/sec (good scalability)
+- **8 producers**: 5.3M ops/sec (memory-bandwidth limited)
+- **Key finding**: SPSC is 3.5× faster in realistic threaded workloads
+
 ## Why the New Suite?
 
 The new benchmark suite follows industry best practices from:
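
Both changed files state that SPSC is 3.5× faster than MPSC, and the README compares ~35M against 10M ops/sec, but the diff does not say how the 10M MPSC baseline was obtained. A minimal sketch, assuming (and this is only an assumption) that the baseline is roughly the mean of the three MPSC results quoted in the diff:

```python
# Figures quoted in the diff, in ops/sec.
spsc_realistic = 35e6
mpsc = {2: 15e6, 4: 8.5e6, 8: 5.3e6}  # producer count -> throughput

# Assumed baseline: arithmetic mean of the three MPSC results.
mpsc_mean = sum(mpsc.values()) / len(mpsc)
print(f"mean MPSC throughput: {mpsc_mean / 1e6:.1f}M ops/sec")
print(f"SPSC / mean MPSC: {spsc_realistic / mpsc_mean:.2f}x")

# Ratio against each individual producer count.
for producers, rate in mpsc.items():
    print(f"SPSC / MPSC ({producers} producers): {spsc_realistic / rate:.2f}x")
```

Under this reading the mean is 9.6M ops/sec, which rounds to the quoted 10M, and 35M / 10M gives the stated 3.5×. Against the individual configurations the ratio ranges from about 2.3× (2 producers) to about 6.6× (8 producers), so the single 3.5× figure is best read as an approximate summary.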
