1 | | -# ProtoMQ Benchmarks |
2 | | - |
3 | | -This directory contains the ProtoMQ benchmark suite for measuring performance across various scenarios. |
4 | | - |
5 | | -## Directory Structure |
6 | | - |
7 | | -``` |
8 | | -benchmarks/ |
9 | | -├── common/protomq_benchmarks/ # Shared benchmark library |
10 | | -│ ├── environment.py # System environment detection |
11 | | -│ ├── thresholds.py # Threshold validation |
12 | | -│ ├── metrics.py # Measurement utilities |
13 | | -│ └── runner.py # BenchmarkRunner |
14 | | -├── b1-baseline-concurrency/ # B1: Baseline concurrency test |
15 | | -│ ├── benchmark.py |
16 | | -│ ├── thresholds.json |
17 | | -│ └── README.md |
18 | | -├── results/ # All benchmark outputs (JSON) |
19 | | -└── benchmarks.md # Detailed benchmark plans (B1-B7) |
20 | | -``` |
21 | | - |
22 | | -## Running Benchmarks |
23 | | - |
24 | | -### Setup |
25 | | - |
26 | | -**One-time setup** (from benchmarks/ directory): |
27 | | -```bash |
28 | | -cd benchmarks |
29 | | -uv venv # Create virtual environment |
30 | | -uv pip install -e common/ # Install protomq_benchmarks library |
31 | | -uv pip install -e . # Install benchmarks package (creates console scripts) |
32 | | -``` |
33 | | - |
34 | | -This creates console scripts: |
35 | | -- `protomq-bench-b1` - Baseline concurrency benchmark |
36 | | -- `protomq-bench-b2` - Thundering herd benchmark |
37 | | - |
38 | | -### Running Benchmarks |
39 | | - |
40 | | -```bash |
41 | | -# Start server first |
42 | | -zig build run-server |
43 | | - |
44 | | -# Run benchmarks (from benchmarks/ directory with activated venv) |
45 | | -cd benchmarks |
46 | | -source .venv/bin/activate |
47 | | -protomq-bench-b1 |
48 | | -protomq-bench-b2 |
49 | | -``` |
50 | | - |
51 | | -Results are saved to `benchmarks/results/{commit_id}_{benchmark_name}.json` |
52 | | - |
53 | | -## Benchmark Library (`protomq_benchmarks`) |
54 | | - |
55 | | -### BenchmarkRunner |
56 | | - |
57 | | -Main interface for running benchmarks with automatic environment collection and threshold validation. |
58 | | - |
59 | | -```python |
60 | | -from protomq_benchmarks import BenchmarkRunner |
61 | | - |
62 | | -runner = BenchmarkRunner( |
63 | | - name="b1-baseline-concurrency", |
64 | | - version="1.0.0", |
65 | | - timeout_seconds=300 |
66 | | -) |
67 | | - |
68 | | -runner.register_thresholds_from_file("thresholds.json") |
69 | | - |
70 | | -@runner.benchmark |
71 | | -async def run_test(): |
72 | | - # Your benchmark logic |
73 | | - return {"metric1": value1, "metric2": value2} |
74 | | - |
75 | | -if __name__ == "__main__": |
76 | | - runner.run(output_dir="../results") |
77 | | -``` |
78 | | - |
79 | | -### Environment Detection |
80 | | - |
81 | | -Automatically collects: |
82 | | -- CPU model, architecture (normalized: aarch64 → arm64), cores, frequency |
83 | | -- RAM capacity |
84 | | -- Storage type and model (via `diskutil` on macOS, `/sys/block` on Linux) |
85 | | -- OS, kernel, Zig version, Python version |
86 | | -- Build mode (Release/Debug) |
87 | | -- ProtoMQ version and git commit hash |
88 | | -- Network backend (kqueue/epoll) |
89 | | - |
90 | | -### Threshold Management |
91 | | - |
92 | | -Define pass/warn/fail criteria with directional indicators: |
93 | | - |
94 | | -```json |
95 | | -{ |
96 | | - "p99_latency_ms": { |
97 | | - "direction": "lower", |
98 | | - "max": 5.0, |
99 | | - "warn": 1.0, |
100 | | - "description": "p99 latency threshold" |
101 | | - }, |
102 | | - "concurrent_connections": { |
103 | | - "direction": "higher", |
104 | | - "min": 100, |
105 | | - "description": "Must connect at least 100 clients" |
106 | | - } |
107 | | -} |
108 | | -``` |
109 | | - |
110 | | -- **`direction: "lower"`**: For metrics where lower is better (latency, memory) |
111 | | -- **`direction: "higher"`**: For metrics where higher is better (throughput, connections) |
112 | | - |
113 | | -### Metrics Utilities |
114 | | - |
115 | | -```python |
116 | | -from protomq_benchmarks import Timer, measure_memory |
117 | | -from protomq_benchmarks.metrics import LatencyStats |
118 | | - |
119 | | -# Measure time |
120 | | -with Timer() as t: |
121 | | - await some_operation() |
122 | | -print(f"Elapsed: {t.elapsed_ms()}ms") |
123 | | - |
124 | | -# Measure memory |
125 | | -memory_mb = measure_memory(server_pid) |
126 | | - |
127 | | -# Calculate latency statistics |
128 | | -stats = LatencyStats.from_measurements(latencies) |
129 | | -print(f"p99: {stats.p99:.3f}ms") |
130 | | -``` |
131 | | - |
132 | | -## Result Format |
133 | | - |
134 | | -Each benchmark produces a JSON file: `{commit_id}_{benchmark_name}.json` |
135 | | - |
136 | | -```json |
137 | | -{ |
138 | | - "benchmark": { |
139 | | - "name": "b1-baseline-concurrency", |
140 | | - "version": "1.0.0", |
141 | | - "timestamp": "2026-01-24T13:45:00Z", |
142 | | - "duration_s": 1.07 |
143 | | - }, |
144 | | - "environment": { |
145 | | - "hardware": {...}, |
146 | | - "software": {...}, |
147 | | - "protomq": {"commit_hash": "72144c15", ...} |
148 | | - }, |
149 | | - "metrics": { |
150 | | - "concurrent_connections": 100, |
151 | | - "p99_latency_ms": 0.432, |
152 | | - ... |
153 | | - }, |
154 | | - "thresholds": { |
155 | | - "passed": true, |
156 | | - "warnings": [], |
157 | | - "failures": [] |
158 | | - } |
159 | | -} |
160 | | -``` |
161 | | - |
162 | | -## Creating New Benchmarks |
163 | | - |
164 | | -1. Create directory: `benchmarks/bN-benchmark-name/` |
165 | | -2. Create `benchmark.py`: |
166 | | - ```python |
167 | | - from pathlib import Path |
168 | | - from protomq_benchmarks import BenchmarkRunner |
169 | | - |
170 | | - runner = BenchmarkRunner(name="bN-benchmark-name", timeout_seconds=600) |
171 | | - runner.register_thresholds_from_file(Path(__file__).parent / "thresholds.json") |
172 | | - |
173 | | - @runner.benchmark |
174 | | - async def run_test(): |
175 | | - # Your test logic |
176 | | - return {"metric": value} |
177 | | - |
178 | | - if __name__ == "__main__": |
179 | | - runner.run(output_dir=Path(__file__).parent.parent / "results") |
| 1 | +# ProtoMQ Benchmarking |
| 2 | + |
| 3 | +The main goal of the ProtoMQ project is to provide a high-performance, type-safe MQTT server implementation in Zig. To verify that the server keeps meeting this goal, we benchmark it regularly: run `protomq-bench-b1` after each commit to catch performance regressions, and the full suite before every release. |
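| | + |
| | +A benchmark signals threshold failures through its exit code (`0`: all thresholds passed, `1`: a threshold failed or the benchmark errored), so the per-commit check is easy to script. A minimal sketch, assuming the console scripts from the setup steps below are installed: |
| | + |
| | +```bash |
| | +protomq-bench-b1 |
| | +if [ $? -ne 0 ]; then |
| | +    echo "Benchmark failed thresholds" |
| | +    exit 1 |
| | +fi |
| | +``` |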
| 4 | + |
| 5 | +## Regular Test Environments |
| 6 | + |
| 7 | +macOS: |
| 8 | +- **CPU**: Apple M2 Pro |
| 9 | +- **OS**: macOS 26.2 (Darwin kernel 25.2.0) |
| 10 | +- **Backend**: kqueue |
| 11 | +- **Zig Version**: 0.15.2 |
| 12 | + |
| 13 | +Linux: |
| 14 | +- **CPU**: ARM Cortex-A76 (Raspberry Pi 5) |
| 15 | +- **OS**: Debian, kernel 6.6.62+rpt-rpi-2712 (package 1:6.6.62-1+rpt1, 2024-11-25), aarch64 |
| 16 | +- **Backend**: epoll |
| 17 | +- **Zig Version**: 0.15.2 |
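| | + |
| | +Both environments are detected automatically and embedded in every result file under the "environment" key. An abbreviated sketch for the macOS machine, using the top-level field names from the result format (the nested field names are illustrative): |
| | + |
| | +```json |
| | +"environment": { |
| | +  "hardware": {"cpu_model": "Apple M2 Pro", ...}, |
| | +  "software": {"os": "macOS 26.2", "zig_version": "0.15.2", ...}, |
| | +  "protomq": {"commit_hash": "72144c15", "network_backend": "kqueue", ...} |
| | +} |
| | +``` |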
| 18 | + |
| 19 | +## Results |
| 20 | + |
| 21 | +Each benchmark run is saved under the "results" directory ("protomq/benchmarks/results"), inside a subdirectory specific to the hardware. Each result is a JSON file named after the benchmark and the commit ID of the repository, and it holds the value of every metric the benchmark defines along with the environment configuration. |
| 22 | + |
| 23 | +The most recent results are available under the name "latest" inside each hardware directory, as in the layout sketched below. |
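| | + |
| | +An illustrative layout (hardware directory names are hypothetical; result files follow the `{commit_id}_{benchmark_name}.json` pattern): |
| | + |
| | +``` |
| | +results/ |
| | +├── apple-m2-pro/ |
| | +│   ├── 72144c15_b1-baseline-concurrency.json |
| | +│   └── latest |
| | +└── raspberry-pi-5/ |
| | +    └── ... |
| | +``` |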
| 24 | + |
| 25 | +### Overall Summary (2026-01-25) |
| 26 | + |
| 27 | +| Test Scenario | Metric | Apple M2 Pro | Raspberry Pi 5 | |
| 28 | +|--------------|--------|--------------|----------------| |
| 29 | +| **100 concurrent connections** | p99 latency | 0.44 ms | 0.13 ms | |
| 30 | +| | Memory usage | 2.6 MB | 2.5 MB | |
| 31 | +| **10,000 concurrent clients** | Connection time | 0.96 s | 1.76 s | |
| 32 | +| | Message fan-out | 0.12 s | 0.21 s | |
| 33 | +| | Message loss | 0% | 0% | |
| 34 | +| **Sustained load (10 min)** | Throughput | 8,981 msg/s | 9,012 msg/s | |
| 35 | +| | Memory growth | 0.16 MB | 0.09 MB | |
| 36 | +| **Wildcard subscriptions** | Topic matching | 7.2 µs | 5.2 µs | |
| 37 | +| | 1000 subscribers | 100% correct | 100% correct | |
| 38 | +| **Connection churn** | Total connections | 100,000 | 100,000 | |
| 39 | +| | Connection rate | 1,496 conn/s | 1,548 conn/s | |
| 40 | +| | Memory leak | 0 MB | 0 MB | |
| 41 | +| **Message throughput** | 10-byte messages | 208k msg/s | 147k msg/s | |
| 42 | +| | 64 KB messages | 39k msg/s | 27k msg/s | |
| 43 | + |
| 44 | +**Notes:** |
| 45 | +- All tests run over the loopback interface. |
| 46 | +- Server built with Zig 0.15.2, ReleaseSafe mode. |
| 47 | +- Raspberry Pi 5 shows competitive performance, especially in sustained throughput and topic matching. |
| 48 | + |
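| | +The latency figures above are percentile statistics computed by the shared `protomq_benchmarks` library in `common/`. A minimal sketch using the library's metrics utilities (sample values illustrative): |
| | + |
| | +```python |
| | +from protomq_benchmarks.metrics import LatencyStats |
| | + |
| | +# Per-message round-trip times in milliseconds (illustrative values). |
| | +latencies = [0.41, 0.39, 0.52, 0.47] |
| | +stats = LatencyStats.from_measurements(latencies) |
| | +print(f"p99: {stats.p99:.3f}ms") |
| | +``` |
| | + |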
| 49 | +## Reproducing the Results |
| 50 | +1. Start the server: |
| 51 | +   ```bash |
| 52 | +   zig build -Doptimize=ReleaseSafe run-server |
| 53 | +   ``` |
| 54 | +2. Create a virtual environment, activate it, and install the benchmark suite (from the `benchmarks/` directory): |
| 55 | +   ```bash |
| | +   cd benchmarks |
| 56 | +   python3 -m venv venv |
| | +   source venv/bin/activate |
| 57 | +   pip install -e ./common |
| 58 | +   pip install -e . |
| 59 | +   ``` |
| 60 | +3. Run any benchmark: |
| 61 | +   ```bash |
| 62 | +   source benchmarks/venv/bin/activate  # if not already active |
| 63 | +   protomq-bench-b1 |
| 64 | +   # protomq-bench-b2 |
| 65 | +   # protomq-bench-b3 |
| 66 | +   # ... |
180 | 67 | ``` |
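| | + |
| | +Each console script is a thin entry point built on the shared `protomq_benchmarks` library in `common/`. A skeleton mirroring the `BenchmarkRunner` API from that library (metric names and values illustrative): |
| | + |
| | +```python |
| | +from pathlib import Path |
| | + |
| | +from protomq_benchmarks import BenchmarkRunner |
| | + |
| | +runner = BenchmarkRunner(name="b1-baseline-concurrency", version="1.0.0", timeout_seconds=300) |
| | +runner.register_thresholds_from_file(Path(__file__).parent / "thresholds.json") |
| | + |
| | +@runner.benchmark |
| | +async def run_test(): |
| | +    # Benchmark logic: connect clients, exchange messages, collect metrics. |
| | +    return {"concurrent_connections": 100, "p99_latency_ms": 0.432} |
| | + |
| | +if __name__ == "__main__": |
| | +    runner.run(output_dir=Path(__file__).parent.parent / "results") |
| | +``` |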
181 | | -3. Create `thresholds.json` with metric criteria |
182 | | -4. Create `README.md` documenting the benchmark |
183 | | - |
184 | | -## Code Quality |
185 | | - |
186 | | -The benchmark suite uses `ruff` for linting and formatting (configured at project root): |
187 | | - |
188 | | -```bash |
189 | | -# Check code |
190 | | -ruff check benchmarks/ |
191 | | - |
192 | | -# Format code |
193 | | -ruff format benchmarks/ |
194 | | - |
195 | | -# Install pre-commit hooks |
196 | | -pre-commit install |
197 | | -``` |
198 | | - |
199 | | -All benchmarks must be PEP-8 compliant with: |
200 | | -- Module-level imports only (no `sys.path` hacks) |
201 | | -- Type hints where applicable |
202 | | -- Proper error handling |
203 | | -- No emojis in output (professional appearance) |
204 | | - |
205 | | -## Planned Benchmarks |
206 | | - |
207 | | -See `benchmarks.md` for detailed plans: |
208 | | - |
209 | | -- **B1**: Baseline Concurrency & Latency ✅ (implemented) |
210 | | -- **B2**: Thundering Herd (10k concurrent clients) |
211 | | -- **B3**: Sustained Throughput (10-minute stress test) |
212 | | -- **B4**: Wildcard Subscription Explosion |
213 | | -- **B5**: Protobuf Decoding Under Load |
214 | | -- **B6**: Connection Churn (rapid connect/disconnect) |
215 | | -- **B7**: Message Size Variations |
216 | | - |
217 | | -## CI/CD Integration |
218 | | - |
219 | | -Benchmarks can be integrated into CI/CD pipelines: |
220 | | - |
221 | | -```bash |
222 | | -# Run benchmark and check exit code |
223 | | -uv run b1-baseline-concurrency/benchmark.py |
224 | | -if [ $? -ne 0 ]; then |
225 | | - echo "Benchmark failed thresholds" |
226 | | - exit 1 |
227 | | -fi |
228 | | -``` |
229 | | - |
230 | | -Exit codes: |
231 | | -- `0`: All thresholds passed |
232 | | -- `1`: One or more thresholds failed or benchmark errored |