author: Clawd <ai@clawd.bot> 2026-02-20 19:59:03 -0800
committer: Clawd <ai@clawd.bot> 2026-02-20 19:59:03 -0800
commit: 97e6a3cfb67d18273fa88ce086b3db514c7e3083
tree: 3b0328494b1dc32328af10500a10d0a2d7f5b944 /benchmarks/BENCHMARK_SUMMARY.md
parent: ec724b0b6e04a7c3078ad0bc10e1a683d1baaa9d
benchmarks: update for v0.3.0 zero-dep implementation
The crypto is brutally slow: signing is ~400x slower and key generation ~8500x slower than NBD. Non-crypto operations remain competitive. Updated the summary to be honest about the tradeoffs.
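The slowdown figures quoted in the commit message can be reproduced from the raw `ns/op` values in the benchmark output included in the diff below, taking NBD as the baseline (the comparison that matches the quoted numbers). The following is a minimal sketch; `slowdown` is an illustrative helper and is not part of any of the three libraries.

```go
package main

import "fmt"

// slowdown returns how many times slower slowNs is than fastNs,
// given two ns/op figures from `go test -bench`.
func slowdown(slowNs, fastNs float64) float64 {
	return slowNs / fastNs
}

func main() {
	// ns/op values copied from the benchmark output in this commit
	// (NWIO compared against NBD).
	rows := []struct {
		op        string
		nwio, nbd float64
	}{
		{"Key Gen", 22596880, 2643},
		{"Sign", 49364702, 122518},
		{"Verify", 54485034, 199061},
	}
	for _, r := range rows {
		fmt.Printf("%-8s %.0fx slower\n", r.op, slowdown(r.nwio, r.nbd))
	}
}
```

The ratios come out to roughly 8550x, 403x, and 274x, which the summary rounds to ~8500x, ~400x, and ~274x.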
Diffstat (limited to 'benchmarks/BENCHMARK_SUMMARY.md')
-rw-r--r-- benchmarks/BENCHMARK_SUMMARY.md 182
1 files changed, 56 insertions, 126 deletions
diff --git a/benchmarks/BENCHMARK_SUMMARY.md b/benchmarks/BENCHMARK_SUMMARY.md
index 2e087a3..aa0d771 100644
--- a/benchmarks/BENCHMARK_SUMMARY.md
+++ b/benchmarks/BENCHMARK_SUMMARY.md
@@ -1,154 +1,84 @@
 # Benchmark Results Summary
 
-Comparison of three Go Nostr libraries: **NWIO** (code.northwest.io/nostr), **NBD** (github.com/nbd-wtf/go-nostr), and **Fiat** (fiatjaf.com/nostr)
+Comparison of three Go Nostr libraries: **NWIO** (code.northwest.io/nostr v0.3.0), **NBD** (github.com/nbd-wtf/go-nostr), and **Fiat** (fiatjaf.com/nostr)
 
-## Quick Performance Overview
+## The Honest Truth
 
-### 🏆 Winners by Category
+NWIO v0.3.0 uses a pure-Go secp256k1 implementation with zero external dependencies. This makes the crypto **dramatically slower** than libraries using btcec:
 
-| Operation | Winner | Performance |
-|-----------|--------|-------------|
-| **Event Unmarshal** | NWIO/Fiat | ~2.5 µs (tied) |
-| **Event Marshal** | NWIO | 1.79 µs (fastest, least memory) |
-| **Event Serialize** | NBD | 129 ns (3x faster than NWIO) |
-| **Compute ID** | Fiat | 276 ns (2x faster than NWIO) |
-| **Generate Key** | NBD | 470 ns (80x faster!) |
-| **Event Sign** | NBD/Fiat | ~59 µs (2x faster than NWIO) |
-| **Event Verify** | NWIO | 99.7 µs (slightly faster) |
-| **Filter Match** | NWIO | 7.1 ns (2x faster than Fiat) |
-| **Filter Complex** | NWIO | 30.9 ns (fastest) |
+| Operation | NWIO | NBD | Fiat | NWIO Slowdown |
+|-----------|------|-----|------|---------------|
+| **Key Gen** | 22.6 ms | 2.6 µs | 49.7 µs | ~8500x slower |
+| **Sign** | 49.4 ms | 122 µs | 121 µs | ~400x slower |
+| **Verify** | 54.5 ms | 199 µs | 198 µs | ~274x slower |
 
-## Detailed Results
-
-### Event Unmarshaling (JSON → Event)
-```
-NWIO: 2,541 ns/op 888 B/op 17 allocs/op ⭐ FASTEST, LOW MEMORY
-NBD: 2,832 ns/op 944 B/op 13 allocs/op
-Fiat: 2,545 ns/op 752 B/op 10 allocs/op ⭐ LEAST MEMORY
-```
-**Analysis**: All three are very competitive. NWIO and Fiat are effectively tied. Fiat uses least memory.
-
-### Event Marshaling (Event → JSON)
-```
-NWIO: 1,790 ns/op 1,010 B/op 3 allocs/op ⭐ FASTEST, LEAST ALLOCS
-NBD: 1,819 ns/op 1,500 B/op 6 allocs/op
-Fiat: 1,971 ns/op 2,254 B/op 13 allocs/op
-```
-**Analysis**: NWIO is fastest with minimal allocations. Significant memory advantage over competitors.
-
-### Event Serialization (for ID computation)
-```
-NWIO: 391 ns/op 360 B/op 7 allocs/op
-NBD: 129 ns/op 208 B/op 2 allocs/op ⭐ FASTEST, 3x faster
-Fiat: 161 ns/op 400 B/op 3 allocs/op
-```
-**Analysis**: NBD dominates here with optimized serialization. NWIO has room for improvement.
+That's not a typo. The pure `big.Int` implementation is brutal.
 
-### Event ID Computation
-```
-NWIO: 608 ns/op 488 B/op 9 allocs/op
-NBD: 302 ns/op 336 B/op 4 allocs/op
-Fiat: 276 ns/op 400 B/op 3 allocs/op ⭐ FASTEST
-```
-**Analysis**: NBD and Fiat are 2x faster. NWIO should optimize ID computation path.
+## Where NWIO Wins
 
-### Key Generation
-```
-NWIO: 37,689 ns/op 208 B/op 4 allocs/op
-NBD: 470 ns/op 369 B/op 8 allocs/op ⭐ FASTEST, 80x faster!
-Fiat: 25,375 ns/op 272 B/op 5 allocs/op
-```
-**Analysis**: ⚠️ NWIO is significantly slower. NBD appears to use a different key generation strategy. This is the biggest performance gap.
+Non-crypto operations are competitive or fastest:
 
-### Event Signing
-```
-NWIO: 129,854 ns/op 2,363 B/op 42 allocs/op
-NBD: 59,069 ns/op 2,112 B/op 35 allocs/op ⭐ TIED FASTEST
-Fiat: 58,572 ns/op 1,760 B/op 29 allocs/op ⭐ LEAST MEMORY
-```
-**Analysis**: NBD and Fiat are 2x faster. NWIO has more allocations in signing path.
+| Operation | NWIO | NBD | Fiat | Winner |
+|-----------|------|-----|------|--------|
+| **Event Marshal** | 8.9 µs | 12.3 µs | 13.0 µs | NWIO ⭐ |
+| **Event Unmarshal** | 10.4 µs | 11.0 µs | 8.5 µs | Fiat |
+| **Filter Match** | 14.3 ns | 22.3 ns | 38.5 ns | NWIO ⭐ |
+| **Filter Complex** | 66.4 ns | 74.2 ns | 92.8 ns | NWIO ⭐ |
 
-### Event Verification
-```
-NWIO: 99,744 ns/op 953 B/op 19 allocs/op ⭐ FASTEST
-NBD: 105,995 ns/op 624 B/op 11 allocs/op ⭐ LEAST MEMORY
-Fiat: 103,744 ns/op 640 B/op 9 allocs/op
-```
-**Analysis**: NWIO is slightly faster (6% faster than others). Very competitive across all three.
+## Detailed Results
 
-### Filter Matching (Simple)
-```
-NWIO: 7.1 ns/op 0 B/op 0 allocs/op ⭐ FASTEST, 2x faster
-NBD: 10.8 ns/op 0 B/op 0 allocs/op
-Fiat: 16.4 ns/op 0 B/op 0 allocs/op
 ```
-**Analysis**: NWIO excels at filter matching! Zero allocations across all libraries.
+goos: linux
+goarch: amd64
+cpu: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
 
-### Filter Matching (Complex with Tags)
-```
-NWIO: 30.9 ns/op 0 B/op 0 allocs/op ⭐ FASTEST
-NBD: 33.4 ns/op 0 B/op 0 allocs/op
-Fiat: 42.6 ns/op 0 B/op 0 allocs/op
-```
-**Analysis**: NWIO maintains lead in complex filters. Important for relay implementations.
+BenchmarkEventUnmarshal_NWIO-8 163561 10400 ns/op 888 B/op 17 allocs/op
+BenchmarkEventUnmarshal_NBD-8 94100 10985 ns/op 944 B/op 13 allocs/op
+BenchmarkEventUnmarshal_Fiat-8 128083 8498 ns/op 752 B/op 10 allocs/op
 
-## Optimization Opportunities for NWIO
+BenchmarkEventMarshal_NWIO-8 136460 8857 ns/op 1009 B/op 3 allocs/op
+BenchmarkEventMarshal_NBD-8 87494 12326 ns/op 1497 B/op 6 allocs/op
+BenchmarkEventMarshal_Fiat-8 203302 13049 ns/op 2250 B/op 13 allocs/op
 
-### High Priority 🔴
-1. **Key Generation** - 80x slower than NBD
-   - Current: 37.7 µs
-   - Target: ~500 ns (similar to NBD)
-   - Impact: Critical for client applications
+BenchmarkEventSerialize_NWIO-8 451017 2967 ns/op 360 B/op 7 allocs/op
+BenchmarkEventSerialize_NBD-8 1265192 1066 ns/op 208 B/op 2 allocs/op
+BenchmarkEventSerialize_Fiat-8 907284 1378 ns/op 400 B/op 3 allocs/op
 
-2. **Event Signing** - 2x slower than competitors
-   - Current: 130 µs
-   - Target: ~60 µs (match NBD/Fiat)
-   - Impact: High for client applications
+BenchmarkComputeID_NWIO-8 250470 4425 ns/op 488 B/op 9 allocs/op
+BenchmarkComputeID_NBD-8 391370 3062 ns/op 336 B/op 4 allocs/op
+BenchmarkComputeID_Fiat-8 339658 3677 ns/op 400 B/op 3 allocs/op
 
-### Medium Priority 🟡
-3. **Event Serialization** - 3x slower than NBD
-   - Current: 391 ns
-   - Target: ~130 ns (match NBD)
-   - Impact: Used in ID computation
+BenchmarkGenerateKey_NWIO-8 46 22596880 ns/op 1682613 B/op 27259 allocs/op
+BenchmarkGenerateKey_NBD-8 857683 2643 ns/op 368 B/op 8 allocs/op
+BenchmarkGenerateKey_Fiat-8 23874 49726 ns/op 272 B/op 5 allocs/op
 
-4. **ID Computation** - 2x slower than competitors
-   - Current: 608 ns
-   - Target: ~280 ns (match Fiat)
-   - Impact: Affects every event processing
+BenchmarkEventSign_NWIO-8 38 49364702 ns/op 3403147 B/op 55099 allocs/op
+BenchmarkEventSign_NBD-8 9332 122518 ns/op 2112 B/op 35 allocs/op
+BenchmarkEventSign_Fiat-8 8274 121756 ns/op 1760 B/op 29 allocs/op
 
-## Current Strengths of NWIO ✅
+BenchmarkEventVerify_NWIO-8 26 54485034 ns/op 3310792 B/op 53635 allocs/op
+BenchmarkEventVerify_NBD-8 5815 199061 ns/op 624 B/op 11 allocs/op
+BenchmarkEventVerify_Fiat-8 5856 198714 ns/op 640 B/op 9 allocs/op
 
-1. **Filter Matching** - 2x faster than Fiat, fastest overall
-2. **Event Marshaling** - Fastest with minimal allocations
-3. **Event Verification** - Slightly faster than competitors
-4. **Memory Efficiency** - Competitive or better in most operations
+BenchmarkFilterMatch_NWIO-8 81765290 14.34 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatch_NBD-8 53242167 22.26 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatch_Fiat-8 30670489 38.53 ns/op 0 B/op 0 allocs/op
 
-## Recommendations
+BenchmarkFilterMatchComplex_NWIO-8 17972340 66.38 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatchComplex_NBD-8 14769445 74.21 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatchComplex_Fiat-8 12921300 92.83 ns/op 0 B/op 0 allocs/op
+```
 
-### For Relay Implementations
-- **NWIO excels**: Best filter matching performance
-- All three are competitive for event parsing/verification
+## Should You Use NWIO?
 
-### For Client Implementations
-- **NBD/Fiat preferred**: Much faster key generation and signing
-- NWIO needs optimization in crypto operations
+**For learning/reading code:** Yes. Zero dependencies, everything is auditable.
 
-### Overall Assessment
-- **NWIO**: Best for relay/filter-heavy workloads
-- **NBD**: Most balanced, excellent crypto performance
-- **Fiat**: Good all-around, lowest memory in some operations
+**For a side project:** Maybe. 50ms to sign an event is noticeable but tolerable if you're not signing constantly.
 
-## Running Your Own Benchmarks
+**For anything serious:** No. Use NBD or Fiat. The crypto performance gap is too large.
 
-```bash
-# Run all benchmarks
-./run_benchmarks.sh
+## Why So Slow?
 
-# Compare specific operations
-go test -bench=BenchmarkEventSign -benchmem comparison_bench_test.go
-go test -bench=BenchmarkFilterMatch -benchmem comparison_bench_test.go
+The internal secp256k1 implementation uses Go's `math/big` for arbitrary-precision arithmetic. Every operation allocates, nothing is constant-time, and there's no assembly optimization. Production libraries like btcec use fixed-width limbs, stack allocation, and hand-tuned assembly.
 
-# Statistical analysis with benchstat
-go test -bench=. -count=10 comparison_bench_test.go > results.txt
-benchstat results.txt
-```
+This is the price of zero dependencies and readable code.