benchmarks: update for v0.3.0 zero-dep implementation

The crypto is brutally slow compared with NBD (~400x slower for signing, ~8500x slower for key gen). Non-crypto operations remain competitive. Updated the summary to be honest about the tradeoffs.
author: Clawd <ai@clawd.bot> 2026-02-20 19:59:03 -0800
committer: Clawd <ai@clawd.bot> 2026-02-20 19:59:03 -0800
commit: 97e6a3cfb67d18273fa88ce086b3db514c7e3083 (patch)
tree: 3b0328494b1dc32328af10500a10d0a2d7f5b944 /benchmarks/BENCHMARK_SUMMARY.md
parent: ec724b0b6e04a7c3078ad0bc10e1a683d1baaa9d (diff)
1 files changed, 56 insertions, 126 deletions
diff --git a/benchmarks/BENCHMARK_SUMMARY.md b/benchmarks/BENCHMARK_SUMMARY.md
index 2e087a3..aa0d771 100644
--- a/benchmarks/BENCHMARK_SUMMARY.md
+++ b/benchmarks/BENCHMARK_SUMMARY.md
@@ -1,154 +1,84 @@
 # Benchmark Results Summary
-Comparison of three Go Nostr libraries: **NWIO** (code.northwest.io/nostr), **NBD** (github.com/nbd-wtf/go-nostr), and **Fiat** (fiatjaf.com/nostr)
+Comparison of three Go Nostr libraries: **NWIO** (code.northwest.io/nostr v0.3.0), **NBD** (github.com/nbd-wtf/go-nostr), and **Fiat** (fiatjaf.com/nostr)
-## Quick Performance Overview
+## The Honest Truth
-### 🏆 Winners by Category
+NWIO v0.3.0 uses a pure-Go secp256k1 implementation with zero external dependencies. This makes the crypto **dramatically slower** than libraries using btcec:
-| Operation | Winner | Performance |
+| Operation | NWIO | NBD | Fiat | NWIO Slowdown (vs NBD) |
-|-----------|--------|-------------|
+|-----------|------|-----|------|---------------|
-| **Event Unmarshal** | NWIO/Fiat | ~2.5 µs (tied) |
+| **Key Gen** | 22.6 ms | 2.6 µs | 49.7 µs | ~8500x slower |
-| **Event Marshal** | NWIO | 1.79 µs (fastest, least memory) |
+| **Sign** | 49.4 ms | 122 µs | 121 µs | ~400x slower |
-| **Event Serialize** | NBD | 129 ns (3x faster than NWIO) |
+| **Verify** | 54.5 ms | 199 µs | 198 µs | ~274x slower |
-| **Compute ID** | Fiat | 276 ns (2x faster than NWIO) |
-| **Generate Key** | NBD | 470 ns (80x faster!) |
-| **Event Sign** | NBD/Fiat | ~59 µs (2x faster than NWIO) |
-| **Event Verify** | NWIO | 99.7 µs (slightly faster) |
-| **Filter Match** | NWIO | 7.1 ns (2x faster than Fiat) |
-| **Filter Complex** | NWIO | 30.9 ns (fastest) |
-## Detailed Results
+That's not a typo. The pure `big.Int` implementation is brutal.
-### Event Unmarshaling (JSON → Event)
-```
-NWIO:  2,541 ns/op    888 B/op    17 allocs/op  ⭐ FASTEST, LOW MEMORY
-NBD:   2,832 ns/op    944 B/op    13 allocs/op
-Fiat:  2,545 ns/op    752 B/op    10 allocs/op  ⭐ LEAST MEMORY
-```
-**Analysis**: All three are very competitive. NWIO and Fiat are effectively tied. Fiat uses least memory.
-### Event Marshaling (Event → JSON)
-```
-NWIO:  1,790 ns/op  1,010 B/op     3 allocs/op  ⭐ FASTEST, LEAST ALLOCS
-NBD:   1,819 ns/op  1,500 B/op     6 allocs/op
-Fiat:  1,971 ns/op  2,254 B/op    13 allocs/op
-```
-**Analysis**: NWIO is fastest with minimal allocations. Significant memory advantage over competitors.
-### Event Serialization (for ID computation)
-```
-NWIO:  391 ns/op    360 B/op    7 allocs/op
-NBD:   129 ns/op    208 B/op    2 allocs/op  ⭐ FASTEST, 3x faster
-Fiat:  161 ns/op    400 B/op    3 allocs/op
-```
-**Analysis**: NBD dominates here with optimized serialization. NWIO has room for improvement.
-### Event ID Computation
+## Where NWIO Wins
-```
-NWIO:  608 ns/op    488 B/op    9 allocs/op
-NBD:   302 ns/op    336 B/op    4 allocs/op
-Fiat:  276 ns/op    400 B/op    3 allocs/op  ⭐ FASTEST
-```
-**Analysis**: NBD and Fiat are 2x faster. NWIO should optimize ID computation path.
-### Key Generation
+Most non-crypto operations are competitive or fastest (event serialization and ID computation still trail NBD in the detailed results):
-```
-NWIO:  37,689 ns/op    208 B/op    4 allocs/op
-NBD:     470 ns/op    369 B/op    8 allocs/op  ⭐ FASTEST, 80x faster!
-Fiat:  25,375 ns/op    272 B/op    5 allocs/op
-```
-**Analysis**: ⚠️ NWIO is significantly slower. NBD appears to use a different key generation strategy. This is the biggest performance gap.
-### Event Signing
+| Operation | NWIO | NBD | Fiat | Winner |
-```
+|-----------|------|-----|------|--------|
-NWIO:  129,854 ns/op  2,363 B/op   42 allocs/op
+| **Event Marshal** | 8.9 µs | 12.3 µs | 13.0 µs | NWIO ⭐ |
-NBD:    59,069 ns/op  2,112 B/op   35 allocs/op  ⭐ TIED FASTEST
+| **Event Unmarshal** | 10.4 µs | 11.0 µs | 8.5 µs | Fiat |
-Fiat:   58,572 ns/op  1,760 B/op   29 allocs/op  ⭐ LEAST MEMORY
+| **Filter Match** | 14.3 ns | 22.3 ns | 38.5 ns | NWIO ⭐ |
-```
+| **Filter Complex** | 66.4 ns | 74.2 ns | 92.8 ns | NWIO ⭐ |
-**Analysis**: NBD and Fiat are 2x faster. NWIO has more allocations in signing path.
-### Event Verification
+## Detailed Results
-```
-NWIO:   99,744 ns/op    953 B/op   19 allocs/op  ⭐ FASTEST
-NBD:   105,995 ns/op    624 B/op   11 allocs/op  ⭐ LEAST MEMORY
-Fiat:  103,744 ns/op    640 B/op    9 allocs/op
-```
-**Analysis**: NWIO is slightly faster (6% faster than others). Very competitive across all three.
-### Filter Matching (Simple)
-```
-NWIO:  7.1 ns/op    0 B/op    0 allocs/op  ⭐ FASTEST, 2x faster
-NBD:  10.8 ns/op    0 B/op    0 allocs/op
-Fiat: 16.4 ns/op    0 B/op    0 allocs/op
 ```
-**Analysis**: NWIO excels at filter matching! Zero allocations across all libraries.
+goos: linux
+goarch: amd64
+cpu: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
-### Filter Matching (Complex with Tags)
+BenchmarkEventUnmarshal_NWIO-8        163561     10400 ns/op      888 B/op    17 allocs/op
-```
+BenchmarkEventUnmarshal_NBD-8          94100     10985 ns/op      944 B/op    13 allocs/op
-NWIO:  30.9 ns/op    0 B/op    0 allocs/op  ⭐ FASTEST
+BenchmarkEventUnmarshal_Fiat-8        128083      8498 ns/op      752 B/op    10 allocs/op
-NBD:   33.4 ns/op    0 B/op    0 allocs/op
-Fiat:  42.6 ns/op    0 B/op    0 allocs/op
-```
-**Analysis**: NWIO maintains lead in complex filters. Important for relay implementations.
-## Optimization Opportunities for NWIO
+BenchmarkEventMarshal_NWIO-8          136460      8857 ns/op     1009 B/op     3 allocs/op
+BenchmarkEventMarshal_NBD-8            87494     12326 ns/op     1497 B/op     6 allocs/op
+BenchmarkEventMarshal_Fiat-8          203302     13049 ns/op     2250 B/op    13 allocs/op
-### High Priority 🔴
+BenchmarkEventSerialize_NWIO-8        451017      2967 ns/op      360 B/op     7 allocs/op
-1. **Key Generation** - 80x slower than NBD
+BenchmarkEventSerialize_NBD-8        1265192      1066 ns/op      208 B/op     2 allocs/op
-   - Current: 37.7 µs
+BenchmarkEventSerialize_Fiat-8        907284      1378 ns/op      400 B/op     3 allocs/op
-   - Target: ~500 ns (similar to NBD)
-   - Impact: Critical for client applications
-2. **Event Signing** - 2x slower than competitors
+BenchmarkComputeID_NWIO-8             250470      4425 ns/op      488 B/op     9 allocs/op
-   - Current: 130 µs
+BenchmarkComputeID_NBD-8              391370      3062 ns/op      336 B/op     4 allocs/op
-   - Target: ~60 µs (match NBD/Fiat)
+BenchmarkComputeID_Fiat-8             339658      3677 ns/op      400 B/op     3 allocs/op
-   - Impact: High for client applications
-### Medium Priority 🟡
+BenchmarkGenerateKey_NWIO-8               46  22596880 ns/op  1682613 B/op 27259 allocs/op
-3. **Event Serialization** - 3x slower than NBD
+BenchmarkGenerateKey_NBD-8            857683      2643 ns/op      368 B/op     8 allocs/op
-   - Current: 391 ns
+BenchmarkGenerateKey_Fiat-8            23874     49726 ns/op      272 B/op     5 allocs/op
-   - Target: ~130 ns (match NBD)
-   - Impact: Used in ID computation
-4. **ID Computation** - 2x slower than competitors
+BenchmarkEventSign_NWIO-8                 38  49364702 ns/op  3403147 B/op 55099 allocs/op
-   - Current: 608 ns
+BenchmarkEventSign_NBD-8                9332    122518 ns/op     2112 B/op    35 allocs/op
-   - Target: ~280 ns (match Fiat)
+BenchmarkEventSign_Fiat-8               8274    121756 ns/op     1760 B/op    29 allocs/op
-   - Impact: Affects every event processing
-## Current Strengths of NWIO ✅
+BenchmarkEventVerify_NWIO-8               26  54485034 ns/op  3310792 B/op 53635 allocs/op
+BenchmarkEventVerify_NBD-8              5815    199061 ns/op      624 B/op    11 allocs/op
+BenchmarkEventVerify_Fiat-8             5856    198714 ns/op      640 B/op     9 allocs/op
-1. **Filter Matching** - 2x faster than Fiat, fastest overall
+BenchmarkFilterMatch_NWIO-8         81765290     14.34 ns/op        0 B/op     0 allocs/op
-2. **Event Marshaling** - Fastest with minimal allocations
+BenchmarkFilterMatch_NBD-8          53242167     22.26 ns/op        0 B/op     0 allocs/op
-3. **Event Verification** - Slightly faster than competitors
+BenchmarkFilterMatch_Fiat-8         30670489     38.53 ns/op        0 B/op     0 allocs/op
-4. **Memory Efficiency** - Competitive or better in most operations
-## Recommendations
+BenchmarkFilterMatchComplex_NWIO-8  17972340     66.38 ns/op        0 B/op     0 allocs/op
+BenchmarkFilterMatchComplex_NBD-8   14769445     74.21 ns/op        0 B/op     0 allocs/op
+BenchmarkFilterMatchComplex_Fiat-8  12921300     92.83 ns/op        0 B/op     0 allocs/op
+```
-### For Relay Implementations
+## Should You Use NWIO?
- **NWIO excels**: Best filter matching performance
- All three are competitive for event parsing/verification
-### For Client Implementations
+**For learning/reading code:** Yes. Zero dependencies, everything is auditable.
- **NBD/Fiat preferred**: Much faster key generation and signing
- NWIO needs optimization in crypto operations
-### Overall Assessment
+**For a side project:** Maybe. 50ms to sign an event is noticeable but tolerable if you're not signing constantly.
- **NWIO**: Best for relay/filter-heavy workloads
- **NBD**: Most balanced, excellent crypto performance
- **Fiat**: Good all-around, lowest memory in some operations
-## Running Your Own Benchmarks
+**For anything serious:** No. Use NBD or Fiat. The crypto performance gap is too large.
-```bash
+## Why So Slow?
-# Run all benchmarks
-./run_benchmarks.sh
-# Compare specific operations
+The internal secp256k1 implementation uses Go's `math/big` for arbitrary-precision arithmetic. Every operation allocates, nothing is constant-time, and there's no assembly optimization. Production libraries like btcec use fixed-width limbs, precomputed tables, and stack-allocated values; C libraries such as libsecp256k1 also ship hand-tuned assembly.
-go test -bench=BenchmarkEventSign -benchmem comparison_bench_test.go
-go test -bench=BenchmarkFilterMatch -benchmem comparison_bench_test.go
-# Statistical analysis with benchstat
+This is the price of zero dependencies and readable code.
-go test -bench=. -count=10 comparison_bench_test.go > results.txt
-benchstat results.txt
-```
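
The slowdown column in the new summary table can be rechecked from the raw `ns/op` figures in the patch. The following is a minimal sketch, not part of the commit: the `slowdown` helper is a hypothetical name, and the numbers are copied from the benchmark output above.

```go
package main

import "fmt"

// slowdown returns how many times slower a is than b, given ns/op for each.
// (Hypothetical helper for illustration; not part of any of the libraries.)
func slowdown(aNsPerOp, bNsPerOp float64) float64 {
	return aNsPerOp / bNsPerOp
}

func main() {
	// ns/op values copied from the benchmark output in the diff above.
	rows := []struct {
		op              string
		nwio, nbd, fiat float64
	}{
		{"GenerateKey", 22596880, 2643, 49726},
		{"EventSign", 49364702, 122518, 121756},
		{"EventVerify", 54485034, 199061, 198714},
	}
	for _, r := range rows {
		fmt.Printf("%-12s %6.0fx vs NBD  %6.0fx vs Fiat\n",
			r.op, slowdown(r.nwio, r.nbd), slowdown(r.nwio, r.fiat))
	}
}
```

Run against these inputs, the signing and verification ratios come out nearly the same whichever library is the baseline. The key generation ratio depends heavily on it: roughly 8500x against NBD, but only about 450x against Fiat.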