From 97e6a3cfb67d18273fa88ce086b3db514c7e3083 Mon Sep 17 00:00:00 2001
From: Clawd
Date: Fri, 20 Feb 2026 19:59:03 -0800
Subject: benchmarks: update for v0.3.0 zero-dep implementation

The crypto is brutally slow (~400x for signing, ~8500x for key gen).
Non-crypto operations remain competitive. Updated summary to be honest
about the tradeoffs.
---
 benchmarks/BENCHMARK_SUMMARY.md  | 182 ++++++++++++---------------------------
 benchmarks/benchmark_results.txt |  59 ++++++-------
 benchmarks/comparison/go.mod     |   4 +-
 benchmarks/comparison/go.sum     |   2 +
 4 files changed, 89 insertions(+), 158 deletions(-)

(limited to 'benchmarks')

diff --git a/benchmarks/BENCHMARK_SUMMARY.md b/benchmarks/BENCHMARK_SUMMARY.md
index 2e087a3..aa0d771 100644
--- a/benchmarks/BENCHMARK_SUMMARY.md
+++ b/benchmarks/BENCHMARK_SUMMARY.md
@@ -1,154 +1,84 @@
 # Benchmark Results Summary
 
-Comparison of three Go Nostr libraries: **NWIO** (code.northwest.io/nostr), **NBD** (github.com/nbd-wtf/go-nostr), and **Fiat** (fiatjaf.com/nostr)
+Comparison of three Go Nostr libraries: **NWIO** (code.northwest.io/nostr v0.3.0), **NBD** (github.com/nbd-wtf/go-nostr), and **Fiat** (fiatjaf.com/nostr)
 
-## Quick Performance Overview
+## The Honest Truth
 
-### 🏆 Winners by Category
+NWIO v0.3.0 uses a pure-Go secp256k1 implementation with zero external dependencies. This makes the crypto **dramatically slower** than libraries using btcec:
 
-| Operation | Winner | Performance |
-|-----------|--------|-------------|
-| **Event Unmarshal** | NWIO/Fiat | ~2.5 µs (tied) |
-| **Event Marshal** | NWIO | 1.79 µs (fastest, least memory) |
-| **Event Serialize** | NBD | 129 ns (3x faster than NWIO) |
-| **Compute ID** | Fiat | 276 ns (2x faster than NWIO) |
-| **Generate Key** | NBD | 470 ns (80x faster!) |
-| **Event Sign** | NBD/Fiat | ~59 µs (2x faster than NWIO) |
-| **Event Verify** | NWIO | 99.7 µs (slightly faster) |
-| **Filter Match** | NWIO | 7.1 ns (2x faster than Fiat) |
-| **Filter Complex** | NWIO | 30.9 ns (fastest) |
+| Operation | NWIO | NBD | Fiat | NWIO Slowdown |
+|-----------|------|-----|------|---------------|
+| **Key Gen** | 22.6 ms | 2.6 µs | 49.7 µs | ~8500x slower |
+| **Sign** | 49.4 ms | 122 µs | 121 µs | ~400x slower |
+| **Verify** | 54.5 ms | 199 µs | 198 µs | ~274x slower |
 
-## Detailed Results
-
-### Event Unmarshaling (JSON → Event)
-```
-NWIO: 2,541 ns/op 888 B/op 17 allocs/op ⭐ FASTEST, LOW MEMORY
-NBD: 2,832 ns/op 944 B/op 13 allocs/op
-Fiat: 2,545 ns/op 752 B/op 10 allocs/op ⭐ LEAST MEMORY
-```
-**Analysis**: All three are very competitive. NWIO and Fiat are effectively tied. Fiat uses least memory.
-
-### Event Marshaling (Event → JSON)
-```
-NWIO: 1,790 ns/op 1,010 B/op 3 allocs/op ⭐ FASTEST, LEAST ALLOCS
-NBD: 1,819 ns/op 1,500 B/op 6 allocs/op
-Fiat: 1,971 ns/op 2,254 B/op 13 allocs/op
-```
-**Analysis**: NWIO is fastest with minimal allocations. Significant memory advantage over competitors.
-
-### Event Serialization (for ID computation)
-```
-NWIO: 391 ns/op 360 B/op 7 allocs/op
-NBD: 129 ns/op 208 B/op 2 allocs/op ⭐ FASTEST, 3x faster
-Fiat: 161 ns/op 400 B/op 3 allocs/op
-```
-**Analysis**: NBD dominates here with optimized serialization. NWIO has room for improvement.
+That's not a typo. The pure `big.Int` implementation is brutal.
 
-### Event ID Computation
-```
-NWIO: 608 ns/op 488 B/op 9 allocs/op
-NBD: 302 ns/op 336 B/op 4 allocs/op
-Fiat: 276 ns/op 400 B/op 3 allocs/op ⭐ FASTEST
-```
-**Analysis**: NBD and Fiat are 2x faster. NWIO should optimize ID computation path.
+## Where NWIO Wins
 
-### Key Generation
-```
-NWIO: 37,689 ns/op 208 B/op 4 allocs/op
-NBD: 470 ns/op 369 B/op 8 allocs/op ⭐ FASTEST, 80x faster!
-Fiat: 25,375 ns/op 272 B/op 5 allocs/op
-```
-**Analysis**: ⚠️ NWIO is significantly slower. NBD appears to use a different key generation strategy. This is the biggest performance gap.
+Non-crypto operations are competitive or fastest:
 
-### Event Signing
-```
-NWIO: 129,854 ns/op 2,363 B/op 42 allocs/op
-NBD: 59,069 ns/op 2,112 B/op 35 allocs/op ⭐ TIED FASTEST
-Fiat: 58,572 ns/op 1,760 B/op 29 allocs/op ⭐ LEAST MEMORY
-```
-**Analysis**: NBD and Fiat are 2x faster. NWIO has more allocations in signing path.
+| Operation | NWIO | NBD | Fiat | Winner |
+|-----------|------|-----|------|--------|
+| **Event Marshal** | 8.9 µs | 12.3 µs | 13.0 µs | NWIO ⭐ |
+| **Event Unmarshal** | 10.4 µs | 11.0 µs | 8.5 µs | Fiat |
+| **Filter Match** | 14.3 ns | 22.3 ns | 38.5 ns | NWIO ⭐ |
+| **Filter Complex** | 66.4 ns | 74.2 ns | 92.8 ns | NWIO ⭐ |
 
-### Event Verification
-```
-NWIO: 99,744 ns/op 953 B/op 19 allocs/op ⭐ FASTEST
-NBD: 105,995 ns/op 624 B/op 11 allocs/op ⭐ LEAST MEMORY
-Fiat: 103,744 ns/op 640 B/op 9 allocs/op
-```
-**Analysis**: NWIO is slightly faster (6% faster than others). Very competitive across all three.
+## Detailed Results
 
-### Filter Matching (Simple)
-```
-NWIO: 7.1 ns/op 0 B/op 0 allocs/op ⭐ FASTEST, 2x faster
-NBD: 10.8 ns/op 0 B/op 0 allocs/op
-Fiat: 16.4 ns/op 0 B/op 0 allocs/op
 ```
-**Analysis**: NWIO excels at filter matching! Zero allocations across all libraries.
+goos: linux
+goarch: amd64
+cpu: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
 
-### Filter Matching (Complex with Tags)
-```
-NWIO: 30.9 ns/op 0 B/op 0 allocs/op ⭐ FASTEST
-NBD: 33.4 ns/op 0 B/op 0 allocs/op
-Fiat: 42.6 ns/op 0 B/op 0 allocs/op
-```
-**Analysis**: NWIO maintains lead in complex filters. Important for relay implementations.
+BenchmarkEventUnmarshal_NWIO-8 163561 10400 ns/op 888 B/op 17 allocs/op
+BenchmarkEventUnmarshal_NBD-8 94100 10985 ns/op 944 B/op 13 allocs/op
+BenchmarkEventUnmarshal_Fiat-8 128083 8498 ns/op 752 B/op 10 allocs/op
 
-## Optimization Opportunities for NWIO
+BenchmarkEventMarshal_NWIO-8 136460 8857 ns/op 1009 B/op 3 allocs/op
+BenchmarkEventMarshal_NBD-8 87494 12326 ns/op 1497 B/op 6 allocs/op
+BenchmarkEventMarshal_Fiat-8 203302 13049 ns/op 2250 B/op 13 allocs/op
 
-### High Priority 🔴
-1. **Key Generation** - 80x slower than NBD
-   - Current: 37.7 µs
-   - Target: ~500 ns (similar to NBD)
-   - Impact: Critical for client applications
+BenchmarkEventSerialize_NWIO-8 451017 2967 ns/op 360 B/op 7 allocs/op
+BenchmarkEventSerialize_NBD-8 1265192 1066 ns/op 208 B/op 2 allocs/op
+BenchmarkEventSerialize_Fiat-8 907284 1378 ns/op 400 B/op 3 allocs/op
 
-2. **Event Signing** - 2x slower than competitors
-   - Current: 130 µs
-   - Target: ~60 µs (match NBD/Fiat)
-   - Impact: High for client applications
+BenchmarkComputeID_NWIO-8 250470 4425 ns/op 488 B/op 9 allocs/op
+BenchmarkComputeID_NBD-8 391370 3062 ns/op 336 B/op 4 allocs/op
+BenchmarkComputeID_Fiat-8 339658 3677 ns/op 400 B/op 3 allocs/op
 
-### Medium Priority 🟡
-3. **Event Serialization** - 3x slower than NBD
-   - Current: 391 ns
-   - Target: ~130 ns (match NBD)
-   - Impact: Used in ID computation
+BenchmarkGenerateKey_NWIO-8 46 22596880 ns/op 1682613 B/op 27259 allocs/op
+BenchmarkGenerateKey_NBD-8 857683 2643 ns/op 368 B/op 8 allocs/op
+BenchmarkGenerateKey_Fiat-8 23874 49726 ns/op 272 B/op 5 allocs/op
 
-4. **ID Computation** - 2x slower than competitors
-   - Current: 608 ns
-   - Target: ~280 ns (match Fiat)
-   - Impact: Affects every event processing
+BenchmarkEventSign_NWIO-8 38 49364702 ns/op 3403147 B/op 55099 allocs/op
+BenchmarkEventSign_NBD-8 9332 122518 ns/op 2112 B/op 35 allocs/op
+BenchmarkEventSign_Fiat-8 8274 121756 ns/op 1760 B/op 29 allocs/op
 
-## Current Strengths of NWIO ✅
+BenchmarkEventVerify_NWIO-8 26 54485034 ns/op 3310792 B/op 53635 allocs/op
+BenchmarkEventVerify_NBD-8 5815 199061 ns/op 624 B/op 11 allocs/op
+BenchmarkEventVerify_Fiat-8 5856 198714 ns/op 640 B/op 9 allocs/op
 
-1. **Filter Matching** - 2x faster than Fiat, fastest overall
-2. **Event Marshaling** - Fastest with minimal allocations
-3. **Event Verification** - Slightly faster than competitors
-4. **Memory Efficiency** - Competitive or better in most operations
+BenchmarkFilterMatch_NWIO-8 81765290 14.34 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatch_NBD-8 53242167 22.26 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatch_Fiat-8 30670489 38.53 ns/op 0 B/op 0 allocs/op
 
-## Recommendations
+BenchmarkFilterMatchComplex_NWIO-8 17972340 66.38 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatchComplex_NBD-8 14769445 74.21 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatchComplex_Fiat-8 12921300 92.83 ns/op 0 B/op 0 allocs/op
+```
 
-### For Relay Implementations
-- **NWIO excels**: Best filter matching performance
-- All three are competitive for event parsing/verification
+## Should You Use NWIO?
 
-### For Client Implementations
-- **NBD/Fiat preferred**: Much faster key generation and signing
-- NWIO needs optimization in crypto operations
+**For learning/reading code:** Yes. Zero dependencies, everything is auditable.
 
-### Overall Assessment
-- **NWIO**: Best for relay/filter-heavy workloads
-- **NBD**: Most balanced, excellent crypto performance
-- **Fiat**: Good all-around, lowest memory in some operations
+**For a side project:** Maybe. 50ms to sign an event is noticeable but tolerable if you're not signing constantly.
 
-## Running Your Own Benchmarks
+**For anything serious:** No. Use NBD or Fiat. The crypto performance gap is too large.
 
-```bash
-# Run all benchmarks
-./run_benchmarks.sh
+## Why So Slow?
 
-# Compare specific operations
-go test -bench=BenchmarkEventSign -benchmem comparison_bench_test.go
-go test -bench=BenchmarkFilterMatch -benchmem comparison_bench_test.go
+The internal secp256k1 implementation uses Go's `math/big` for arbitrary-precision arithmetic. Every operation allocates, nothing is constant-time, and there's no assembly optimization. Production libraries like btcec use fixed-width limbs, stack allocation, and hand-tuned assembly.
 
-# Statistical analysis with benchstat
-go test -bench=. -count=10 comparison_bench_test.go > results.txt
-benchstat results.txt
-```
+This is the price of zero dependencies and readable code.
diff --git a/benchmarks/benchmark_results.txt b/benchmarks/benchmark_results.txt
index c3976e6..2228c25 100644
--- a/benchmarks/benchmark_results.txt
+++ b/benchmarks/benchmark_results.txt
@@ -1,32 +1,33 @@
 goos: linux
 goarch: amd64
-cpu: AMD Ryzen AI 9 HX PRO 370 w/ Radeon 890M
-BenchmarkEventUnmarshal_NWIO-24 498826 2541 ns/op 888 B/op 17 allocs/op
-BenchmarkEventUnmarshal_NBD-24 423019 2832 ns/op 944 B/op 13 allocs/op
-BenchmarkEventUnmarshal_Fiat-24 430042 2545 ns/op 752 B/op 10 allocs/op
-BenchmarkEventMarshal_NWIO-24 613165 1790 ns/op 1010 B/op 3 allocs/op
-BenchmarkEventMarshal_NBD-24 620986 1819 ns/op 1500 B/op 6 allocs/op
-BenchmarkEventMarshal_Fiat-24 621964 1971 ns/op 2254 B/op 13 allocs/op
-BenchmarkEventSerialize_NWIO-24 3059661 391.0 ns/op 360 B/op 7 allocs/op
-BenchmarkEventSerialize_NBD-24 8824029 128.8 ns/op 208 B/op 2 allocs/op
-BenchmarkEventSerialize_Fiat-24 6533536 160.9 ns/op 400 B/op 3 allocs/op
-BenchmarkComputeID_NWIO-24 2108437 608.0 ns/op 488 B/op 9 allocs/op
-BenchmarkComputeID_NBD-24 4072243 302.2 ns/op 336 B/op 4 allocs/op
-BenchmarkComputeID_Fiat-24 4421660 275.9 ns/op 400 B/op 3 allocs/op
-BenchmarkGenerateKey_NWIO-24 31942 37689 ns/op 208 B/op 4 allocs/op
-BenchmarkGenerateKey_NBD-24 2489169 469.6 ns/op 369 B/op 8 allocs/op
-BenchmarkGenerateKey_Fiat-24 45475 25375 ns/op 272 B/op 5 allocs/op
-BenchmarkEventSign_NWIO-24 9072 129854 ns/op 2363 B/op 42 allocs/op
-BenchmarkEventSign_NBD-24 20325 59069 ns/op 2112 B/op 35 allocs/op
-BenchmarkEventSign_Fiat-24 20613 58572 ns/op 1760 B/op 29 allocs/op
-BenchmarkEventVerify_NWIO-24 12009 99744 ns/op 953 B/op 19 allocs/op
-BenchmarkEventVerify_NBD-24 10000 105995 ns/op 624 B/op 11 allocs/op
-BenchmarkEventVerify_Fiat-24 10000 103744 ns/op 640 B/op 9 allocs/op
-BenchmarkFilterMatch_NWIO-24 167376669 7.091 ns/op 0 B/op 0 allocs/op
-BenchmarkFilterMatch_NBD-24 100000000 10.82 ns/op 0 B/op 0 allocs/op
-BenchmarkFilterMatch_Fiat-24 71761591 16.40 ns/op 0 B/op 0 allocs/op
-BenchmarkFilterMatchComplex_NWIO-24 39214178 30.88 ns/op 0 B/op 0 allocs/op
-BenchmarkFilterMatchComplex_NBD-24 35580048 33.40 ns/op 0 B/op 0 allocs/op
-BenchmarkFilterMatchComplex_Fiat-24 28026481 42.64 ns/op 0 B/op 0 allocs/op
+pkg: code.northwest.io/nostr/benchmarks/comparison
+cpu: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
+BenchmarkEventUnmarshal_NWIO-8 163561 10400 ns/op 888 B/op 17 allocs/op
+BenchmarkEventUnmarshal_NBD-8 94100 10985 ns/op 944 B/op 13 allocs/op
+BenchmarkEventUnmarshal_Fiat-8 128083 8498 ns/op 752 B/op 10 allocs/op
+BenchmarkEventMarshal_NWIO-8 136460 8857 ns/op 1009 B/op 3 allocs/op
+BenchmarkEventMarshal_NBD-8 87494 12326 ns/op 1497 B/op 6 allocs/op
+BenchmarkEventMarshal_Fiat-8 203302 13049 ns/op 2250 B/op 13 allocs/op
+BenchmarkEventSerialize_NWIO-8 451017 2967 ns/op 360 B/op 7 allocs/op
+BenchmarkEventSerialize_NBD-8 1265192 1066 ns/op 208 B/op 2 allocs/op
+BenchmarkEventSerialize_Fiat-8 907284 1378 ns/op 400 B/op 3 allocs/op
+BenchmarkComputeID_NWIO-8 250470 4425 ns/op 488 B/op 9 allocs/op
+BenchmarkComputeID_NBD-8 391370 3062 ns/op 336 B/op 4 allocs/op
+BenchmarkComputeID_Fiat-8 339658 3677 ns/op 400 B/op 3 allocs/op
+BenchmarkGenerateKey_NWIO-8 46 22596880 ns/op 1682613 B/op 27259 allocs/op
+BenchmarkGenerateKey_NBD-8 857683 2643 ns/op 368 B/op 8 allocs/op
+BenchmarkGenerateKey_Fiat-8 23874 49726 ns/op 272 B/op 5 allocs/op
+BenchmarkEventSign_NWIO-8 38 49364702 ns/op 3403147 B/op 55099 allocs/op
+BenchmarkEventSign_NBD-8 9332 122518 ns/op 2112 B/op 35 allocs/op
+BenchmarkEventSign_Fiat-8 8274 121756 ns/op 1760 B/op 29 allocs/op
+BenchmarkEventVerify_NWIO-8 26 54485034 ns/op 3310792 B/op 53635 allocs/op
+BenchmarkEventVerify_NBD-8 5815 199061 ns/op 624 B/op 11 allocs/op
+BenchmarkEventVerify_Fiat-8 5856 198714 ns/op 640 B/op 9 allocs/op
+BenchmarkFilterMatch_NWIO-8 81765290 14.34 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatch_NBD-8 53242167 22.26 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatch_Fiat-8 30670489 38.53 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatchComplex_NWIO-8 17972340 66.38 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatchComplex_NBD-8 14769445 74.21 ns/op 0 B/op 0 allocs/op
+BenchmarkFilterMatchComplex_Fiat-8 12921300 92.83 ns/op 0 B/op 0 allocs/op
 PASS
-ok command-line-arguments 40.651s
+ok code.northwest.io/nostr/benchmarks/comparison 44.108s
diff --git a/benchmarks/comparison/go.mod b/benchmarks/comparison/go.mod
index f76c374..a9ef4c1 100644
--- a/benchmarks/comparison/go.mod
+++ b/benchmarks/comparison/go.mod
@@ -3,7 +3,7 @@ module code.northwest.io/nostr/benchmarks/comparison
 go 1.25
 
 require (
-	code.northwest.io/nostr v0.0.0
+	code.northwest.io/nostr v0.3.0
 	fiatjaf.com/nostr v0.0.0-20260211144128-7a4b71b39b12
 	github.com/nbd-wtf/go-nostr v0.52.3
 )
@@ -35,5 +35,3 @@ require (
 	golang.org/x/exp v0.0.0-20250305212735-054e65f0b394 // indirect
 	golang.org/x/sys v0.35.0 // indirect
 )
-
-replace code.northwest.io/nostr => ../..
diff --git a/benchmarks/comparison/go.sum b/benchmarks/comparison/go.sum
index fb3bb78..d3c35d7 100644
--- a/benchmarks/comparison/go.sum
+++ b/benchmarks/comparison/go.sum
@@ -1,3 +1,5 @@
+code.northwest.io/nostr v0.3.0 h1:axs19m/gAVhElK/WXB3wOsK/lnXSSp7rKc8SoJ2V9uQ=
+code.northwest.io/nostr v0.3.0/go.mod h1:lQ14gtU7q/YAehyprzuiwtH02KnmZOy7ZBkKrTVSfm4=
 fiatjaf.com/nostr v0.0.0-20260211144128-7a4b71b39b12 h1:lNVaw/O5ThXVzO0Pz7D+b9fys/OaVaDG3C10kCJQFvg=
 fiatjaf.com/nostr v0.0.0-20260211144128-7a4b71b39b12/go.mod h1:ue7yw0zHfZj23Ml2kVSdBx0ENEaZiuvGxs/8VEN93FU=
 github.com/ImVexed/fasturl v0.0.0-20230304231329-4e41488060f3 h1:ClzzXMDDuUbWfNNZqGeYq4PnYOlwlOVIvSyNaIy0ykg=
-- 
cgit v1.2.3