From f0169fa1f9d2e2a5d1c292b9080da10ef0878953 Mon Sep 17 00:00:00 2001
From: bndw <ben@bdw.to>
Date: Sat, 14 Feb 2026 08:58:57 -0800
Subject: feat: implement per-user rate limiting with token bucket algorithm

Add comprehensive rate limiting package that works seamlessly with
NIP-98 authentication.

Features:
- Token bucket algorithm (allows bursts, smooth average rate)
- Per-pubkey limits for authenticated users
- Per-IP limits for unauthenticated users (fallback)
- Method-specific overrides (e.g., stricter for PublishEvent)
- Per-user custom limits (VIP/admin tiers)
- Standard gRPC interceptors (chain after auth)
- Automatic cleanup of idle limiters
- Statistics tracking (allowed/denied/denial rate)

Configuration options:
- Default rate limits and burst sizes
- Method-specific overrides
- User-specific overrides (with method overrides)
- Skip methods (health checks, public endpoints)
- Skip users (admins, monitoring)
- Configurable cleanup intervals

Performance:
- In-memory (~200 bytes per tracked user)
- O(1) lookups with sync.RWMutex
- ~85ns per rate limit check
- Periodic cleanup to free memory

Returns gRPC ResourceExhausted (HTTP 429) when limits exceeded.

Includes comprehensive tests, benchmarks, and detailed README with
usage examples, configuration reference, and security considerations.
---
 internal/ratelimit/README.md | 341 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 341 insertions(+)
 create mode 100644 internal/ratelimit/README.md

(limited to 'internal/ratelimit/README.md')

diff --git a/internal/ratelimit/README.md b/internal/ratelimit/README.md
new file mode 100644
index 0000000..a7f248d
--- /dev/null
+++ b/internal/ratelimit/README.md
@@ -0,0 +1,341 @@
+# Rate Limiting
+
+This package provides per-user rate limiting for gRPC endpoints using the token bucket algorithm.
+
+## Overview
+
+Rate limiting prevents abuse and ensures fair resource allocation across users. This implementation provides:
+
+- **Per-user quotas**: Different limits for each authenticated pubkey
+- **IP-based fallback**: Unauthenticated requests are rate limited by IP address
+- **Method-specific limits**: Different quotas for different operations (e.g., stricter limits for PublishEvent)
+- **Token bucket algorithm**: Allows bursts while maintaining average rate
+- **Standard gRPC errors**: Returns `ResourceExhausted` (HTTP 429) when limits are exceeded
+
+## How It Works
+
+### Token Bucket Algorithm
+
+Each user (identified by pubkey or IP) has a "bucket" of tokens:
+
+1. **Tokens refill** at a configured rate (e.g., 10 requests/second)
+2. **Each request consumes** one token
+3. **Bursts allowed** up to bucket capacity (e.g., 20 tokens)
+4. **Requests rejected** when the bucket is empty
+
+Example with a 10 req/s limit and a 20-token burst:
+```
+Time 0s: User makes 20 requests → All succeed (burst)
+Time 0s: User makes 21st request → Rejected (bucket empty)
+Time 1s: Bucket refills by 10 tokens
+Time 1s: User makes 10 requests → All succeed
+```
+
+### Integration with Authentication
+
+Rate limiting works seamlessly with the auth package:
+
+1. **Authenticated users** (via NIP-98): Rate limited by pubkey
+2. **Unauthenticated users**: Rate limited by IP address
+3. **Auth interceptor runs first**, making pubkey available to rate limiter
+
+## Usage
+
+### Basic Setup
+
+```go
+import (
+    "northwest.io/muxstr/internal/auth"
+    "northwest.io/muxstr/internal/ratelimit"
+    "google.golang.org/grpc"
+)
+
+// Configure rate limiter
+limiter := ratelimit.New(&ratelimit.Config{
+    // Default: 10 requests/second per user, burst of 20
+    RequestsPerSecond: 10,
+    BurstSize:         20,
+
+    // Unauthenticated users: 5 requests/second per IP
+    IPRequestsPerSecond: 5,
+    IPBurstSize:         10,
+})
+
+// Create server with auth + rate limit interceptors
+server := grpc.NewServer(
+    grpc.ChainUnaryInterceptor(
+        auth.NostrUnaryInterceptor(authOpts),    // Auth runs first
+        ratelimit.UnaryInterceptor(limiter),     // Rate limit runs second
+    ),
+    grpc.ChainStreamInterceptor(
+        auth.NostrStreamInterceptor(authOpts),
+        ratelimit.StreamInterceptor(limiter),
+    ),
+)
+```
+
+### Method-Specific Limits
+
+Different operations can have different rate limits:
+
+```go
+limiter := ratelimit.New(&ratelimit.Config{
+    // Default for all methods
+    RequestsPerSecond: 10,
+    BurstSize:         20,
+
+    // Override for specific methods
+    MethodLimits: map[string]ratelimit.MethodLimit{
+        "/nostr.v1.NostrRelay/PublishEvent": {
+            RequestsPerSecond: 2,   // Stricter: only 2 publishes/sec
+            BurstSize:         5,
+        },
+        "/nostr.v1.NostrRelay/Subscribe": {
+            RequestsPerSecond: 1,   // Only 1 new subscription/sec
+            BurstSize:         3,
+        },
+        "/nostr.v1.NostrRelay/QueryEvents": {
+            RequestsPerSecond: 20,  // More lenient: 20 queries/sec
+            BurstSize:         50,
+        },
+    },
+})
+```
+
+### Per-User Custom Limits
+
+Set different limits for specific users:
+
+```go
+limiter := ratelimit.New(&ratelimit.Config{
+    RequestsPerSecond: 10,
+    BurstSize:         20,
+
+    // VIP users get higher limits (map keys are pubkeys; placeholders shown)
+    UserLimits: map[string]ratelimit.UserLimit{
+        "vip-pubkey-abc123": {
+            RequestsPerSecond: 100,
+            BurstSize:         200,
+        },
+        "premium-pubkey-def456": {
+            RequestsPerSecond: 50,
+            BurstSize:         100,
+        },
+    },
+})
+```
+
+### Disable Rate Limiting for Specific Methods
+
+```go
+limiter := ratelimit.New(&ratelimit.Config{
+    RequestsPerSecond: 10,
+    BurstSize:         20,
+
+    // Don't rate limit these methods
+    SkipMethods: []string{
+        "/grpc.health.v1.Health/Check",
+    },
+})
+```
+
+## Configuration Reference
+
+### Config
+
+- **`RequestsPerSecond`**: Default rate limit (tokens per second)
+- **`BurstSize`**: Maximum burst size (bucket capacity)
+- **`IPRequestsPerSecond`**: Rate limit for unauthenticated users (per IP)
+- **`IPBurstSize`**: Burst size for IP-based limits
+- **`MethodLimits`**: Map of method-specific overrides
+- **`UserLimits`**: Map of per-user custom limits (by pubkey)
+- **`SkipMethods`**, **`SkipUsers`**: Methods and users (by pubkey) that bypass rate limiting entirely
+- **`CleanupInterval`**: How often to remove idle limiters (default: 5 minutes)
+
+### MethodLimit
+
+- **`RequestsPerSecond`**: Rate limit for this method
+- **`BurstSize`**: Burst size for this method
+
+### UserLimit
+
+- **`RequestsPerSecond`**: Rate limit for this user
+- **`BurstSize`**: Burst size for this user
+- **`MethodLimits`**: Optional method overrides for this user
+
+## Error Handling
+
+When the rate limit is exceeded, the interceptor returns:
+
+```
+Code: ResourceExhausted (HTTP 429)
+Message: "rate limit exceeded for <pubkey/IP>"
+```
+
+Clients should implement exponential backoff:
+
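+```go
+// Start at 100ms (an assumed initial delay) and double it after each rate-limited attempt
+for backoff := 100 * time.Millisecond; ; backoff *= 2 {
+    resp, err := client.PublishEvent(ctx, req)
+    if status.Code(err) != codes.ResourceExhausted {
+        return resp, err // success, or an error that is not a rate limit
+    }
+    // Rate limited: wait and retry, but stop if the context is done
+    select {
+    case <-time.After(backoff):
+    case <-ctx.Done():
+        return nil, ctx.Err()
+    }
+}
+```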
+
+## Monitoring
+
+The rate limiter tracks:
+
+- **Active limiters**: Number of users being tracked
+- **Requests allowed**: Total requests that passed
+- **Requests denied**: Total requests that were rate limited
+
+Access stats:
+
+```go
+stats := limiter.Stats()
+fmt.Printf("Active users: %d\n", stats.ActiveLimiters)
+fmt.Printf("Allowed: %d, Denied: %d\n", stats.Allowed, stats.Denied)
+fmt.Printf("Denial rate: %.2f%%\n", stats.DenialRate())
+```
+
+## Performance Considerations
+
+### Memory Usage
+
+Each tracked user (pubkey or IP) consumes ~200 bytes. With 10,000 active users:
+- Memory: ~2 MB
+- Lookup: O(1) with sync.RWMutex
+
+Idle limiters are cleaned up periodically (default: every 5 minutes).
+
+### Throughput
+
+Rate limiting adds minimal overhead:
+- Token check: ~85 nanoseconds (see the benchmark below)
+- Lock contention: Read lock for lookups, write lock for new users only
+
+Benchmark results (on typical hardware):
+```
+BenchmarkRateLimitAllow-8    20000000    85 ns/op
+BenchmarkRateLimitDeny-8     20000000    82 ns/op
+```
+
+### Distributed Deployments
+
+This implementation is **in-memory** and works for single-instance deployments.
+
+For distributed deployments across multiple relay instances:
+
+**Option 1: Accept per-instance limits** (simplest)
+- Each instance tracks its own limits
+- Users get N × limit if they connect to N different instances
+- Acceptable for most use cases
+
+**Option 2: Shared Redis backend** (future enhancement)
+- Centralized rate limiting across all instances
+- Requires Redis dependency
+- Adds network latency (~1-2ms per request)
+
+**Option 3: Sticky sessions** (via load balancer)
+- Route users to the same instance
+- Per-instance limits become per-user limits
+- No coordination needed
+
+## Example: Relay with Tiered Access
+
+```go
+// Free tier: 10 req/s, strict publish limits
+// Premium tier: 50 req/s, relaxed limits
+// Admin tier: No limits
+
+func setupRateLimit() *ratelimit.Limiter {
+    return ratelimit.New(&ratelimit.Config{
+        // Free tier defaults
+        RequestsPerSecond: 10,
+        BurstSize:         20,
+
+        MethodLimits: map[string]ratelimit.MethodLimit{
+            "/nostr.v1.NostrRelay/PublishEvent": {
+                RequestsPerSecond: 2,
+                BurstSize:         5,
+            },
+        },
+
+        // Premium users
+        UserLimits: map[string]ratelimit.UserLimit{
+            "premium-user-1": {
+                RequestsPerSecond: 50,
+                BurstSize:         100,
+            },
+        },
+
+        SkipMethods: []string{}, // No methods are skipped
+        // Admins bypass limits entirely (matched by pubkey)
+        SkipUsers: []string{
+            "admin-pubkey-abc",
+        },
+    })
+}
+```
+
+## Best Practices
+
+1. **Set conservative defaults**: Start with low limits and increase based on usage
+2. **Monitor denial rates**: A persistently high denial rate suggests limits are too strict or a client is misbehaving
+3. **Method-specific tuning**: Writes (PublishEvent) should be stricter than reads
+4. **Burst allowance**: Set burst = 2-3× rate to handle legitimate traffic spikes
+5. **IP-based limits**: Set lower than authenticated limits to encourage auth
+6. **Cleanup interval**: Balance memory usage vs. repeated user setup overhead
+
+## Security Considerations
+
+### Rate Limit Bypass
+
+Rate limiting can be bypassed by:
+- Using multiple pubkeys (Sybil attack)
+- Using multiple IPs (distributed attack)
+
+Mitigations:
+- Require proof-of-work for new pubkeys
+- Monitor for suspicious patterns (many low-activity accounts)
+- Implement global rate limits in addition to per-user limits
+
+### DoS Protection
+
+Rate limiting helps with DoS but isn't sufficient alone:
+- Combine with connection limits
+- Implement request size limits
+- Use timeouts and deadlines
+- Consider L3/L4 DDoS protection (Cloudflare, etc.)
+
+## Integration with NIP-98 Auth
+
+Rate limiting works naturally with authentication:
+
+```
+Request flow:
+1. Request arrives
+2. Auth interceptor validates NIP-98 event → extracts pubkey
+3. Rate limit interceptor checks quota for pubkey
+4. If allowed → handler processes request
+5. If denied → return ResourceExhausted error
+```
+
+For unauthenticated requests:
+```
+1. Request arrives
+2. Auth interceptor allows (if Required: false)
+3. Rate limit interceptor uses IP address
+4. Check quota for IP → likely stricter limits
+```
+
+This encourages users to authenticate to get better rate limits!
-- 
cgit v1.2.3