# Rate Limiting

This package provides per-user rate limiting for gRPC endpoints using the token bucket algorithm.

## Overview

Rate limiting prevents abuse and ensures fair resource allocation across users. This implementation:

- **Per-user quotas**: Different limits for each authenticated pubkey
- **IP-based fallback**: Rate limit unauthenticated requests by IP address
- **Method-specific limits**: Different quotas for different operations (e.g., stricter limits for PublishEvent)
- **Token bucket algorithm**: Allows bursts while maintaining average rate
- **Standard gRPC errors**: Returns `ResourceExhausted` (HTTP 429) when limits are exceeded

## How It Works

### Token Bucket Algorithm

Each user (identified by pubkey or IP) has a "bucket" of tokens:

1. **Tokens refill** at a configured rate (e.g., 10 requests/second)
2. **Each request consumes** one token
3. **Bursts allowed** up to bucket capacity (e.g., 20 tokens)
4. **Requests blocked** when bucket is empty

Example with 10 req/s limit and 20 token burst:
```
Time 0s: User makes 20 requests → All succeed (burst)
Time 0s: User makes 21st request → Rejected (bucket empty)
Time 1s: Bucket refills by 10 tokens
Time 1s: User makes 10 requests → All succeed
```

### Integration with Authentication

Rate limiting works seamlessly with the auth package:

1. **Authenticated users** (via NIP-98): Rate limited by pubkey
2. **Unauthenticated users**: Rate limited by IP address
3. **Auth interceptor runs first**, making pubkey available to rate limiter

## Usage

### Basic Setup

```go
import (
    "northwest.io/muxstr/internal/auth"
    "northwest.io/muxstr/internal/ratelimit"
    "google.golang.org/grpc"
)

// Configure rate limiter
limiter := ratelimit.New(&ratelimit.Config{
    // Default: 10 requests/second per user, burst of 20
    RequestsPerSecond: 10,
    BurstSize:         20,

    // Unauthenticated users: 5 requests/second per IP
    IPRequestsPerSecond: 5,
    IPBurstSize:         10,
})

// Create server with auth + rate limit interceptors
server := grpc.NewServer(
    grpc.ChainUnaryInterceptor(
        auth.NostrUnaryInterceptor(authOpts),    // Auth runs first
        ratelimit.UnaryInterceptor(limiter),     // Rate limit runs second
    ),
    grpc.ChainStreamInterceptor(
        auth.NostrStreamInterceptor(authOpts),
        ratelimit.StreamInterceptor(limiter),
    ),
)
```

### Method-Specific Limits

Different operations can have different rate limits:

```go
limiter := ratelimit.New(&ratelimit.Config{
    // Default for all methods
    RequestsPerSecond: 10,
    BurstSize:         20,

    // Override for specific methods
    MethodLimits: map[string]ratelimit.MethodLimit{
        "/nostr.v1.NostrRelay/PublishEvent": {
            RequestsPerSecond: 2,   // Stricter: only 2 publishes/sec
            BurstSize:         5,
        },
        "/nostr.v1.NostrRelay/Subscribe": {
            RequestsPerSecond: 1,   // Only 1 new subscription/sec
            BurstSize:         3,
        },
        "/nostr.v1.NostrRelay/QueryEvents": {
            RequestsPerSecond: 20,  // More lenient: 20 queries/sec
            BurstSize:         50,
        },
    },
})
```

### Per-User Custom Limits

Set different limits for specific users:

```go
limiter := ratelimit.New(&ratelimit.Config{
    RequestsPerSecond: 10,
    BurstSize:         20,

    // VIP users get higher limits
    UserLimits: map[string]ratelimit.UserLimit{
        "vip-pubkey-abc123": {
            RequestsPerSecond: 100,
            BurstSize:         200,
        },
        "premium-pubkey-def456": {
            RequestsPerSecond: 50,
            BurstSize:         100,
        },
    },
})
```

### Disable Rate Limiting for Specific Methods

```go
limiter := ratelimit.New(&ratelimit.Config{
    RequestsPerSecond: 10,
    BurstSize:         20,

    // Don't rate limit these methods
    SkipMethods: []string{
        "/grpc.health.v1.Health/Check",
    },
})
```

## Configuration Reference

### Config

- **`RequestsPerSecond`**: Default rate limit (tokens per second)
- **`BurstSize`**: Maximum burst size (bucket capacity)
- **`IPRequestsPerSecond`**: Rate limit for unauthenticated users (per IP)
- **`IPBurstSize`**: Burst size for IP-based limits
- **`MethodLimits`**: Map of method-specific overrides
- **`UserLimits`**: Map of per-user custom limits (by pubkey)
- **`SkipMethods`**: Methods that bypass rate limiting
- **`SkipUsers`**: Pubkeys that bypass rate limiting entirely
- **`CleanupInterval`**: How often to remove idle limiters (default: 5 minutes)

### MethodLimit

- **`RequestsPerSecond`**: Rate limit for this method
- **`BurstSize`**: Burst size for this method

### UserLimit

- **`RequestsPerSecond`**: Rate limit for this user
- **`BurstSize`**: Burst size for this user
- **`MethodLimits`**: Optional method overrides for this user

## Error Handling

When a rate limit is exceeded, the interceptor returns:

```
Code: ResourceExhausted (HTTP 429)
Message: "rate limit exceeded for <pubkey/IP>"
```

Clients should implement exponential backoff:

```go
backoff := 100 * time.Millisecond // initial delay; consider capping it
for {
    resp, err := client.PublishEvent(ctx, req)
    if err == nil {
        return resp, nil
    }
    if status.Code(err) != codes.ResourceExhausted {
        return nil, err // not a rate-limit error: don't retry
    }
    // Rate limited: wait, then double the delay for the next attempt
    time.Sleep(backoff)
    backoff *= 2
}
```

## Monitoring

The rate limiter tracks:

- **Active limiters**: Number of users being tracked
- **Requests allowed**: Total requests that passed
- **Requests denied**: Total requests that were rate limited

Access stats:

```go
stats := limiter.Stats()
fmt.Printf("Active users: %d\n", stats.ActiveLimiters)
fmt.Printf("Allowed: %d, Denied: %d\n", stats.Allowed, stats.Denied)
fmt.Printf("Denial rate: %.2f%%\n", stats.DenialRate())
```

## Performance Considerations

### Memory Usage

Each tracked user (pubkey or IP) consumes ~200 bytes. With 10,000 active users:
- Memory: ~2 MB
- Lookup: O(1) with sync.RWMutex

Idle limiters are cleaned up periodically (default: every 5 minutes).

### Throughput

Rate limiting adds minimal overhead:
- Token check: ~100 nanoseconds
- Lock contention: Read lock for lookups, write lock for new users only

Benchmark results (on typical hardware):
```
BenchmarkRateLimitAllow-8    20000000    85 ns/op
BenchmarkRateLimitDeny-8     20000000    82 ns/op
```

### Distributed Deployments

This implementation is **in-memory** and works for single-instance deployments.

For distributed deployments across multiple relay instances:

**Option 1: Accept per-instance limits** (simplest)
- Each instance tracks its own limits
- Users get N × limit if they connect to N different instances
- Acceptable for most use cases

**Option 2: Shared Redis backend** (future enhancement)
- Centralized rate limiting across all instances
- Requires Redis dependency
- Adds network latency (~1-2ms per request)

**Option 3: Sticky sessions** (via load balancer)
- Route users to the same instance
- Per-instance limits become per-user limits
- No coordination needed

## Example: Relay with Tiered Access

```go
// Free tier: 10 req/s, strict publish limits
// Premium tier: 50 req/s, relaxed limits
// Admin tier: No limits

func setupRateLimit() *ratelimit.Limiter {
    return ratelimit.New(&ratelimit.Config{
        // Free tier defaults
        RequestsPerSecond: 10,
        BurstSize:         20,

        MethodLimits: map[string]ratelimit.MethodLimit{
            "/nostr.v1.NostrRelay/PublishEvent": {
                RequestsPerSecond: 2,
                BurstSize:         5,
            },
        },

        // Premium users
        UserLimits: map[string]ratelimit.UserLimit{
            "premium-user-1": {
                RequestsPerSecond: 50,
                BurstSize:         100,
            },
        },

        // Admins bypass limits
        SkipUsers: []string{
            "admin-pubkey-abc",
        },
    })
}
```

## Best Practices

1. **Set conservative defaults**: Start with low limits and increase based on usage
2. **Monitor denial rates**: High denial rates indicate limits are too strict
3. **Method-specific tuning**: Writes (PublishEvent) should be stricter than reads
4. **Burst allowance**: Set burst = 2-3× rate to handle legitimate traffic spikes
5. **IP-based limits**: Set lower than authenticated limits to encourage auth
6. **Cleanup interval**: Balance memory usage vs. repeated user setup overhead

## Security Considerations

### Rate Limit Bypass

Rate limiting can be bypassed by:
- Using multiple pubkeys (Sybil attack)
- Using multiple IPs (distributed attack)

Mitigations:
- Require proof-of-work for new pubkeys
- Monitor for suspicious patterns (many low-activity accounts)
- Implement global rate limits in addition to per-user limits

### DoS Protection

Rate limiting helps with DoS but isn't sufficient alone:
- Combine with connection limits
- Implement request size limits
- Use timeouts and deadlines
- Consider L3/L4 DDoS protection (Cloudflare, etc.)

## Integration with NIP-98 Auth

Rate limiting works naturally with authentication:

```
Request flow:
1. Request arrives
2. Auth interceptor validates NIP-98 event → extracts pubkey
3. Rate limit interceptor checks quota for pubkey
4. If allowed → handler processes request
5. If denied → return ResourceExhausted error
```

For unauthenticated requests:
```
1. Request arrives
2. Auth interceptor allows (if Required: false)
3. Rate limit interceptor uses IP address
4. Check quota for IP → likely stricter limits
```

This encourages users to authenticate to get better rate limits!