# Rate Limiting

This package provides per-user rate limiting for gRPC endpoints using the token bucket algorithm.

## Overview

Rate limiting prevents abuse and ensures fair resource allocation across users. This implementation:

- **Per-user quotas**: Different limits for each authenticated pubkey
- **IP-based fallback**: Rate limit unauthenticated requests by IP address
- **Method-specific limits**: Different quotas for different operations (e.g., stricter limits for PublishEvent)
- **Token bucket algorithm**: Allows bursts while maintaining average rate
- **Standard gRPC errors**: Returns `ResourceExhausted` (HTTP 429) when limits exceeded

## How It Works

### Token Bucket Algorithm

Each user (identified by pubkey or IP) has a "bucket" of tokens:

1. **Tokens refill** at a configured rate (e.g., 10 requests/second)
2. **Each request consumes** one token
3. **Bursts allowed** up to bucket capacity (e.g., 20 tokens)
4. **Requests blocked** when bucket is empty

Example with 10 req/s limit and 20 token burst:

```
Time 0s: User makes 20 requests  → All succeed (burst)
Time 0s: User makes 21st request → Rejected (bucket empty)
Time 1s: Bucket refills by 10 tokens
Time 1s: User makes 10 requests  → All succeed
```
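For a concrete feel for this behavior, here is a minimal, self-contained sketch of the same bucket parameters using the standard `golang.org/x/time/rate` package, which implements the token bucket algorithm. It is shown purely for illustration and makes no claim about how this package is implemented internally:

```go
package main

import (
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

func main() {
	// A bucket that refills at 10 tokens/second with a capacity of 20.
	bucket := rate.NewLimiter(rate.Limit(10), 20)

	// The first 20 immediate requests drain the burst capacity and succeed.
	for i := 1; i <= 20; i++ {
		fmt.Printf("request %d allowed: %v\n", i, bucket.Allow())
	}

	// The 21st immediate request finds the bucket empty and is rejected.
	fmt.Println("request 21 allowed:", bucket.Allow())

	// One second later the bucket has refilled by ~10 tokens, so another
	// burst of 10 requests succeeds (matching the timeline above).
	time.Sleep(time.Second)
	for i := 22; i <= 31; i++ {
		fmt.Printf("request %d allowed: %v\n", i, bucket.Allow())
	}
}
```

Each `Allow()` call corresponds to one incoming request consuming one token.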
### Integration with Authentication

Rate limiting works seamlessly with the auth package:

1. **Authenticated users** (via NIP-98): Rate limited by pubkey
2. **Unauthenticated users**: Rate limited by IP address
3. **Auth interceptor runs first**, making the pubkey available to the rate limiter

## Usage

### Basic Setup

```go
import (
	"northwest.io/muxstr/internal/auth"
	"northwest.io/muxstr/internal/ratelimit"

	"google.golang.org/grpc"
)

// Configure rate limiter
limiter := ratelimit.New(&ratelimit.Config{
	// Default: 10 requests/second per user, burst of 20
	RequestsPerSecond: 10,
	BurstSize:         20,

	// Unauthenticated users: 5 requests/second per IP
	IPRequestsPerSecond: 5,
	IPBurstSize:         10,
})

// Create server with auth + rate limit interceptors
server := grpc.NewServer(
	grpc.ChainUnaryInterceptor(
		auth.NostrUnaryInterceptor(authOpts), // Auth runs first
		ratelimit.UnaryInterceptor(limiter),  // Rate limit runs second
	),
	grpc.ChainStreamInterceptor(
		auth.NostrStreamInterceptor(authOpts),
		ratelimit.StreamInterceptor(limiter),
	),
)
```

### Method-Specific Limits

Different operations can have different rate limits:

```go
limiter := ratelimit.New(&ratelimit.Config{
	// Default for all methods
	RequestsPerSecond: 10,
	BurstSize:         20,

	// Override for specific methods
	MethodLimits: map[string]ratelimit.MethodLimit{
		"/nostr.v1.NostrRelay/PublishEvent": {
			RequestsPerSecond: 2, // Stricter: only 2 publishes/sec
			BurstSize:         5,
		},
		"/nostr.v1.NostrRelay/Subscribe": {
			RequestsPerSecond: 1, // Only 1 new subscription/sec
			BurstSize:         3,
		},
		"/nostr.v1.NostrRelay/QueryEvents": {
			RequestsPerSecond: 20, // More lenient: 20 queries/sec
			BurstSize:         50,
		},
	},
})
```

### Per-User Custom Limits

Set different limits for specific users:

```go
limiter := ratelimit.New(&ratelimit.Config{
	RequestsPerSecond: 10,
	BurstSize:         20,

	// VIP users get higher limits
	UserLimits: map[string]ratelimit.UserLimit{
		"vip-pubkey-abc123": {
			RequestsPerSecond: 100,
			BurstSize:         200,
		},
		"premium-pubkey-def456": {
			RequestsPerSecond: 50,
			BurstSize:         100,
		},
	},
})
```

### Disable Rate Limiting for Specific Methods

```go
limiter := ratelimit.New(&ratelimit.Config{
	RequestsPerSecond: 10,
	BurstSize:         20,

	// Don't rate limit these methods
	SkipMethods: []string{
		"/grpc.health.v1.Health/Check",
	},
})
```

## Configuration Reference

### Config

- **`RequestsPerSecond`**: Default rate limit (tokens per second)
- **`BurstSize`**: Maximum burst size (bucket capacity)
- **`IPRequestsPerSecond`**: Rate limit for unauthenticated users (per IP)
- **`IPBurstSize`**: Burst size for IP-based limits
- **`MethodLimits`**: Map of method-specific overrides
- **`UserLimits`**: Map of per-user custom limits (by pubkey)
- **`SkipMethods`**: Methods that bypass rate limiting
- **`SkipUsers`**: Users (by pubkey) that bypass rate limiting
- **`CleanupInterval`**: How often to remove idle limiters (default: 5 minutes)

### MethodLimit

- **`RequestsPerSecond`**: Rate limit for this method
- **`BurstSize`**: Burst size for this method

### UserLimit

- **`RequestsPerSecond`**: Rate limit for this user
- **`BurstSize`**: Burst size for this user
- **`MethodLimits`**: Optional method overrides for this user

## Error Handling

When the rate limit is exceeded, the interceptor returns:

```
Code:    ResourceExhausted (HTTP 429)
Message: "rate limit exceeded for <pubkey or IP>"
```

Clients should implement exponential backoff:

```go
backoff := 100 * time.Millisecond // initial delay, doubled on each retry
for {
	resp, err := client.PublishEvent(ctx, req)
	if err != nil {
		if status.Code(err) == codes.ResourceExhausted {
			// Rate limited - wait and retry
			time.Sleep(backoff)
			backoff *= 2
			continue
		}
		return nil, err
	}
	return resp, nil
}
```

## Monitoring

The rate limiter tracks:

- **Active limiters**: Number of users being tracked
- **Requests allowed**: Total requests that passed
- **Requests denied**: Total requests that were rate limited

Access stats:

```go
stats := limiter.Stats()
fmt.Printf("Active users: %d\n", stats.ActiveLimiters)
fmt.Printf("Allowed: %d, Denied: %d\n", stats.Allowed, stats.Denied)
fmt.Printf("Denial rate: %.2f%%\n", stats.DenialRate())
```

## Performance Considerations

### Memory Usage

Each tracked user (pubkey or IP) consumes ~200 bytes. With 10,000 active users:

- Memory: ~2 MB
- Lookup: O(1) with sync.RWMutex

Idle limiters are cleaned up periodically (default: every 5 minutes).
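Internally, this kind of limiter amounts to a map from key (pubkey or IP) to a token bucket, plus a periodic sweep that evicts idle entries. The sketch below shows one plausible shape for that bookkeeping; the names (`keyedLimiter`, `entry`, `newKeyedLimiter`) are illustrative and are not this package's actual types, and it uses a single `sync.Mutex` instead of the `sync.RWMutex` fast path described above to keep the example short:

```go
package ratelimitsketch

import (
	"sync"
	"time"

	"golang.org/x/time/rate"
)

// entry pairs a token bucket with the time it was last used, so idle
// buckets can be dropped by the cleanup pass.
type entry struct {
	bucket   *rate.Limiter
	lastSeen time.Time
}

// keyedLimiter tracks one bucket per key (pubkey or IP address).
type keyedLimiter struct {
	mu      sync.Mutex
	entries map[string]*entry
	rps     rate.Limit
	burst   int
}

func newKeyedLimiter(rps rate.Limit, burst int) *keyedLimiter {
	return &keyedLimiter{
		entries: make(map[string]*entry),
		rps:     rps,
		burst:   burst,
	}
}

// allow fetches (or lazily creates) the bucket for key, marks it as
// recently used, and consumes one token.
func (k *keyedLimiter) allow(key string) bool {
	k.mu.Lock()
	e, ok := k.entries[key]
	if !ok {
		e = &entry{bucket: rate.NewLimiter(k.rps, k.burst)}
		k.entries[key] = e
	}
	e.lastSeen = time.Now()
	k.mu.Unlock()
	return e.bucket.Allow()
}

// cleanup removes buckets that have been idle longer than maxIdle; run it
// on a ticker at the configured cleanup interval.
func (k *keyedLimiter) cleanup(maxIdle time.Duration) {
	k.mu.Lock()
	defer k.mu.Unlock()
	for key, e := range k.entries {
		if time.Since(e.lastSeen) > maxIdle {
			delete(k.entries, key)
		}
	}
}
```

A version matching the "read lock for lookups, write lock for new users only" behavior would do a double-checked lookup under `RLock`/`Lock` and store the last-seen timestamp atomically; the mutex version above trades a little contention for simplicity.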
### Throughput

Rate limiting adds minimal overhead:

- Token check: ~100 nanoseconds
- Lock contention: Read lock for lookups, write lock for new users only

Benchmark results (on typical hardware):

```
BenchmarkRateLimitAllow-8    20000000    85 ns/op
BenchmarkRateLimitDeny-8     20000000    82 ns/op
```

### Distributed Deployments

This implementation is **in-memory** and works for single-instance deployments. For distributed deployments across multiple relay instances:

**Option 1: Accept per-instance limits** (simplest)
- Each instance tracks its own limits
- Users get N × limit if they connect to N different instances
- Usually acceptable for most use cases

**Option 2: Shared Redis backend** (future enhancement)
- Centralized rate limiting across all instances
- Requires Redis dependency
- Adds network latency (~1-2ms per request)

**Option 3: Sticky sessions** (via load balancer)
- Route users to the same instance
- Per-instance limits become per-user limits
- No coordination needed

## Example: Relay with Tiered Access

```go
// Free tier: 10 req/s, strict publish limits
// Premium tier: 50 req/s, relaxed limits
// Admin tier: No limits
func setupRateLimit() *ratelimit.Limiter {
	return ratelimit.New(&ratelimit.Config{
		// Free tier defaults
		RequestsPerSecond: 10,
		BurstSize:         20,

		MethodLimits: map[string]ratelimit.MethodLimit{
			"/nostr.v1.NostrRelay/PublishEvent": {
				RequestsPerSecond: 2,
				BurstSize:         5,
			},
		},

		// Premium users
		UserLimits: map[string]ratelimit.UserLimit{
			"premium-user-1": {
				RequestsPerSecond: 50,
				BurstSize:         100,
			},
		},

		// Admins bypass limits
		SkipMethods: []string{},
		SkipUsers: []string{
			"admin-pubkey-abc",
		},
	})
}
```

## Best Practices

1. **Set conservative defaults**: Start with low limits and increase based on usage
2. **Monitor denial rates**: High denial rates indicate limits are too strict
3. **Method-specific tuning**: Writes (PublishEvent) should be stricter than reads
4. **Burst allowance**: Set burst = 2-3× rate to handle legitimate traffic spikes
5. **IP-based limits**: Set lower than authenticated limits to encourage auth
6. **Cleanup interval**: Balance memory usage vs. repeated user setup overhead

## Security Considerations

### Rate Limit Bypass

Rate limiting can be bypassed by:

- Using multiple pubkeys (Sybil attack)
- Using multiple IPs (distributed attack)

Mitigations:

- Require proof-of-work for new pubkeys
- Monitor for suspicious patterns (many low-activity accounts)
- Implement global rate limits in addition to per-user limits

### DoS Protection

Rate limiting helps with DoS but isn't sufficient alone:

- Combine with connection limits
- Implement request size limits
- Use timeouts and deadlines
- Consider L3/L4 DDoS protection (CloudFlare, etc.)

## Integration with NIP-98 Auth

Rate limiting works naturally with authentication:

```
Request flow:
1. Request arrives
2. Auth interceptor validates NIP-98 event → extracts pubkey
3. Rate limit interceptor checks quota for pubkey
4. If allowed → handler processes request
5. If denied → return ResourceExhausted error
```

For unauthenticated requests:

```
1. Request arrives
2. Auth interceptor allows (if Required: false)
3. Rate limit interceptor uses IP address
4. Check quota for IP → likely stricter limits
```

This encourages users to authenticate to get better rate limits!
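To make the two flows above concrete, here is a sketch of how a rate-limit interceptor might choose its key: the pubkey when the auth interceptor has populated one, otherwise the client IP taken from gRPC peer info. The `pubkeyFromContext` parameter is a hypothetical stand-in for the auth package's context accessor, not a real API of this repository:

```go
package ratelimitsketch

import (
	"context"
	"net"

	"google.golang.org/grpc/peer"
)

// rateLimitKey picks the identity to rate limit: the authenticated pubkey
// when the auth interceptor has stored one, otherwise the client IP.
// pubkeyFromContext is a placeholder for whatever accessor the auth
// package exposes.
func rateLimitKey(ctx context.Context, pubkeyFromContext func(context.Context) (string, bool)) string {
	if pk, ok := pubkeyFromContext(ctx); ok && pk != "" {
		return "pubkey:" + pk
	}
	// Fall back to the peer's IP address for unauthenticated requests.
	if p, ok := peer.FromContext(ctx); ok {
		if host, _, err := net.SplitHostPort(p.Addr.String()); err == nil {
			return "ip:" + host
		}
		return "ip:" + p.Addr.String()
	}
	return "ip:unknown"
}
```

Prefixing the key (`pubkey:` / `ip:`) keeps authenticated and IP-based buckets in separate namespaces, so each can carry its own limit and burst size.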