# Axon Protocol — High Level Design

> A Nostr-inspired event relay protocol for AI agent infrastructure. Retains the core architectural insight — signed events, relay as message bus, filtered subscriptions — while cleaning up the crypto, encoding, and type system.

---

## Core Insight

The relay is **Kafka at the edge, plus identity**. It is a log, not a database. It routes signed events between clients and stores them for replay. It is structurally incapable of understanding content it was not designed to index.

---

## Architecture

```
[client A] ──publish──▶ [relay] ──fanout──▶ [client B]
                           │
                        [index] ← id, pubkey, kind, created_at, tags
                        [store] ← raw msgpack bytes (opaque)
```

Consumers (agents, indexers, report jobs) subscribe to the relay and maintain their own materialized views. The relay never aggregates, summarizes, or transforms content. Derived data is always downstream.

---

## Event Structure

```
Event {
    id         bytes   // 32 bytes, SHA256 of canonical signing payload
    pubkey     bytes   // 32 bytes, Ed25519 public key
    created_at int64   // unix timestamp
    kind       uint16  // see Event Kinds registry
    content    bytes   // opaque to the relay; msgpack bin type, no UTF-8 assumption
    sig        bytes   // 64 bytes, Ed25519 signature over id
    tags       []Tag
}

Tag {
    name   string
    values []string
}
```

### Signing

The event ID is the `SHA256` of a canonical byte payload. All integers are big-endian. All strings are UTF-8. `||` denotes concatenation.

```
id  = SHA256(canonical_payload)
sig = ed25519.Sign(privkey, id)
```

**canonical_payload:**

| Field | Encoding |
|---|---|
| pubkey | `uint16(32)` \|\| 32 bytes |
| created_at | `uint64` (big-endian two's-complement of the int64 value) |
| kind | `uint16` |
| content | `uint32(len)` \|\| raw bytes |
| tags | see below |

**canonical_tags:**

Tags are sorted by `name` lexicographically (byte order). For ties on `name`, sort by first value lexicographically. Two tags sharing the same `name` and the same first value are a **protocol error** — the relay must reject the event with `400`.
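A minimal Go sketch of the sort-and-reject rule (names are illustrative; how to key a tag with zero values is not specified above, so treating its first value as the empty string is an assumption):

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Tag mirrors the protocol's Tag structure.
type Tag struct {
	Name   string
	Values []string
}

// first returns the tag's first value, or "" for a value-less tag
// (an assumption — the spec does not address zero-value tags).
func first(t Tag) string {
	if len(t.Values) == 0 {
		return ""
	}
	return t.Values[0]
}

// canonicalSort orders tags by name (byte order), breaking ties on the
// first value, and rejects duplicates on the (name, first value) key.
func canonicalSort(tags []Tag) ([]Tag, error) {
	out := make([]Tag, len(tags))
	copy(out, tags)
	sort.Slice(out, func(i, j int) bool {
		if out[i].Name != out[j].Name {
			return out[i].Name < out[j].Name
		}
		return first(out[i]) < first(out[j])
	})
	for i := 1; i < len(out); i++ {
		if out[i].Name == out[i-1].Name && first(out[i]) == first(out[i-1]) {
			return nil, errors.New("duplicate (name, first value) tag: protocol error 400")
		}
	}
	return out, nil
}

func main() {
	sorted, err := canonicalSort([]Tag{
		{Name: "p", Values: []string{"abc"}},
		{Name: "e", Values: []string{"def", "root"}},
	})
	fmt.Println(sorted, err) // "e" sorts before "p"

	_, err = canonicalSort([]Tag{
		{Name: "e", Values: []string{"def", "root"}},
		{Name: "e", Values: []string{"def", "reply"}},
	})
	fmt.Println(err != nil) // duplicate (name, first value) key rejected
}
```

Go's `<` on strings compares bytes, which matches the byte-order requirement without any locale machinery.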
Tags are effectively keyed on (name, first value); duplicates are a bug or an attack.

```
uint16(num_tags)
for each tag (in sorted order):
    uint16(len(name)) || utf8(name)
    uint16(num_values)
    for each value:
        uint32(len(value)) || utf8(value)
```

The `tags` field in `canonical_payload` is `SHA256(canonical_tags)` — a fixed 32-byte commitment regardless of tag count. Implementations may cache this hash to avoid re-sorting on repeated signature verification.

**Full canonical_payload byte layout:**

```
[0:2]        uint16 = 32    pubkey length — always 32 for Ed25519; validate and
                            reject if not 32; reserved for future key types
[2:34]       bytes  pubkey
[34:42]      uint64 created_at
[42:44]      uint16 kind
[44:48]      uint32 content length — wire format supports up to ~4GB but the
                            relay enforces a maximum of 65536 bytes (64KB);
                            larger events are rejected with 413
[48:48+n]    bytes  content (n bytes, n ≤ 65536)
[48+n:80+n]  bytes  SHA256(canonical_tags), 32 bytes
```

Two implementations that agree on this layout will always produce the same `id` for the same event.

---

## Crypto

| Purpose | Algorithm | Go package |
|---|---|---|
| Signing | Ed25519 | `crypto/ed25519` (stdlib) |
| Key exchange | X25519 | `golang.org/x/crypto/curve25519` |
| Encryption | ChaCha20-Poly1305 | `golang.org/x/crypto/chacha20poly1305` |
| Hashing / event ID | SHA-256 | `crypto/sha256` (stdlib) |

All dependencies are from the Go standard library or `golang.org/x/crypto`. No third-party crypto. Ed25519 keys are converted to X25519 for ECDH — one keypair serves both signing and encryption. ChaCha20-Poly1305 provides authenticated encryption (AEAD); the ciphertext cannot be tampered with without detection.

---

## Wire Format

**Transport:** WebSocket (binary frames)
**Serialization:** MessagePack

MessagePack is binary JSON — identical data model, no schema, no codegen. Binary fields (`id`, `pubkey`, `sig`) are raw bytes on the wire, eliminating base64 encoding and simplifying the signing story.
### Connection Authentication

Authentication happens immediately on connect, before any other messages are accepted.

```
relay  → Challenge { nonce: bytes }   // 32 random bytes
client → Auth { pubkey: bytes, sig: bytes }
relay  → Ok { message: string }       // or Error then close
```

The client signs over `nonce || relay_url` to prevent replay to a different relay:

```
sig = ed25519.Sign(privkey, SHA256(nonce || utf8(relay_url)))
```

The relay verifies the signature, then checks the pubkey against its allowlist. Failures return `Error { code: 401 }` and close the connection.

**Allowlist:** the relay maintains a set of authorized pubkeys in config or the local database. Publish and subscribe are both gated on allowlist membership. Adding a user means adding their pubkey — no passwords, no tokens, no certificate infrastructure.

### Client → Relay

```
Auth        { pubkey: bytes, sig: bytes }
Subscribe   { sub_id: string, filter: Filter }
Unsubscribe { sub_id: string }
Publish     { event: Event }
```

### Relay → Client

```
Challenge     { nonce: bytes }
EventEnvelope { sub_id: string, event: Event }
Eose          { sub_id: string }
Ok            { message: string }
Error         { code: uint16, message: string }
```

Each message is a msgpack array: `[message_type, payload]`, where `message_type` is a uint16.

### Error Codes

HTTP status codes, reused for familiarity.

| Code | Meaning |
|---|---|
| 400 | Bad request (malformed message, invalid signature) |
| 401 | Not authenticated |
| 403 | Not authorized (pubkey not in allowlist) |
| 409 | Duplicate event |
| 413 | Message too large |

The relay sends `Error` and keeps the connection open for recoverable conditions (e.g. a bad publish). For unrecoverable conditions (e.g. auth failure) it sends `Error` then closes.

### Keepalive

The relay sends a WebSocket ping every **30 seconds**. Clients must respond with a pong. Connections that miss two consecutive pings (60 seconds) are closed. Clients may also send pings; the relay will pong.
---

## Filters

```
Filter {
    ids     []bytes   // match by event id
    authors []bytes   // match by pubkey
    kinds   []uint16  // match by event kind
    since   int64
    until   int64
    limit   int32
    tags    []TagFilter
}

TagFilter {
    name   string
    values []string   // match any
}
```

---

## Relay Internals

The relay unmarshals only what it needs for indexing and routing. `content` is never parsed — it is opaque bytes as far as the relay is concerned.

**On ingest:**

1. Unmarshal the event envelope to extract index fields (`id`, `pubkey`, `kind`, `created_at`, `tags`)
2. Verify signature: recompute `id`, check `ed25519.Verify(pubkey, id, sig)`
3. Reject if `id` already exists — `id PRIMARY KEY` makes duplicate events impossible to store, and the fanout path checks an in-memory seen set before forwarding
4. Write index fields to the index tables
5. Write the verbatim msgpack envelope bytes to `envelope_bytes` — the entire event exactly as received, not re-serialized
6. Fanout to matching subscribers

**On query/fanout:**

- Read `envelope_bytes` from store
- Forward directly to subscribers — no unmarshal, no remarshal

**Index schema (SQLite or Postgres; indexes are named because SQLite requires it):**

```sql
CREATE TABLE events (
    id             BLOB PRIMARY KEY,
    pubkey         BLOB NOT NULL,
    created_at     INTEGER NOT NULL,
    kind           INTEGER NOT NULL,
    envelope_bytes BLOB NOT NULL  -- verbatim msgpack bytes of the full event, including content
);

CREATE TABLE tags (
    event_id BLOB REFERENCES events(id),
    name     TEXT NOT NULL,
    value    TEXT NOT NULL
);

CREATE INDEX events_pubkey_idx     ON events(pubkey);
CREATE INDEX events_kind_idx       ON events(kind);
CREATE INDEX events_created_at_idx ON events(created_at);
CREATE INDEX tags_name_value_idx   ON tags(name, value);
```

---

## Event Kinds

Integer kinds with named constants. The integer is the wire format; the name is what appears in code and logs. Ranges enable efficient category queries without enumerating individual kinds.
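Tying kinds back to the Filters section: subscription matching can be sketched as a pure function over an event. This assumes Nostr-style semantics — every non-empty field must match, and values within a field match any — and that tag filters compare against a tag's first value; neither is spelled out above:

```go
package main

import (
	"bytes"
	"fmt"
)

type Tag struct {
	Name   string
	Values []string
}

type Event struct {
	ID        []byte
	Pubkey    []byte
	CreatedAt int64
	Kind      uint16
	Tags      []Tag
}

type TagFilter struct {
	Name   string
	Values []string // match any
}

type Filter struct {
	IDs     [][]byte
	Authors [][]byte
	Kinds   []uint16
	Since   int64
	Until   int64
	Tags    []TagFilter
}

// matches reports whether an event satisfies every populated filter field.
func matches(f Filter, e Event) bool {
	if len(f.IDs) > 0 && !containsBytes(f.IDs, e.ID) {
		return false
	}
	if len(f.Authors) > 0 && !containsBytes(f.Authors, e.Pubkey) {
		return false
	}
	if len(f.Kinds) > 0 && !containsUint16(f.Kinds, e.Kind) {
		return false
	}
	if f.Since != 0 && e.CreatedAt < f.Since {
		return false
	}
	if f.Until != 0 && e.CreatedAt > f.Until {
		return false
	}
	for _, tf := range f.Tags {
		if !tagMatch(tf, e.Tags) {
			return false
		}
	}
	return true
}

func containsBytes(set [][]byte, b []byte) bool {
	for _, s := range set {
		if bytes.Equal(s, b) {
			return true
		}
	}
	return false
}

func containsUint16(set []uint16, k uint16) bool {
	for _, s := range set {
		if s == k {
			return true
		}
	}
	return false
}

// tagMatch succeeds if any event tag with the filter's name has a
// first value in the filter's value set.
func tagMatch(tf TagFilter, tags []Tag) bool {
	for _, t := range tags {
		if t.Name != tf.Name || len(t.Values) == 0 {
			continue
		}
		for _, v := range tf.Values {
			if t.Values[0] == v {
				return true
			}
		}
	}
	return false
}

func main() {
	e := Event{Kind: 5000, CreatedAt: 1700000000, Tags: []Tag{{Name: "t", Values: []string{"summarize"}}}}
	f := Filter{Kinds: []uint16{5000}, Since: 1600000000, Tags: []TagFilter{{Name: "t", Values: []string{"summarize"}}}}
	fmt.Println(matches(f, e))                             // true
	fmt.Println(matches(Filter{Kinds: []uint16{6000}}, e)) // false
}
```

In the relay itself the index tables answer the same question for stored events; a function like this is what the live fanout path would evaluate per subscriber.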
### Range Allocation

| Range | Category |
|---|---|
| 0000 – 0999 | Identity & meta |
| 1000 – 1999 | Messaging |
| 2000 – 2999 | Encrypted messaging |
| 3000 – 3999 | Presence & ephemeral |
| 4000 – 4999 | Reserved |
| 5000 – 5999 | Job requests |
| 6000 – 6999 | Job results |
| 7000 – 7999 | Job feedback |
| 8000 – 8999 | System / relay |
| 9000 – 9999 | Reserved |

### Defined Kinds

| Constant | Kind | Description |
|---|---|---|
| `KindProfile` | 0 | Identity metadata |
| `KindMessage` | 1000 | Plain text note |
| `KindDM` | 2000 | Encrypted direct message |
| `KindProgress` | 3000 | Ephemeral progress/status indicator (thinking, agent steps, job status) |
| `KindJobRequest` | 5000 | Request for agent work |
| `KindJobResult` | 6000 | Completed job output |
| `KindJobFeedback` | 7000 | In-progress status / error |

### Range Queries

```sql
-- all job-related events
WHERE kind >= 5000 AND kind < 8000

-- ephemeral events (relay does not persist)
WHERE kind >= 3000 AND kind < 4000
```

Ephemeral events (kind 3000–3999) are fanned out to subscribers but never written to the store.

---

## Threading

Conversations use explicit `e` tags with mandatory role markers:

```
Tag{ name: "e", values: ["<event-id>", "root"] }
Tag{ name: "e", values: ["<event-id>", "reply"] }
```

The root marker is required on all replies. No fallback heuristics.

---

## Direct Messages

`KindDM` (2000) events carry ChaCha20-Poly1305 encrypted content. The recipient is identified by a `p` tag carrying their pubkey:

```
Tag{ name: "p", values: ["<recipient-pubkey>"] }
```

The relay indexes the `p` tag to route DMs to the recipient's subscription. Content is opaque; the relay cannot decrypt it.

---

## Job Protocol

Any client can publish a `KindJobRequest`; any agent subscribed to the relay can fulfill it.
The flow:

```
KindJobRequest  (5000) → { kind: 5000, content: <job input>,  tags: [["t", "<job-type>"]] }
KindJobFeedback (7000) → { kind: 7000, content: <status>,     tags: [["e", "<request-id>"]] }
KindJobResult   (6000) → { kind: 6000, content: <job output>, tags: [["e", "<request-id>"]] }
```

Multiple agents can compete to fulfill the same request. The requester can target a specific agent with a `p` tag.

**Expiry:** job requests may include an `expires_at` tag carrying a unix timestamp. Agents must check this before starting work and skip expired requests. The relay does not enforce expiry — it is agent-side policy.

```
Tag{ name: "expires_at", values: ["<unix-timestamp>"] }
```

---

## Consumers

The relay is the log. Anything requiring derived data subscribes and maintains its own view:

- **Search indexer** — subscribes to all events, feeds full-text index
- **Daily report** — subscribes to past 24h, generates summary via agent
- **Metrics collector** — counts event types, feeds dashboard
- **Conversation summarizer** — subscribes to completed threads

Each consumer is independent and can rebuild from relay replay on restart.

**Resumption:** consumers track their own position by storing the `created_at` of the last processed event and resuming with a `since` filter on restart. Use the event `id` to deduplicate any overlap at the boundary.

---

## Threat Model

**DM metadata:** `KindDM` content is encrypted and opaque to the relay, but the sender pubkey and recipient `p` tag are stored in plaintext. The relay operator can see who is talking to whom, and when. Content is private; the social graph is not.

---

## What This Is Not

- Not a database. Don't query it like one.
- Not a general message queue. It has no consumer groups or offset tracking — consumers manage their own position.
- Not decentralized. Single relay, single operator. Multi-relay federation is out of scope.