author: bndw <ben@bdw.to>, 2026-02-28 19:14:01 -0800
committer: bndw <ben@bdw.to>, 2026-02-28 19:14:01 -0800
commit: 9a636af9090b122db2e55737fca3e78550aab9df
tree: f76f3b118b525907e92fb29df096567a6eeabd06 /research.md
parent: e2a0bc68726c1b8dca179ee1f6826b88d8dd09f5
fix: scope artifacts to sessions
Diffstat (limited to 'research.md')
-rw-r--r-- research.md 414
1 files changed, 0 insertions, 414 deletions
diff --git a/research.md b/research.md
deleted file mode 100644
index b7d0897..0000000
--- a/research.md
+++ /dev/null
@@ -1,414 +0,0 @@
# Research: Claude Flow Architecture

## Existing Codebase Analysis

### Template Structure

The starting point is a minimal Electron + Vite + React + better-sqlite3 template:

```
minimal-electron-vite-react-better-sqlite/
├── src/main/
│   ├── index.ts          # Electron main process
│   └── preload.ts        # IPC bridge (empty)
├── renderer/
│   ├── index.html        # Entry HTML
│   └── src/
│       └── main.tsx      # React entry ("hi")
├── package.json
├── tsconfig.json
└── vite.config.ts
```

### Key Patterns in Template

**Main Process (`src/main/index.ts`):**
- Uses `app.isPackaged` for dev/prod detection
- Database stored in `app.getPath('userData')`, the correct location for Electron apps
- Uses WAL mode for SQLite (`journal_mode = WAL`)
- Window created with `contextIsolation: true`, `nodeIntegration: false` (secure defaults)
- Preload script path: `path.join(__dirname, 'preload.js')`

**Vite Config:**
- Root is `renderer/` directory
- Base is `./` for file:// loads in production
- Dev server on port 5173 with `strictPort: true`
- Output to `renderer/dist`
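
The main-process and Vite settings above imply two different load targets for the window. A minimal sketch of that decision as a pure function follows. The helper name `resolveRendererTarget` and the assumption that the built `index.html` sits under `renderer/dist` relative to the app root are illustrative and do not come from the template itself.

```typescript
import * as path from "node:path";

// Hypothetical helper, not part of the template: decides what the
// BrowserWindow should load based on the value of app.isPackaged.
type RendererTarget =
  | { kind: "url"; url: string }    // dev: Vite dev server
  | { kind: "file"; file: string }; // prod: built index.html

const DEV_SERVER_PORT = 5173; // matches the port in the Vite config

function resolveRendererTarget(isPackaged: boolean, appRoot: string): RendererTarget {
  if (!isPackaged) {
    return { kind: "url", url: `http://localhost:${DEV_SERVER_PORT}` };
  }
  // Assumed layout: Vite writes its output to renderer/dist under appRoot.
  return { kind: "file", file: path.join(appRoot, "renderer", "dist", "index.html") };
}
```

In the main process, the result would be handed to `win.loadURL(...)` or `win.loadFile(...)` depending on `kind`.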

**Build/Dev Scripts:**
- `npm run dev`: runs Vite, the TypeScript watcher, and Electron concurrently
- Uses `wait-on tcp:5173` to wait for Vite before launching Electron
- `@electron/rebuild` handles native module rebuilding

### What's Missing (We Need to Build)

1. IPC handlers for renderer → main communication
2. Proper database layer with migrations
3. React components and state management
4. Claude integration
5. Workflow state machine
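
Item 5 could start as a small transition table. The sketch below uses the four phases named later in this document (research, plan, annotate, implement). The transition rules are an assumption made for illustration: a linear order with one loop from annotate back to plan. Neither the template nor the SDK defines them.

```typescript
type Phase = "research" | "plan" | "annotate" | "implement";

// Assumed transitions: forward one step at a time, plus annotate -> plan
// so a plan can be revised after human annotations.
const TRANSITIONS: Record<Phase, readonly Phase[]> = {
  research: ["plan"],
  plan: ["annotate"],
  annotate: ["plan", "implement"],
  implement: [],
};

function canTransition(from: Phase, to: Phase): boolean {
  return TRANSITIONS[from].includes(to);
}

// Returns the new phase, or throws if the move is not in the table.
function transition(from: Phase, to: Phase): Phase {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid phase transition: ${from} -> ${to}`);
  }
  return to;
}
```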

---

## Claude Agent SDK Research

### Package Information

The SDK has been renamed from "Claude Code SDK" to "Claude Agent SDK".

**NPM Package:** `@anthropic-ai/claude-agent-sdk`

**Installation:**
```bash
npm install @anthropic-ai/claude-agent-sdk
```

**Authentication:**
- Set `ANTHROPIC_API_KEY` environment variable
- Also supports Bedrock, Vertex AI, Azure via environment flags

### Core API: `query()`

The primary function returns an async generator that streams messages:

```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Find and fix the bug in auth.py",
  options: {
    allowedTools: ["Read", "Edit", "Bash"],
    permissionMode: "acceptEdits",
    cwd: "/path/to/project"
  }
})) {
  console.log(message);
}
```

### Built-in Tools

| Tool | Purpose |
|------|---------|
| **Read** | Read files (text, images, PDFs, notebooks) |
| **Write** | Create/overwrite files |
| **Edit** | Precise string replacements |
| **Bash** | Run terminal commands |
| **Glob** | Find files by pattern |
| **Grep** | Search file contents with regex |
| **WebSearch** | Search the web |
| **WebFetch** | Fetch/parse web pages |
| **Task** | Spawn subagents |

### Permission Modes

| Mode | Behavior |
|------|----------|
| `default` | Standard permissions, may prompt |
| `acceptEdits` | Auto-approve file edits |
| `bypassPermissions` | Allow everything (dangerous) |
| `plan` | Planning only, no execution |

**Critical for our workflow:**
- **Research/Plan/Annotate phases**: Use `plan` mode or restrict `allowedTools` to read-only
- **Implement phase**: Use `acceptEdits` or `bypassPermissions` // REVIEW: make a user toggle
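
One way to express the phase rule above together with the suggested user toggle is sketched here. `permissionModeFor` and the `allowBypass` flag are hypothetical names; only the four mode strings come from the table.

```typescript
type Phase = "research" | "plan" | "annotate" | "implement";
type PermissionMode = "default" | "acceptEdits" | "bypassPermissions" | "plan";

// Hypothetical helper: non-implement phases always get "plan"; the implement
// phase defaults to "acceptEdits" and only uses "bypassPermissions" when the
// user has explicitly opted in through the toggle.
function permissionModeFor(phase: Phase, allowBypass = false): PermissionMode {
  if (phase !== "implement") return "plan";
  return allowBypass ? "bypassPermissions" : "acceptEdits";
}
```

Defaulting the toggle to off keeps the more dangerous mode an explicit choice.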
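
### System Prompts

The system prompt can be customized via `options`, either as a plain string or as the Claude Code preset with appended instructions:

```typescript
import type { Options } from "@anthropic-ai/claude-agent-sdk";

// Option 1: replace the system prompt with a custom string
const customPromptOptions: Options = {
  systemPrompt: "You are in RESEARCH mode. Read files deeply, write findings to research.md. DO NOT modify any source files."
};

// Option 2: use the preset and append to it
const presetPromptOptions: Options = {
  systemPrompt: {
    type: "preset",
    preset: "claude_code",
    append: "Additional instructions here..."
  }
};
```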

### Session Management

Sessions can be resumed using session IDs:

```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

// First query captures the session ID from the init message
let sessionId: string | undefined;
for await (const message of query({ prompt: "Read auth module" })) {
  if (message.type === "system" && message.subtype === "init") {
    sessionId = message.session_id;
  }
}

// Resume later with full context
for await (const message of query({
  prompt: "Now find all places that call it",
  options: { resume: sessionId }
})) {
  // Claude remembers the previous conversation
}
```

**Key insight:** Sessions persist to disk by default. We can store the session ID in SQLite and resume later.
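
A sketch of how the stored ID might feed back into `query()` is shown below. The `SessionRow` shape and the helper names are assumptions for illustration, and an in-memory `Map` stands in for the SQLite table so the snippet runs on its own.

```typescript
// Hypothetical row shape for a sessions table; column names are assumptions.
interface SessionRow {
  id: string;                  // our own session identifier
  sdkSessionId: string | null; // session_id from the SDK init message, once known
}

// In-memory stand-in for the SQLite table.
const sessions = new Map<string, SessionRow>();

function createSession(id: string): SessionRow {
  const row: SessionRow = { id, sdkSessionId: null };
  sessions.set(id, row);
  return row;
}

// Call this when the init message arrives.
function recordSdkSessionId(id: string, sdkSessionId: string): void {
  const row = sessions.get(id);
  if (!row) throw new Error(`Unknown session: ${id}`);
  row.sdkSessionId = sdkSessionId;
}

// Build the options fragment for query(): resume only if an ID was captured.
function resumeOptions(id: string): { resume?: string } {
  const row = sessions.get(id);
  return row?.sdkSessionId ? { resume: row.sdkSessionId } : {};
}
```

The returned fragment can be spread into the `options` object, so a first run and a resumed run share one code path.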

### Hooks

Hooks intercept agent behavior at key points:

```typescript
import { query, type HookCallback, type PreToolUseHookInput } from "@anthropic-ai/claude-agent-sdk";

// Deny Write/Edit tool calls before they execute
const blockWrites: HookCallback = async (input, toolUseID, { signal }) => {
  const preInput = input as PreToolUseHookInput;
  if (["Write", "Edit"].includes(preInput.tool_name)) {
    return {
      hookSpecificOutput: {
        hookEventName: "PreToolUse",
        permissionDecision: "deny",
        permissionDecisionReason: "In planning mode - no code changes allowed"
      }
    };
  }
  return {};
};

for await (const message of query({
  prompt: "...",
  options: {
    hooks: {
      PreToolUse: [{ matcher: "Write|Edit", hooks: [blockWrites] }]
    }
  }
})) {
  // Handle streamed messages here
}
```

**Available hook events:**
- `PreToolUse`: Before tool executes (can block/modify)
- `PostToolUse`: After tool executes (can log/transform)
- `Stop`: Agent execution ending
- `SessionStart` / `SessionEnd`: Session lifecycle
- `SubagentStart` / `SubagentStop`: Subagent lifecycle

### Message Types

The query yields different message types:

```typescript
type SDKMessage =
  | SDKAssistantMessage         // Claude's response (includes tool_use)
  | SDKUserMessage              // User input
  | SDKResultMessage            // Final result with usage stats
  | SDKSystemMessage            // Init message with session_id, tools, etc.
  | SDKPartialAssistantMessage  // Streaming chunks (if enabled)
  | SDKStatusMessage            // Status updates
  | ...
```

**SDKResultMessage** contains:
- `result`: Final text output
- `total_cost_usd`: API cost
- `usage`: Token counts
- `duration_ms`: Total time
- `num_turns`: Conversation turns
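
These fields are enough to keep a running total for display in the UI. The sketch below declares a local `ResultLike` type containing only some of the fields listed above (it is not the SDK's own type), and `summarize` is a hypothetical helper.

```typescript
// Local subset of the result fields listed above; not the SDK's full type.
interface ResultLike {
  total_cost_usd: number;
  duration_ms: number;
  num_turns: number;
}

interface RunTotals {
  costUsd: number;
  durationMs: number;
  turns: number;
}

// Hypothetical helper: fold several result messages into one total.
function summarize(results: readonly ResultLike[]): RunTotals {
  return results.reduce<RunTotals>(
    (acc, r) => ({
      costUsd: acc.costUsd + r.total_cost_usd,
      durationMs: acc.durationMs + r.duration_ms,
      turns: acc.turns + r.num_turns,
    }),
    { costUsd: 0, durationMs: 0, turns: 0 },
  );
}
```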

### Query Object Methods

The query object has methods for control:

```typescript
const q = query({ prompt: "...", options: { ... } });

// Change settings mid-session
await q.setPermissionMode("acceptEdits");
await q.setModel("opus");

// Get session info
const init = await q.initializationResult();
const commands = await q.supportedCommands();
const models = await q.supportedModels();

// Interrupt/cancel
await q.interrupt();
q.close();
```

---

## Architecture Decisions

### Phase Enforcement Strategy

We have two complementary approaches:

**1. Permission Mode + Allowed Tools:**
```typescript
const phaseConfig = {
  research: {
    permissionMode: "plan",
    allowedTools: ["Read", "Glob", "Grep", "WebSearch", "WebFetch"],
    systemPrompt: "You are in RESEARCH mode..."
  },
  plan: {
    permissionMode: "plan",
    allowedTools: ["Read", "Glob", "Grep", "Write"], // Write only for plan.md
    systemPrompt: "You are in PLANNING mode..."
  },
  annotate: {
    permissionMode: "plan",
    allowedTools: ["Read", "Write"], // Only update plan.md
    systemPrompt: "You are in ANNOTATION mode..."
  },
  implement: {
    permissionMode: "acceptEdits",
    allowedTools: ["Read", "Write", "Edit", "Bash", "Glob", "Grep"],
    systemPrompt: "You are in IMPLEMENTATION mode..."
  }
};
```

**2. Hooks for Fine-Grained Control:**
```typescript
import * as path from "node:path";
import type { HookCallback, PreToolUseHookInput } from "@anthropic-ai/claude-agent-sdk";

const enforcePhaseHook: HookCallback = async (input, toolUseID, { signal }) => {
  const { tool_name, tool_input } = input as PreToolUseHookInput;
  const phase = getCurrentPhase(); // From app state

  if (phase !== "implement" && ["Write", "Edit"].includes(tool_name)) {
    // file_path may be missing; an empty path is treated as a non-artifact
    const filePath = (tool_input as { file_path?: string }).file_path ?? "";
    // Allow only plan.md/research.md in non-implement phases.
    // Compare the basename so that e.g. "myplan.md" does not slip through.
    if (!["plan.md", "research.md"].includes(path.basename(filePath))) {
      return {
        hookSpecificOutput: {
          hookEventName: "PreToolUse",
          permissionDecision: "deny",
          permissionDecisionReason: `Cannot modify ${filePath} in ${phase} phase`
        }
      };
    }
  }
  return {};
};
```
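
Because the hook reads app state and SDK input, its rule is easier to unit-test if the decision is pulled into a pure function. A minimal sketch follows, where `isWriteAllowed` is a hypothetical name and the basename comparison is one possible way to match the two artifact files.

```typescript
import * as path from "node:path";

type Phase = "research" | "plan" | "annotate" | "implement";

const ARTIFACT_FILES = new Set(["plan.md", "research.md"]);
const WRITE_TOOLS = new Set(["Write", "Edit"]);

// Hypothetical helper: returns true when the tool call may proceed.
function isWriteAllowed(phase: Phase, toolName: string, filePath: string | undefined): boolean {
  if (!WRITE_TOOLS.has(toolName)) return true; // rule only covers write tools
  if (phase === "implement") return true;      // implement may write anywhere
  if (!filePath) return false;                 // no path: cannot prove it is an artifact
  return ARTIFACT_FILES.has(path.basename(filePath));
}
```

The hook body then reduces to calling this function and mapping `false` to a `deny` decision.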

### Session Persistence

**Option A: Use SDK's built-in session persistence**
- Sessions saved to disk automatically
- Store session ID in SQLite
- Resume with `options: { resume: sessionId }`
- Pro: Simpler, full context preserved
- Con: Less control over what's stored

**Option B: Store messages in SQLite ourselves**
- Store each message in our database
- Reconstruct context when resuming
- Pro: Full control, searchable history
- Con: More complex, may lose SDK internal state

**Recommendation:** Use **Option A** (SDK persistence) with session ID in SQLite. We can still store messages for display/search, but rely on SDK for actual context.
// REVIEW: option a

### Artifact Management

The blog workflow uses `research.md` and `plan.md` as persistent artifacts.

**Options:**
1. **Store in SQLite**: Searchable, version history, but not editable externally
2. **Store as files in project**: Visible in git, editable in any editor
3. **Both**: Files as source of truth, sync to SQLite for search

**Recommendation:** Store as **files in the project directory** (e.g., `.claude-flow/research.md`, `.claude-flow/plan.md`). This matches the blog workflow where the human edits the plan.md directly.
// REVIEW: recommendation is great
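
A sketch of initializing that directory, using only Node's standard library, is shown below. `ensureArtifacts` is a hypothetical helper; the `.claude-flow/` location and the two file names come from the recommendation above.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

const ARTIFACT_DIR = ".claude-flow";
const ARTIFACT_NAMES = ["research.md", "plan.md"] as const;

// Hypothetical helper: creates .claude-flow/ and empty artifact files if
// they do not exist yet, leaving existing (possibly human-edited) files alone.
function ensureArtifacts(projectDir: string): string[] {
  const dir = path.join(projectDir, ARTIFACT_DIR);
  fs.mkdirSync(dir, { recursive: true });
  return ARTIFACT_NAMES.map((name) => {
    const file = path.join(dir, name);
    if (!fs.existsSync(file)) fs.writeFileSync(file, "");
    return file;
  });
}

// Usage against a throwaway directory:
const demoProject = fs.mkdtempSync(path.join(os.tmpdir(), "claude-flow-demo-"));
const artifactPaths = ensureArtifacts(demoProject);
console.log(artifactPaths.map((p) => path.relative(demoProject, p)));
```

Skipping files that already exist matters here, since the human is expected to edit `plan.md` by hand between phases.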

### IPC Architecture

Electron requires IPC for renderer ↔ main communication:

```typescript
// preload.ts
import { contextBridge, ipcRenderer, type IpcRendererEvent } from "electron";

contextBridge.exposeInMainWorld("api", {
  // Projects
  listProjects: () => ipcRenderer.invoke("projects:list"),
  createProject: (data: unknown) => ipcRenderer.invoke("projects:create", data),

  // Sessions
  listSessions: (projectId: string) => ipcRenderer.invoke("sessions:list", projectId),
  createSession: (data: unknown) => ipcRenderer.invoke("sessions:create", data),

  // Claude
  sendMessage: (sessionId: string, message: string) =>
    ipcRenderer.invoke("claude:send", sessionId, message),
  // Forward only the payload (not the IPC event) and return an unsubscribe function
  onMessage: (callback: (message: unknown) => void) => {
    const listener = (_event: IpcRendererEvent, message: unknown) => callback(message);
    ipcRenderer.on("claude:message", listener);
    return () => { ipcRenderer.removeListener("claude:message", listener); };
  },
  setPhase: (sessionId: string, phase: string) =>
    ipcRenderer.invoke("claude:setPhase", sessionId, phase),
});
```
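
The main process needs a handler for each channel name used above. The sketch below models that registration with a tiny in-memory registry so it can run without Electron. In the real app the `handle` calls would go through Electron's `ipcMain.handle`; the registry, the `Project` shape, and the ID scheme are all assumptions for illustration.

```typescript
// In-memory stand-in for ipcMain.handle / ipcRenderer.invoke, illustration only.
// Electron passes an event object as the first handler argument; this stand-in omits it.
type Handler = (...args: unknown[]) => unknown;
const handlers = new Map<string, Handler>();

function handle(channel: string, handler: Handler): void {
  if (handlers.has(channel)) throw new Error(`Duplicate handler: ${channel}`);
  handlers.set(channel, handler);
}

async function invoke(channel: string, ...args: unknown[]): Promise<unknown> {
  const handler = handlers.get(channel);
  if (!handler) throw new Error(`No handler registered for ${channel}`);
  return handler(...args);
}

// Assumed project shape; the real schema is not defined in this document.
interface Project {
  id: string;
  name: string;
}
const projects: Project[] = [];

handle("projects:list", () => projects);
handle("projects:create", (data) => {
  const { name } = data as { name: string };
  const project: Project = { id: String(projects.length + 1), name };
  projects.push(project);
  return project;
});
```

Keeping channel names in the `domain:action` form used by the preload script makes a missing handler easy to spot, because `invoke` fails with the exact channel name.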

---

## Open Questions Resolved

### 1. Claude Agent SDK vs CLI

**Answer:** Use the SDK (`@anthropic-ai/claude-agent-sdk`). It provides:
- Programmatic control via TypeScript
- Hooks for intercepting behavior
- Session management
- Streaming messages

### 2. Artifact Storage

**Answer:** Store as files in the project directory (`.claude-flow/`) because this:
- Keeps them editable in any editor (VS Code, etc.)
- Makes them visible in git
- Matches the blog workflow

### 3. Session Context / Compaction

**Answer:** Use the SDK's built-in session persistence:
- Store session ID in SQLite
- Resume with `options: { resume: sessionId }`
- SDK handles context compaction automatically

### 4. Multi-file Editing

**Answer:** The SDK handles this natively via the Edit/Write tools. The plan.md should list all files to be modified, and Claude applies the edits in order during implementation.

---

## Dependencies to Add

```json
{
  "dependencies": {
    "@anthropic-ai/claude-agent-sdk": "latest",
    "better-sqlite3": "12.2.0",
    "react": "^19.1.1",
    "react-dom": "^19.1.1",
    "uuid": "^11.0.0"
  },
  "devDependencies": {
    "@types/uuid": "^10.0.0"
    // ... existing devDeps
  }
}
```

---

## Summary

The Claude Agent SDK is well-suited for building Claude Flow:

1. **Session management**: Built-in persistence, resume capability
2. **Permission modes**: `plan` mode prevents execution, `acceptEdits` for implementation
3. **Hooks**: Fine-grained control over what Claude can do
4. **Streaming**: Real-time message display in UI
5. **System prompts**: Customizable per phase

The main work is:
- Building the Electron app shell (IPC, windows)
- SQLite layer for projects/sessions
- React UI for chat, artifacts, phase controls
- Wiring everything together with the SDK