Agent Orchestrator (AO) is a platform for spawning and managing parallel AI coding agents across distributed systems. It runs multiple agents (Claude Code, Codex, Aider, OpenCode) simultaneously — each in an isolated git worktree with its own PR — and provides a single dashboard to supervise them all. Agents autonomously fix CI failures, address review comments, and manage PRs.
Org: ComposioHQ
Repo: github.com/ComposioHQ/agent-orchestrator
License: MIT
pnpm workspace (v9.15.4) with ~30 packages:
packages/
core/ # Engine: types, config, session manager, lifecycle, plugin registry
cli/ # CLI tool (`ao` command) — depends on all plugins
web/ # Next.js 15 dashboard (App Router, React 19, Tailwind v4)
ao/ # Global CLI wrapper (thin shim around cli)
plugins/
agent-claude-code/ agent-aider/ agent-codex/ agent-opencode/
runtime-tmux/ runtime-process/
workspace-worktree/ workspace-clone/
tracker-github/ tracker-linear/ tracker-gitlab/
scm-github/ scm-gitlab/
notifier-desktop/ notifier-slack/ notifier-webhook/
notifier-composio/ notifier-openclaw/
terminal-iterm2/ terminal-web/
integration-tests/ # E2E tests
Build order: core -> plugins -> cli/web (parallel). pnpm build at root handles this.
| Layer | Stack |
|---|---|
| Language | TypeScript (strict mode, ES2022, Node16 modules) |
| Runtime | Node.js 20+ |
| Package Manager | pnpm 9.15.4 (workspace:* protocol) |
| Web | Next.js 15 (App Router) + React 19 |
| Styling | Tailwind CSS v4 + CSS custom properties (@theme block in globals.css) |
| Terminal UI | xterm.js 5.3.0 + WebSocket to tmux PTYs |
| Validation | Zod |
| Testing | Vitest + @testing-library/react |
| Linting | ESLint 10 (flat config) + Prettier 3.8 |
| CI/CD | GitHub Actions (lint, typecheck, test, release) |
| Versioning | Changesets |
| Git hooks | Husky + gitleaks (secret scanning) |
| Container | OCI via Containerfile (Podman/Docker) |
# Install & build
pnpm install
pnpm build
# Development
pnpm dev # Web dashboard (Next.js + 2 WS servers)
# Type checking
pnpm typecheck # All packages
pnpm --filter @aoagents/ao-web typecheck # Web only
# Testing
pnpm test # All packages (excludes web)
pnpm --filter @aoagents/ao-web test # Web tests
pnpm --filter @aoagents/ao-web test:watch # Web watch mode
pnpm test:integration # Integration tests
# Lint & format
pnpm lint
pnpm lint:fix
pnpm format
pnpm format:checkEvery abstraction is a pluggable interface defined in packages/core/src/types.ts:
| Slot | Default | Purpose |
|---|---|---|
| Runtime | tmux | Where agents execute |
| Agent | claude-code | Which AI tool to use |
| Workspace | worktree | Code isolation (worktree vs clone) |
| Tracker | github | Issue tracking (GitHub, Linear, GitLab) |
| SCM | github | PR, CI, reviews |
| Notifier | desktop | Notification delivery |
| Terminal | iterm2 | Human attachment UI |
| Lifecycle | core (non-pluggable) | State machine + polling |
spawning -> working -> pr_open -> ci_failed / review_pending
| |
changes_requested approved
| |
+-> mergeable -> merged -> cleanup -> done
agent-orchestrator.yaml -> Config Loader (Zod) -> Plugin Registry
-> Session Manager -> Lifecycle Manager (polling loop, state machine)
-> Events -> Notifiers
-> Web API Routes (Next.js) -> SSE (5s interval) + WebSocket (terminal)
-> Dashboard (React + xterm.js)
No database. Flat files + memory:
- Config:
agent-orchestrator.yaml(Zod-validated) - Session metadata:
~/.agent-orchestrator/{hash}-{projectId}/sessions/{sessionId}(key-value pairs) - Worktrees:
~/.agent-orchestrator/{hash}-{projectId}/worktrees/{sessionId}/ - Archives:
~/.agent-orchestrator/{hash}-{projectId}/archive/{sessionId}_{timestamp}
Hash = SHA-256 of config directory (first 12 chars). Prevents collision across multiple checkouts.
- Base prompt (system instructions in core)
- Config prompt (project-specific rules from YAML)
- Rules files (optional
.agent-rules.mdfrom repo)
These behavioral guidelines apply to every agent working on this codebase. They are not optional - they prevent the most common causes of PR rejection and rewrite.
Don't assume. Don't hide confusion. Surface tradeoffs.
- State assumptions explicitly. If uncertain, ask.
- If multiple interpretations of a task exist, present them - don't pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what's confusing. Ask.
- When editing
lifecycle-manager.tsorsession-manager.ts: state which invariants your change preserves. These files have subtle state dependencies.
Minimum code that solves the problem. Nothing speculative.
- No features beyond what was asked.
- No abstractions for single-use code.
- No "flexibility" or "configurability" that wasn't requested.
- No error handling for impossible scenarios.
- Plugin slots are the extension point. Don't add configuration surface when a new plugin is the right answer.
- If you write 200 lines and it could be 50, rewrite it.
Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
Touch only what you must. Clean up only your own mess.
- Don't "improve" adjacent code, comments, or formatting.
- Don't refactor things that aren't broken.
- Match existing style, even if you'd do it differently.
- If your changes create orphans (unused imports, dead variables), remove them.
- Don't remove pre-existing dead code unless asked.
- Every changed line should trace directly to the task description.
This is especially critical in:
types.ts- changing an interface breaks every plugin. Minimize surface changes.globals.css- tokens are consumed across 50+ components. Don't rename casually.lifecycle-manager.ts- state transitions have implicit dependencies. Document why a transition is safe.
Define success criteria. Loop until verified.
Transform tasks into verifiable goals:
- "Add a new status" -> "Add to enum, update
isTerminalSession, add to dashboard column mapping, write tests for all three" - "Fix the bug" -> "Write a test that reproduces it, then make it pass"
- "Refactor X" -> "Ensure tests pass before and after"
For multi-step tasks, state a brief plan:
[Step] -> verify: [check] [Step] -> verify: [check] [Step] -> verify: [check]
Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
- TypeScript strict mode — no
anytypes (@typescript-eslint/no-explicit-any: error) - Consistent type imports —
import type { Foo }enforced by ESLint - Immutable patterns — spread operator, never mutate in place
- Prefer const —
no-var,prefer-const - No eval —
no-eval,no-implied-eval,no-new-func - Unused vars — prefix with
_(argsIgnorePattern: "^_")
- Components in flat
components/directory (no nesting) - Hooks in
hooks/withuseprefix - Tests in
__tests__/subdirectories - No barrel files except
core/src/index.ts - Max 400 lines per component file
- PascalCase for components/classes
- camelCase for functions/variables
use*for hooks,is*/has*for booleans
@/alias ->packages/web/src/@aoagents/ao-corefor core importsworkspace:*for cross-package
- Tailwind utility classes only — no inline
style=attributes - CSS custom properties via
var(--color-*)fromglobals.css@themeblock - Dark theme must always be preserved
- No external UI component libraries (no Radix, shadcn, etc.)
- Client components marked
"use client"; server components for pages - State: React hooks only (no Redux/Zustand)
- Real-time updates: SSE via
useSessionEventshook (5s interval, do not change)
- Vitest + @testing-library/react
- Test files:
{Module}.test.tsor{Component}.test.tsxin__tests__/ - Test files for all new components
- Relaxed lint in tests:
anyandconsole.logallowed
- Conventional commits:
feat:,fix:,refactor:,docs:,test:,chore:,perf:,ci: - Changesets for version management
- gitleaks pre-commit hook — never commit secrets
| File | Purpose |
|---|---|
packages/core/src/types.ts |
Central type definitions (all 8 plugin interfaces) |
packages/core/src/session-manager.ts |
Session CRUD operations |
packages/core/src/lifecycle-manager.ts |
State machine + polling loop + reactions |
packages/core/src/config.ts |
YAML config loading with Zod validation |
packages/core/src/plugin-registry.ts |
Plugin discovery and resolution |
packages/core/src/index.ts |
Core public API (stable, do not break) |
packages/web/src/components/Dashboard.tsx |
Main dashboard view |
packages/web/src/components/SessionDetail.tsx |
Session detail view |
packages/web/src/components/DirectTerminal.tsx |
xterm.js terminal with WebSocket |
packages/web/src/components/SessionCard.tsx |
Kanban session card |
packages/web/src/hooks/useSessionEvents.ts |
SSE consumer hook |
packages/web/src/lib/types.ts |
Dashboard types |
packages/web/src/app/globals.css |
Design tokens and base styles (full token definitions) |
DESIGN.md |
Design system reference — design principles, token mapping, component patterns, anti-patterns (read this before writing any web UI) |
agent-orchestrator.yaml |
Project-level config (user-created) |
eslint.config.js |
ESLint flat config |
tsconfig.base.json |
Shared TypeScript base config |
packages/plugins/{slot}-{name}/
├── package.json # @aoagents/ao-plugin-{slot}-{name}
├── tsconfig.json # extends ../../../tsconfig.base.json
├── src/
│ ├── index.ts # manifest + create + detect (default export)
│ └── __tests__/ # vitest tests
- Package:
@aoagents/ao-plugin-{slot}-{name}(lowercase, hyphenated) manifest.namemust match the{name}suffix (e.g. package...-runtime-tmux-> name:"tmux")manifest.slotmust useas constto preserve the literal type
Every plugin default-exports a PluginModule<T>:
import type { PluginModule, Runtime } from "@aoagents/ao-core";
export const manifest = {
name: "tmux",
slot: "runtime" as const,
description: "tmux session runtime",
version: "0.1.0",
};
export function create(config?: Record<string, unknown>): Runtime {
// Validate config here, not in individual methods
// Use closure to capture validated config
return { ... };
}
// Optional: check if binary/dependency is available on system
export function detect(): boolean { ... }
export default { manifest, create, detect } satisfies PluginModule<Runtime>;- Plugin-level config comes via
create(config)from the YAML notifier/tracker blocks - Project-level config (e.g.
agentConfig,trackerConfig) is passed to individual methods - Validate in
create(), store via closure — don't re-validate per call - Warn (don't throw) for missing optional config during plugin load
- Throw with descriptive message when a required config is missing at method call time
- Wrap errors with
causefor debugging:throw new Error("msg", { cause: err }) - Return
nullfor "not found" (e.g. tracker issue lookup), throw for unexpected errors - Never silently swallow errors
- Use
shellEscape()from core for all command arguments (prevent injection)
- All I/O methods return
Promise<T>(async-first) - Plugins are loosely coupled — communicate through Session object and Lifecycle Manager, never call other plugins directly
- Implement
destroy()/ cleanup with best-effort semantics
import {
shellEscape, // Safe command argument escaping
validateUrl, // Webhook URL validation
readLastJsonlEntry, // Efficient JSONL log tail (native agent JSONL)
readLastActivityEntry, // Read last AO activity JSONL entry
checkActivityLogState, // Extract waiting_input/blocked from AO JSONL (with staleness cap)
getActivityFallbackState, // Last-resort fallback: entry state + age-based decay
recordTerminalActivity, // Shared recordActivity impl (classify + dedup + append)
classifyTerminalActivity, // Classify terminal output via detectActivity
appendActivityEntry, // Low-level JSONL append
setupPathWrapperWorkspace, // Install ~/.ao/bin wrappers + .ao/AGENTS.md
buildAgentPath, // Prepend ~/.ao/bin to PATH
normalizeAgentPermissionMode, // Normalize permission mode strings
DEFAULT_READY_THRESHOLD_MS, // 5 min — ready→idle threshold
DEFAULT_ACTIVE_WINDOW_MS, // 30s — active→ready window
ACTIVITY_INPUT_STALENESS_MS, // 5 min — waiting_input/blocked expiry
PREFERRED_GH_PATH, // /usr/local/bin/gh
CI_STATUS, ACTIVITY_STATE, SESSION_STATUS, // Constants
type Session, type ProjectConfig, type RuntimeHandle,
} from "@aoagents/ao-core";- Vitest in
src/__tests__/index.test.ts - Mock external CLIs, file I/O, HTTP calls
- Test manifest values,
create()return shape, all public methods, and error paths - Use
beforeEachto reset mocks
- Hardcoded secrets -> use
process.env, throw if missing - Shell injection -> use
shellEscape()for all arguments - Large file reads -> use streaming or
readLastJsonlEntry() - Config validation in methods -> validate once in
create(), closure the rest
All agent plugins (claude-code, codex, aider, opencode, etc.) must implement the full Agent interface. The dashboard depends on these methods for PR tracking, cost display, and session resume.
Required methods (all agents):
| Method | Purpose | Return null OK? |
|---|---|---|
getLaunchCommand |
Shell command to start the agent | No |
getEnvironment |
Env vars for agent process (must include ~/.ao/bin in PATH) |
No |
detectActivity |
Terminal output classification (deprecated, but required) | No |
getActivityState |
JSONL/API-based activity detection (min 3 states: active/ready/idle) | Yes (if no data) |
isProcessRunning |
Check process alive via tmux TTY or PID | No |
getSessionInfo |
Extract summary, cost, session ID from agent's data | Yes (if agent has no introspection) |
Optional methods (implement when the agent supports it):
| Method | Purpose | When to skip |
|---|---|---|
getRestoreCommand |
Resume a previous session | Agent has no resume capability (return null) |
setupWorkspaceHooks |
Install metadata-update hooks (PATH wrappers or agent-native) | Never — required for dashboard PR tracking |
postLaunchSetup |
Post-launch config (re-ensure hooks, resolve binary) | Only if no post-launch work needed |
recordActivity |
Write terminal-derived activity to JSONL for getActivityState |
Agent has native JSONL with full state coverage (Claude Code). Codex implements it as a safety net for when its native JSONL is missing/unparseable. |
Metadata hooks are critical. Without setupWorkspaceHooks, PRs created by agents won't appear in the dashboard. Two patterns exist:
- Agent-native hooks (Claude Code): PostToolUse hooks in
.claude/settings.json - PATH wrappers (Codex, Aider, OpenCode):
~/.ao/bin/ghand~/.ao/bin/gitintercept commands. CallsetupPathWrapperWorkspace(workspacePath)— it installs wrappers to~/.ao/bin/and writes session context to.ao/AGENTS.md(gitignored, does not modify tracked files).
Environment requirements:
- All agents must set
AO_SESSION_IDand optionallyAO_ISSUE_ID - All agents using PATH wrappers must prepend
~/.ao/binto PATH - Use
normalizeAgentPermissionModefrom@aoagents/ao-core(not a local duplicate)
Activity detection architecture:
getActivityState is the most critical method in the agent plugin. The dashboard, lifecycle manager, and stuck-detection all depend on it returning correct states. Every agent plugin must produce all 6 states over its lifetime:
spawning → active ↔ ready → idle → exited
↘ waiting_input / blocked ↗
| State | Meaning | When |
|---|---|---|
active |
Agent is working right now | Activity within last 30s |
ready |
Agent finished recently, may resume | 30s–5min since last activity |
idle |
Agent has been quiet for a while | >5min since last activity |
waiting_input |
Agent is blocked on user approval | Permission prompt visible |
blocked |
Agent hit an error it can't recover from | Error state detected |
exited |
Process is dead | isProcessRunning returns false |
The getActivityState contract — implement exactly this cascade:
async getActivityState(session, readyThresholdMs?): Promise<ActivityDetection | null> {
// 1. PROCESS CHECK — always first
if (!running) return { state: "exited", timestamp };
// 2. ACTIONABLE STATES — check for waiting_input/blocked
// Source: native JSONL (Claude Code, Codex) OR AO activity JSONL (others)
// These are the only states checkActivityLogState() surfaces.
// If found, return immediately.
// 3. NATIVE SIGNAL — agent-specific API for timestamp (preferred)
// Source: agent's session list API, native JSONL timestamps, etc.
// Classify by age: active (<30s) / ready (30s–threshold) / idle (>threshold)
// 4. JSONL ENTRY FALLBACK — always implement this
// Source: getActivityFallbackState(activityResult, activeWindowMs, threshold)
// Uses the entry's detected state + entry.ts for age-based decay.
// Decay only demotes (active→ready→idle), never promotes.
// This is the SAFETY NET when the native signal is unavailable.
// Without this, getActivityState returns null and the dashboard shows
// no activity for the entire session lifetime.
// 5. Return null only if there is genuinely no data at all.
}Step 4 is mandatory. If you skip the JSONL entry fallback, getActivityState will return null whenever the native API fails (binary not in PATH, API changed, session not found, timeout). The dashboard will show no activity state and stuck-detection breaks. This was a real bug in the OpenCode plugin — findOpenCodeSession returned null due to a session creation issue, and without the fallback, the entire active/ready/idle flow was dead. Use getActivityFallbackState() from core — it handles age-based decay and staleness caps correctly.
Two activity detection patterns exist:
| Pattern | Used by | How it works |
|---|---|---|
| Native JSONL | Claude Code, Codex | Agent writes its own JSONL with rich state (permission_request, tool_call, error, etc.). getActivityState reads the last entry and maps it to activity states. |
| AO Activity JSONL | Aider, OpenCode, new agents | Agent implements recordActivity. Lifecycle manager calls it each poll cycle with terminal output. It calls classifyTerminalActivity() → appendActivityEntry() to write to {workspacePath}/.ao/activity.jsonl. getActivityState reads from this file. |
For agents using AO Activity JSONL (the common case for new plugins):
- Implement
recordActivity— delegate to the sharedrecordTerminalActivity():
async recordActivity(session: Session, terminalOutput: string): Promise<void> {
if (!session.workspacePath) return;
await recordTerminalActivity(session.workspacePath, terminalOutput, (output) =>
this.detectActivity(output),
);
}recordTerminalActivity handles classification, deduplication (20s window for non-actionable states), and appending. You don't need to implement dedup yourself.
- Implement
detectActivitywith patterns specific to the agent's terminal output:
detectActivity(terminalOutput: string): ActivityState {
// Match the ACTUAL prompts/patterns the agent emits.
// Test with real terminal output — don't guess patterns.
// Return: "idle" | "active" | "waiting_input" | "blocked"
}- In
getActivityState, usecheckActivityLogState()for waiting_input/blocked, then fall back togetActivityFallbackState():
// checkActivityLogState returns non-null ONLY for waiting_input/blocked.
// active/idle/ready intentionally return null — use the fallback for those.
const activityResult = await readLastActivityEntry(session.workspacePath);
const activityState = checkActivityLogState(activityResult);
if (activityState) return activityState;
// ... try native signal first (session list API, git commits, etc.) ...
// JSONL entry fallback (REQUIRED — do not skip)
const activeWindowMs = Math.min(DEFAULT_ACTIVE_WINDOW_MS, threshold);
const fallback = getActivityFallbackState(activityResult, activeWindowMs, threshold);
if (fallback) return fallback;getActivityFallbackState uses the entry's detected state with age-based decay (active→ready→idle) and respects the entry state as a ceiling (never promotes idle to active). Stale waiting_input/blocked entries (>5min) decay to idle.
Required tests for getActivityState — all agent plugins must have these:
- Returns
exitedwhen process is not running - Returns
waiting_inputfrom JSONL when agent is at a permission prompt - Returns
blockedfrom JSONL when agent hit an error - Returns
activefrom native signal when agent was recently active - Returns
activefrom JSONL entry fallback when native signal fails (fresh entry) - Returns
idlefrom JSONL entry fallback when native signal fails (old entry with age decay) - Returns
nullwhen both native signal and JSONL are unavailable
isProcessRunning must:
- Support tmux runtime (TTY-based
pslookup with process name regex) - Support process runtime (PID signal-0 check with EPERM handling)
- Match BOTH the node wrapper name AND the actual binary name (some agents install as
.agentnamewith a dot prefix — the regex must handle this) - Return
false(notnull) on error
- C-01: No new UI component libraries
- C-02: No inline styles in new/modified code
- C-04: Component files max 400 lines
- C-05: Dark theme preserved (no redesign)
- C-06: Next.js App Router only
- C-07: No animation libraries
- C-12: Test files for all new components
- C-13: pnpm
workspace:*protocol for cross-package deps - C-14: SSE 5s interval unchanged