# ADR-007: LLM Integration Architecture

## Status

Implemented (Phases 1-4 Complete as of v4.15.0 - 2026-02-03)

### Implementation Status
| Phase | Version | Status | Description |
|---|---|---|---|
| Phase 1 | v4.11.0 | ✅ Complete | Agent Skills, MCP ReadOnly tools, Chat UI |
| Phase 1B | v4.11.0 | ✅ Complete | ReadOnly MCP tools for state visibility |
| Phase 2 | v4.12.0 | ✅ Complete | Plan/Apply workflow, TOCTOU protection |
| Phase 3 | v4.13.0-v4.15.0 | ✅ Complete | All 5 providers, real SSE streaming, batch ops, conversation persistence (P3-05), cost tracking UI (P3-06) |
| Phase 4 | v4.14.0 | ✅ Complete | HardwareIO tier, audit logging, rate limiting, undo/redo |
| Phase 5 | Future | Pending | OAuth2 remote access, A2A protocol, conversation optimization |
**Original Approval**: LLM Council Reviewed - 2026-01-31

## Context
Conductor is a mature multi-protocol input mapping system with a Tauri GUI, daemon infrastructure, and comprehensive configuration capabilities. Users currently configure devices, modes, mappings, profiles, and plugins through the GUI or by editing TOML files directly.
### Problem Statement

- **Complex Configuration**: Setting up sophisticated mappings requires understanding trigger types, action parameters, velocity curves, and conditional logic
- **No Natural Language Interface**: Users cannot describe their desired behavior in plain language
- **Limited Automation**: No way for AI assistants to help configure the system
- **Fragmented Tooling**: External LLMs cannot interact with a running Conductor daemon
### Requirements

- **R1**: Integrate a chat interface into the GUI enabling natural language discussion of requirements and mappings
- **R2**: Enable LLMs to manage MIDI capture and configure devices, modes, mappings, profiles, plugins, and settings
- **R3**: Support multiple LLM providers: OpenAI, Anthropic, Google, and routers such as OpenRouter and LiteLLM
- **R4**: Enable external LLMs to control Conductor via MCP (Model Context Protocol), A2A (Agent-to-Agent), and skills.md
- **R5**: Maintain security boundaries: LLMs must not have unrestricted shell access
## Decision

### Skills vs MCP: Complementary Architectures
Conductor implements both Agent Skills and MCP because they serve fundamentally different purposes:
| Aspect | Agent Skills (SKILL.md) | MCP Tools |
|---|---|---|
| Purpose | Knowledge & expertise ("how to") | Capabilities & actions ("what to do") |
| Analogy | "Brain and playbook" | "Arms and legs" |
| Token Cost | ~100 tokens (metadata) to ~5K (full skill) | 10K-50K+ tokens (tool schemas) |
| Security | Prompt-based, sandboxed by agent | Requires auth, API boundaries |
| Cross-platform | Claude Code, Copilot, Cursor, Codex | MCP-compatible clients |
| Example | "How to create a velocity-sensitive mapping" | `conductor_create_mapping(...)` |
Why both? "Skills without MCP are well-written instructions. MCP without Skills is raw power with no guidance." Skills teach the agent Conductor's domain concepts (triggers, actions, patterns), while MCP provides the actual execution capabilities.
### Architecture Overview

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                             CONDUCTOR ECOSYSTEM                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                        CONDUCTOR GUI (Tauri)                        │    │
│  │  ┌─────────────────┐   ┌─────────────────────────────────────────┐  │    │
│  │  │   Chat Panel    │   │          Configuration Views            │  │    │
│  │  │  ┌───────────┐  │   │  Devices │ Modes │ Mappings │ Profiles  │  │    │
│  │  │  │ Messages  │  │   └─────────────────────────────────────────┘  │    │
│  │  │  │  [User]   │  │                     ↑                          │    │
│  │  │  │  [LLM]    │  │                     │ Config Updates           │    │
│  │  │  │ [System]  │  │   ┌─────────────────┴───────────────────────┐  │    │
│  │  │  └───────────┘  │   │        LLM Agent Controller             │  │    │
│  │  │  ┌───────────┐  │   │  - Tool execution                       │  │    │
│  │  │  │   Input   │◄─┼──►│  - Config generation                    │  │    │
│  │  │  │  [Send]   │  │   │  - MIDI Learn coordination              │  │    │
│  │  │  └───────────┘  │   │  - Validation & preview                 │  │    │
│  │  └─────────────────┘   └─────────────────────────────────────────┘  │    │
│  │          │                              │                           │    │
│  │          │ Tauri Events                 │ Tauri Commands            │    │
│  │          ▼                              ▼                           │    │
│  │  ┌──────────────────────────────────────────────────────────────┐   │    │
│  │  │                  LLM Provider Abstraction                    │   │    │
│  │  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────┐  │   │    │
│  │  │  │  OpenAI  │  │Anthropic │  │  Google  │  │ OpenRouter/  │  │   │    │
│  │  │  │  GPT-4   │  │  Claude  │  │  Gemini  │  │ LiteLLM      │  │   │    │
│  │  │  └──────────┘  └──────────┘  └──────────┘  └──────────────┘  │   │    │
│  │  └──────────────────────────────────────────────────────────────┘   │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                   │                                         │
│                                   │ IPC (Unix Socket)                       │
│                                   ▼                                         │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                          CONDUCTOR DAEMON                           │    │
│  │  ┌──────────────────────────────────────────────────────────────┐   │    │
│  │  │                       Engine Manager                         │   │    │
│  │  │   - Config management        - Event processing              │   │    │
│  │  │   - MIDI Learn mode          - Action execution              │   │    │
│  │  │   - Device management        - Mode switching                │   │    │
│  │  └──────────────────────────────────────────────────────────────┘   │    │
│  │                               │                                     │    │
│  │  ┌────────────────────────────┴─────────────────────────────────┐   │    │
│  │  │                      MCP Server (New)                        │   │    │
│  │  │   - Tool exposure for external LLMs                          │   │    │
│  │  │   - A2A protocol support                                     │   │    │
│  │  │   - skills.md generation                                     │   │    │
│  │  └──────────────────────────────────────────────────────────────┘   │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                   ▲                                         │
│                                   │ MCP/A2A/HTTP                            │
│                                   │                                         │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                        EXTERNAL LLM CLIENTS                         │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────────┐   │    │
│  │  │  Claude Code │  │  Cursor/Cody │  │    Custom MCP Clients    │   │    │
│  │  │   (via MCP)  │  │   (via MCP)  │  │      (A2A Protocol)      │   │    │
│  │  └──────────────┘  └──────────────┘  └──────────────────────────┘   │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────────────┘
```
### Component Design

#### 1. LLM Provider Abstraction Layer

**Location**: `conductor-gui/src-tauri/src/llm/`
```rust
/// Unified trait for LLM providers
#[async_trait]
pub trait LLMProvider: Send + Sync {
    /// Send a message and get a response (with tool use support)
    async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, LLMError>;

    /// Stream a response for real-time display
    async fn chat_stream(&self, request: ChatRequest) -> Result<ChatStream, LLMError>;

    /// Check provider health/availability
    async fn health_check(&self) -> Result<ProviderStatus, LLMError>;

    /// Get provider capabilities (tool use, vision, etc.)
    fn capabilities(&self) -> ProviderCapabilities;
}

/// Provider configuration
#[derive(Clone, Serialize, Deserialize)]
pub struct ProviderConfig {
    pub provider_type: ProviderType,
    pub api_key: SecretString,       // Encrypted in config
    pub base_url: Option<String>,    // For OpenRouter/LiteLLM
    pub model: String,
    pub max_tokens: u32,
    pub temperature: f32,
}

#[derive(Clone, Serialize, Deserialize)]
pub enum ProviderType {
    OpenAI,
    Anthropic,
    Google,
    OpenRouter,
    LiteLLM,
    Custom { base_url: String },
}
```
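Endpoint resolution follows from `ProviderType`: routers and `Custom` providers carry their own `base_url`, while first-party providers can fall back to well-known hosts. A minimal sketch under stated assumptions (`resolve_base_url` is a hypothetical helper; the default URLs are illustrative, not normative):

```rust
/// Hypothetical default-endpoint resolution for ProviderConfig.base_url.
/// A local copy of the ProviderType enum keeps this sketch self-contained.
#[derive(Debug)]
enum ProviderType {
    OpenAI,
    Anthropic,
    Google,
    OpenRouter,
    LiteLLM,
    Custom { base_url: String },
}

fn resolve_base_url(provider: &ProviderType, override_url: Option<&str>) -> String {
    // An explicit base_url in the config always wins; this is what lets one
    // codepath serve OpenRouter, LiteLLM, and arbitrary self-hosted proxies.
    if let Some(url) = override_url {
        return url.to_string();
    }
    match provider {
        ProviderType::OpenAI => "https://api.openai.com/v1".into(),
        ProviderType::Anthropic => "https://api.anthropic.com".into(),
        ProviderType::Google => "https://generativelanguage.googleapis.com".into(),
        ProviderType::OpenRouter => "https://openrouter.ai/api/v1".into(),
        // LiteLLM's proxy commonly listens on port 4000 when self-hosted.
        ProviderType::LiteLLM => "http://localhost:4000".into(),
        ProviderType::Custom { base_url } => base_url.clone(),
    }
}
```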
**Supported Providers**:

| Provider | Models | Tool Use | Streaming | Notes |
|---|---|---|---|---|
| OpenAI | GPT-4, GPT-4o | Yes | Yes | Native function calling |
| Anthropic | Claude 3.5, Claude 4 | Yes | Yes | Native tool use |
| Google | Gemini Pro, Ultra | Yes | Yes | Function declarations |
| OpenRouter | 100+ models | Varies | Yes | Unified API, model routing |
| LiteLLM | Local/Proxy | Varies | Yes | Self-hosted, privacy-focused |
**Capability Negotiation** (LLM Council Recommendation):
Runtime detection of provider capabilities enables graceful degradation when features aren't available:
```rust
/// Provider capabilities detected at runtime
#[derive(Clone, Default, Serialize, Deserialize)]
pub struct ProviderCapabilities {
    /// Supports tool/function calling
    pub tool_use: bool,
    /// Supports streaming responses
    pub streaming: bool,
    /// Supports vision/image input
    pub vision: bool,
    /// Maximum context window (tokens)
    pub max_context: u32,
    /// Maximum output tokens
    pub max_output: u32,
    /// Supports structured JSON output
    pub json_mode: bool,
    /// Cost per 1K input tokens (for tracking)
    pub input_cost_per_1k: Option<f64>,
    /// Cost per 1K output tokens
    pub output_cost_per_1k: Option<f64>,
}
```
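The per-1K cost fields feed the cost-tracking UI (P3-06). A dependency-free sketch of the accumulation (`estimate_cost` is a hypothetical helper; a missing rate contributes zero, so tracking stays best-effort):

```rust
/// Estimate the dollar cost of one request from token counts and the
/// Option<f64> per-1K rates on ProviderCapabilities.
fn estimate_cost(
    input_tokens: u32,
    output_tokens: u32,
    input_cost_per_1k: Option<f64>,
    output_cost_per_1k: Option<f64>,
) -> f64 {
    // Unknown rates (e.g. self-hosted LiteLLM) count as zero cost.
    let input = input_cost_per_1k.unwrap_or(0.0) * input_tokens as f64 / 1000.0;
    let output = output_cost_per_1k.unwrap_or(0.0) * output_tokens as f64 / 1000.0;
    input + output
}
```

For example, 2,000 prompt tokens at $0.003/1K plus 500 completion tokens at $0.015/1K totals $0.0135.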
```rust
/// Capability probing, shared by all providers (an inherent impl on the
/// trait object, so individual providers don't repeat it).
impl dyn LLMProvider {
    /// Probe provider to detect actual capabilities.
    /// (test_tool_call / test_streaming / get_model_info are provider
    /// helpers elided from the trait definition above.)
    pub async fn detect_capabilities(&self) -> ProviderCapabilities {
        // Try a test message with tool use
        let tool_use = self.test_tool_call().await.is_ok();
        // Check streaming support
        let streaming = self.test_streaming().await.is_ok();
        // Get model info from provider API if available
        let model_info = self.get_model_info().await.ok();
        ProviderCapabilities {
            tool_use,
            streaming,
            vision: model_info.as_ref().map(|i| i.vision).unwrap_or(false),
            max_context: model_info.as_ref().map(|i| i.context_window).unwrap_or(4096),
            max_output: model_info.as_ref().map(|i| i.max_output).unwrap_or(4096),
            json_mode: model_info.as_ref().map(|i| i.json_mode).unwrap_or(false),
            input_cost_per_1k: model_info.as_ref().and_then(|i| i.pricing.input),
            output_cost_per_1k: model_info.and_then(|i| i.pricing.output),
        }
    }
}
```
```rust
/// Graceful degradation when capabilities are missing
pub struct LLMAgentController {
    provider: Box<dyn LLMProvider>,
    capabilities: ProviderCapabilities,
}

impl LLMAgentController {
    pub async fn chat(&self, request: ChatRequest) -> Result<ChatResponse, LLMError> {
        if !self.capabilities.tool_use {
            // Fall back to instruction-based prompting
            return self.chat_without_tools(request).await;
        }
        if !self.capabilities.streaming && request.stream {
            // Fall back to non-streaming with progress simulation
            return self.chat_with_simulated_streaming(request).await;
        }
        self.provider.chat(request).await
    }
}
```
#### 2. Tool System for LLM Actions

**Location**: `conductor-gui/src-tauri/src/llm/tools.rs`
```rust
/// Tools exposed to the LLM for Conductor control
pub enum ConductorTool {
    // Device Management
    ListDevices,
    ConnectDevice { port: usize },
    DisconnectDevice,

    // MIDI Learning
    StartMidiLearn,
    StopMidiLearn,
    GetCapturedEvents,

    // Configuration
    GetConfig,
    GetModes,
    CreateMode { name: String, color: Option<String> },
    DeleteMode { name: String },

    // Mapping Management
    GetMappings { mode: String },
    CreateMapping { mode: String, trigger: Trigger, action: Action },
    UpdateMapping { mode: String, index: usize, trigger: Trigger, action: Action },
    DeleteMapping { mode: String, index: usize },

    // Profile Management
    ListProfiles,
    CreateProfile { name: String, bundle_id: String },
    UpdateProfile { id: String, mappings: Vec<Mapping> },
    DeleteProfile { id: String },

    // Settings
    GetSettings,
    UpdateSettings { settings: AdvancedSettings },

    // Validation & Preview
    ValidateConfig { config: Config },
    PreviewAction { action: Action },

    // Plugin Management
    ListPlugins,
    EnablePlugin { name: String },
    DisablePlugin { name: String },
}

/// Tool execution result
pub struct ToolResult {
    pub success: bool,
    pub data: Option<serde_json::Value>,
    pub error: Option<String>,
    pub suggestions: Vec<String>, // Follow-up suggestions for LLM
}
```
**Tool Risk Tiers** (LLM Council Recommendation):
Classify tools by impact level for graduated safety controls:
```rust
/// Tool risk classification for graduated safety controls
#[derive(Clone, Copy, Debug, Serialize, Deserialize)]
pub enum ToolRiskTier {
    /// Read-only operations - safe to auto-execute
    /// Examples: GetConfig, ListDevices, GetMappings
    ReadOnly,

    /// Stateful but reversible operations (session-scoped)
    /// Examples: StartMidiLearn, StopMidiLearn, ConnectDevice
    /// Note: Creates temporary server-side state that does NOT persist
    Stateful,

    /// Configuration changes - require Plan/Apply pattern
    /// Examples: CreateMapping, UpdateMapping, DeleteMapping
    ConfigChange,

    /// Hardware I/O operations - CRITICAL RISK (LLM Council Addition)
    /// Examples: SendRawSysEx, FirmwareUpdate, DeviceReset
    /// WARNING: SysEx messages can brick hardware! Requires multi-step confirmation.
    HardwareIO,

    /// Privileged operations - require explicit user confirmation
    /// Examples: DeleteProfile, ResetConfig, ExportSecrets
    Privileged,
}

impl ConductorTool {
    pub fn risk_tier(&self) -> ToolRiskTier {
        match self {
            // ReadOnly - auto-execute
            Self::GetConfig | Self::GetModes | Self::GetMappings { .. }
            | Self::ListDevices | Self::ListProfiles | Self::ListPlugins
            | Self::GetSettings | Self::GetCapturedEvents => ToolRiskTier::ReadOnly,

            // Stateful - auto-execute with logging
            Self::StartMidiLearn | Self::StopMidiLearn
            | Self::ConnectDevice { .. } | Self::DisconnectDevice => ToolRiskTier::Stateful,

            // ConfigChange - require Plan/Apply
            Self::CreateMode { .. } | Self::DeleteMode { .. }
            | Self::CreateMapping { .. } | Self::UpdateMapping { .. }
            | Self::DeleteMapping { .. } | Self::CreateProfile { .. }
            | Self::UpdateProfile { .. } | Self::UpdateSettings { .. } => ToolRiskTier::ConfigChange,

            // Privileged - explicit confirmation
            Self::DeleteProfile { .. } | Self::ValidateConfig { .. }
            | Self::PreviewAction { .. } | Self::EnablePlugin { .. }
            | Self::DisablePlugin { .. } => ToolRiskTier::Privileged,
        }
    }
}
```
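The tier-to-handling mapping the executor applies can be sketched as a pure function (the `Dispatch` enum is an illustrative stand-in for the executor's match arms; per the tiers above, `HardwareIO` demands multi-step confirmation while both `ConfigChange` and `Privileged` route through Plan/Apply):

```rust
/// Local copies of the tier enum keep the sketch self-contained.
#[derive(Clone, Copy, PartialEq, Debug)]
enum ToolRiskTier { ReadOnly, Stateful, ConfigChange, HardwareIO, Privileged }

/// How the executor handles a call at each tier (hypothetical names).
#[derive(PartialEq, Debug)]
enum Dispatch {
    AutoExecute,        // return data immediately
    AutoExecuteLogged,  // execute, then emit a state-change event
    PlanApply,          // return a Plan; user must approve before apply
    MultiStepConfirm,   // Plan/Apply plus additional explicit confirmation
}

fn dispatch_for(tier: ToolRiskTier) -> Dispatch {
    match tier {
        ToolRiskTier::ReadOnly => Dispatch::AutoExecute,
        ToolRiskTier::Stateful => Dispatch::AutoExecuteLogged,
        ToolRiskTier::ConfigChange | ToolRiskTier::Privileged => Dispatch::PlanApply,
        ToolRiskTier::HardwareIO => Dispatch::MultiStepConfirm,
    }
}
```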
**Plan/Apply Pattern** (LLM Council Critical Recommendation):
For ConfigChange and Privileged operations, use a Plan/Apply pattern where the LLM proposes changes, the UI displays a diff, and the user explicitly confirms before application:
```
User: "Create a mapping so pad 36 triggers Cmd+C"
                    │
                    ▼
┌─────────────────────────────────────────┐
│         LLM Agent Controller            │
│  1. Parse user intent                   │
│  2. Generate proposed config change     │
│  3. Return Plan (not applied yet)       │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│            Plan Preview UI              │
│  ┌─────────────────────────────────┐    │
│  │ + [[modes.mappings]]            │    │
│  │ + trigger = { type = "Note",    │    │
│  │ +   note = 36 }                 │    │
│  │ + action = { type = "Keystroke",│    │
│  │ +   keys = ["cmd","c"]          │    │
│  │ + }                             │    │
│  └─────────────────────────────────┘    │
│       [Cancel]  [Apply Changes]         │
└─────────────────────────────────────────┘
                    │
           User clicks [Apply]
                    │
                    ▼
┌─────────────────────────────────────────┐
│             Tool Executor               │
│  1. Apply validated changes             │
│  2. Emit state update events            │
│  3. Return confirmation                 │
└─────────────────────────────────────────┘
```
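The diff shown in the Plan Preview UI is plain prefixed-line rendering over serialized TOML. A minimal sketch of what `ConfigPlan::to_diff` might produce (the `Change` type and `to_diff` free function are illustrative stand-ins, not the real signatures):

```rust
/// Illustrative change representation: each serialized TOML line a plan
/// adds or removes, rendered with "+"/"-" prefixes for the preview UI.
enum Change {
    Add(String),
    Remove(String),
}

fn to_diff(changes: &[Change]) -> String {
    changes
        .iter()
        .map(|c| match c {
            Change::Add(line) => format!("+ {line}"),
            Change::Remove(line) => format!("- {line}"),
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```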
**Plan/Apply Data Structures**:
```rust
/// A proposed configuration change
#[derive(Clone, Serialize, Deserialize)]
pub struct ConfigPlan {
    pub id: Uuid,
    pub description: String,
    pub changes: Vec<ConfigChange>,
    pub created_at: DateTime<Utc>,
    pub expires_at: DateTime<Utc>,   // Auto-expire after 5 minutes
    pub base_state_hash: String,     // CRITICAL: Hash of config at plan time (TOCTOU protection)
    pub touched_files: Vec<PathBuf>, // All files that will be modified
}

#[derive(Clone, Serialize, Deserialize)]
pub enum ConfigChange {
    CreateMapping { mode: String, mapping: Mapping },
    UpdateMapping { mode: String, index: usize, old: Mapping, new: Mapping },
    DeleteMapping { mode: String, index: usize, mapping: Mapping },
    CreateMode { mode: Mode },
    DeleteMode { name: String, mode: Mode },
    UpdateSettings { old: Settings, new: Settings },
    // ... other changes
}

impl ConfigPlan {
    /// Generate human-readable diff for UI display
    pub fn to_diff(&self) -> String { ... }

    /// Validate plan is still applicable (TOCTOU protection)
    pub fn validate(&self, current_config: &Config) -> Result<(), PlanError> {
        let current_hash = current_config.compute_hash();
        if current_hash != self.base_state_hash {
            return Err(PlanError::StaleState {
                message: "Configuration changed since plan was created. Please re-run plan.".into(),
                plan_hash: self.base_state_hash.clone(),
                current_hash,
            });
        }
        // Additional validation...
        Ok(())
    }
}
```
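The TOCTOU guard reduces to comparing digests of serialized state. A dependency-free sketch (std's `DefaultHasher` stands in for whatever stable digest `Config::compute_hash` actually uses; a real implementation would likely prefer a cryptographic hash):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for Config::compute_hash(): digest the serialized config.
fn state_hash(serialized_config: &str) -> String {
    let mut h = DefaultHasher::new();
    serialized_config.hash(&mut h);
    format!("{:016x}", h.finish())
}

/// TOCTOU guard: a plan may only apply against the exact state it was built on.
fn validate_plan(base_state_hash: &str, current_config: &str) -> Result<(), String> {
    let current = state_hash(current_config);
    if current != base_state_hash {
        return Err(format!(
            "Configuration changed since plan was created (plan {base_state_hash}, now {current}). Please re-run plan."
        ));
    }
    Ok(())
}
```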
**Plan Expiration Policy** (LLM Council Recommendation):
```rust
const DEFAULT_PLAN_TTL: Duration = Duration::from_secs(300); // 5 minutes

impl ConfigPlan {
    pub fn is_expired(&self) -> bool {
        Utc::now() > self.expires_at
    }
}
```
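A self-contained sketch of the expiry check and the `CONDUCTOR_PLAN_TTL_SECONDS` override, using `std::time` in place of the `chrono` types above (the `Plan` struct and `ttl_from_env` helper are illustrative):

```rust
use std::time::{Duration, SystemTime};

const DEFAULT_PLAN_TTL: Duration = Duration::from_secs(300); // 5 minutes

struct Plan {
    expires_at: SystemTime,
}

impl Plan {
    fn new(created_at: SystemTime, ttl: Duration) -> Self {
        Plan { expires_at: created_at + ttl }
    }

    /// Applying after this point returns a clear "stale plan" error.
    fn is_expired(&self, now: SystemTime) -> bool {
        now > self.expires_at
    }
}

/// CONDUCTOR_PLAN_TTL_SECONDS overrides the default; malformed values fall back.
fn ttl_from_env() -> Duration {
    std::env::var("CONDUCTOR_PLAN_TTL_SECONDS")
        .ok()
        .and_then(|s| s.parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(DEFAULT_PLAN_TTL)
}
```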
- Plans expire after 5 minutes by default (configurable via `CONDUCTOR_PLAN_TTL_SECONDS`)
- Applying an expired plan returns a clear error message
- Rationale: long-lived plans create false confidence in stale state
**Tool Execution Flow**:
```
User Message: "Create a mapping so pad 36 triggers Cmd+C"
                    │
                    ▼
┌─────────────────────────────────────────┐
│         LLM Agent Controller            │
│  1. Parse user intent                   │
│  2. Select tools: StartMidiLearn or     │
│     CreateMapping                       │
│  3. Generate tool call parameters       │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│   Transport-Agnostic Tool Executor      │
│  1. Validate parameters                 │
│  2. Check tool risk tier                │
│  3. For ReadOnly/Stateful: Execute      │
│  4. For ConfigChange: Return Plan       │
│  5. Emit Tauri Events for state sync    │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│              LLM Response               │
│  "I've prepared a mapping for pad 36    │
│   to trigger Cmd+C. Please review the   │
│   changes above and click Apply."       │
└─────────────────────────────────────────┘
```
#### 3. Transport-Agnostic Tool Executor
**Location**: `conductor-gui/src-tauri/src/llm/executor.rs`
A centralized tool executor that works identically whether invoked from the GUI chat or via MCP:
```rust
/// Centralized tool executor - same logic for GUI and MCP
pub struct ToolExecutor {
config_manager: Arc<RwLock<ConfigManager>>,
device_manager: Arc<RwLock<DeviceManager>>,
midi_learn: Arc<RwLock<MidiLearnState>>,
event_emitter: TauriEventEmitter,
pending_plans: Arc<RwLock<HashMap<Uuid, ConfigPlan>>>,
}
impl ToolExecutor {
/// Execute a tool call with risk-tier-appropriate handling
pub async fn execute(&self, tool: ConductorTool) -> ToolResult {
let risk_tier = tool.risk_tier();
// Log all tool executions for audit
self.log_tool_execution(&tool, risk_tier).await;
match risk_tier {
ToolRiskTier::ReadOnly => {
// Execute immediately, return data
self.execute_readonly(tool).await
}
ToolRiskTier::Stateful => {
// Execute, emit state change event
let result = self.execute_stateful(tool).await;
self.emit_state_change(&result).await;
result
}
ToolRiskTier::ConfigChange => {
// Generate plan, require user approval
let plan = self.generate_plan(tool).await?;
self.pending_plans.write().await.insert(plan.id, plan.clone());
self.emit_plan_ready(&plan).await;
ToolResult::plan_pending(plan)
}
ToolRiskTier::Privileged => {
// Generate plan with explicit warning
let plan = self.generate_plan_with_warning(tool).await?;
self.pending_plans.write().await.insert(plan.id, plan.clone());
self.emit_plan_ready(&plan).await;
ToolResult::plan_pending_privileged(plan)
}
}
}
/// Apply a previously generated plan (called after user approval)
pub async fn apply_plan(&self, plan_id: Uuid) -> Result<ToolResult, PlanError> {
let plan = self.pending_plans.write().await.remove(&plan_id)
.ok_or(PlanError::NotFound)?;
// Validate plan is still applicable
let config = self.config_manager.read().await;
plan.validate(&config)?;
drop(config);
// Apply all changes atomically
self.apply_changes(&plan.changes).await?;
// Emit config updated event
self.emit_config_updated().await;
Ok(ToolResult::success_with_message(
format!("Applied {} changes", plan.changes.len())
))
}
/// Reject a pending plan
pub async fn reject_plan(&self, plan_id: Uuid) -> Result<(), PlanError> {
self.pending_plans.write().await.remove(&plan_id);
Ok(())
}
}
```
**Wrapper for MCP**:
```rust
/// MCP handler wraps the same ToolExecutor
pub struct McpToolHandler {
    executor: Arc<ToolExecutor>,
}

impl McpToolHandler {
    pub async fn handle_tool_call(&self, tool_name: &str, params: Value) -> McpResult {
        let tool = ConductorTool::from_mcp(tool_name, params)?;
        let result = self.executor.execute(tool).await;
        result.to_mcp_response()
    }
}
```
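`ConductorTool::from_mcp` is essentially name-based routing over the tool manifest. A stripped-down sketch (parameter decoding via serde is elided; `Routed` and `route` are illustrative stand-ins for the real conversion):

```rust
/// Subset of routing outcomes, enough to show the shape of from_mcp.
#[derive(Debug, PartialEq)]
enum Routed {
    GetStatus,
    ListDevices,
    ConnectDevice { port: usize },
    Unknown,
}

/// Map an MCP tool name (plus its one decoded parameter, here simplified
/// to an Option<usize>) onto an internal tool variant.
fn route(tool_name: &str, port_param: Option<usize>) -> Routed {
    match (tool_name, port_param) {
        ("conductor_get_status", _) => Routed::GetStatus,
        ("conductor_list_devices", _) => Routed::ListDevices,
        ("conductor_connect_device", Some(port)) => Routed::ConnectDevice { port },
        _ => Routed::Unknown, // surfaces as an MCP "unknown tool" error
    }
}
```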
#### 4. State Synchronization via Tauri Events

**Location**: `conductor-gui/src-tauri/src/llm/events.rs`
Use Tauri's event system for real-time state synchronization between the LLM agent, GUI, and any external clients:
```rust
/// Events emitted by the LLM system
/// (Serialize is required by the serde_json::to_value call below)
#[derive(Clone, Serialize)]
pub enum LlmEvent {
    /// Chat message received/sent
    ChatMessage { message: ChatMessage },
    /// Tool execution started
    ToolExecutionStarted { tool: String, params: Value },
    /// Tool execution completed
    ToolExecutionCompleted { tool: String, result: ToolResult },
    /// Config plan ready for review
    PlanReady { plan: ConfigPlan },
    /// Config plan applied
    PlanApplied { plan_id: Uuid, changes_count: usize },
    /// Config plan rejected
    PlanRejected { plan_id: Uuid },
    /// Configuration updated (sync all views)
    ConfigUpdated { config: Config },
    /// MIDI Learn state changed
    MidiLearnStateChanged { state: MidiLearnState },
    /// LLM streaming token
    StreamingToken { token: String },
    /// LLM provider status changed
    ProviderStatusChanged { provider: String, status: ProviderStatus },
}

/// Event emitter for Tauri
pub struct TauriEventEmitter {
    app_handle: AppHandle,
}

impl TauriEventEmitter {
    pub fn emit(&self, event: LlmEvent) {
        let event_name = event.event_name();
        let payload = serde_json::to_value(&event).unwrap();
        self.app_handle.emit_all(&event_name, payload).ok();
    }
}
```
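Event names are derived from variant names into a kebab-case, `llm:`-prefixed form (e.g. `PlanReady` becomes `llm:plan-ready`), which is what the frontend subscribes to. A sketch of that derivation (the real `event_name()` may well be hand-written per variant rather than computed):

```rust
/// Derive the frontend-facing event name from a CamelCase variant name:
/// "PlanReady" -> "llm:plan-ready", "ConfigUpdated" -> "llm:config-updated".
fn event_name(variant: &str) -> String {
    let mut out = String::from("llm:");
    for (i, ch) in variant.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            if i > 0 {
                out.push('-'); // word boundary before every interior capital
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}
```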
**Frontend Event Handling**:
```svelte
<!-- ChatView.svelte -->
<script>
  import { listen } from '@tauri-apps/api/event';
  import { onMount, onDestroy } from 'svelte';

  let unlisten = [];

  onMount(async () => {
    // Listen for plan ready events
    unlisten.push(await listen('llm:plan-ready', (event) => {
      showPlanReviewModal(event.payload.plan);
    }));

    // Listen for config updates (sync all views)
    unlisten.push(await listen('llm:config-updated', (event) => {
      configStore.set(event.payload.config);
    }));

    // Listen for streaming tokens
    unlisten.push(await listen('llm:streaming-token', (event) => {
      chatStore.appendToLastMessage(event.payload.token);
    }));
  });

  onDestroy(() => {
    unlisten.forEach(fn => fn());
  });
</script>
```
#### 5. MCP Server for External LLM Access (Priority)

**Location**: `conductor-daemon/src/mcp/`
The daemon exposes an MCP (Model Context Protocol) server enabling external LLMs (Claude Code, Cursor, etc.) to control Conductor.
```rust
/// MCP Server configuration
pub struct McpServerConfig {
    pub enabled: bool,
    pub socket_path: PathBuf,       // Unix socket for local access
    pub http_port: Option<u16>,     // Optional HTTP for remote (with auth)
    pub auth_token: Option<SecretString>,
    pub allowed_tools: Vec<String>, // Tool allowlist
}

/// MCP Tool definitions exposed to external clients
pub fn get_mcp_tools() -> Vec<McpTool> {
    vec![
        McpTool {
            name: "conductor_get_status",
            description: "Get Conductor daemon status including device connection and lifecycle state",
            parameters: json!({}),
        },
        McpTool {
            name: "conductor_list_devices",
            description: "List available MIDI and gamepad devices",
            parameters: json!({}),
        },
        McpTool {
            name: "conductor_connect_device",
            description: "Connect to a specific MIDI device by port index",
            parameters: json!({
                "type": "object",
                "properties": {
                    "port": { "type": "integer", "description": "MIDI port index" }
                },
                "required": ["port"]
            }),
        },
        McpTool {
            name: "conductor_get_config",
            description: "Get the current Conductor configuration including modes and mappings",
            parameters: json!({}),
        },
        McpTool {
            name: "conductor_create_mapping",
            description: "Create a new mapping in the specified mode",
            parameters: json!({
                "type": "object",
                "properties": {
                    "mode": { "type": "string" },
                    "trigger": { "$ref": "#/definitions/Trigger" },
                    "action": { "$ref": "#/definitions/Action" }
                },
                "required": ["mode", "trigger", "action"]
            }),
        },
        // ... additional tools
    ]
}
```
**MCP Protocol Flow**:
```
External LLM (e.g., Claude Code)
         │
         │ MCP Initialize
         ▼
┌─────────────────────────────────────────┐
│         Conductor MCP Server            │
│  1. Authenticate (if configured)        │
│  2. Return tool manifest                │
│  3. Handle tool calls                   │
└─────────────────────────────────────────┘
         │
         │ Tool Results
         ▼
External LLM formulates response to user
```
#### 6. A2A (Agent-to-Agent) Protocol Support (Deferred to Phase 4+)

**LLM Council Recommendation**: Defer A2A implementation to focus on MCP first. MCP has broader adoption and Claude Code support. A2A can be added later when the protocol stabilizes.
For future agent interoperability beyond MCP, support the emerging A2A protocol:
```rust
/// A2A Agent Card (discovery)
pub struct AgentCard {
    pub name: String,        // "Conductor MIDI Mapper"
    pub description: String,
    pub url: String,         // Agent endpoint
    pub capabilities: Vec<Capability>,
    pub authentication: AuthMethod,
    pub skills: Vec<Skill>,
}

/// A2A Task execution
pub struct A2ATask {
    pub id: String,
    pub skill: String,
    pub input: serde_json::Value,
    pub context: Option<TaskContext>,
}

/// Skills exposed via A2A
pub fn get_a2a_skills() -> Vec<Skill> {
    vec![
        Skill {
            name: "configure_midi_mapping",
            description: "Configure a MIDI controller mapping from natural language",
            input_schema: json!({
                "type": "object",
                "properties": {
                    "description": {
                        "type": "string",
                        "description": "Natural language description of desired mapping"
                    }
                }
            }),
        },
        Skill {
            name: "capture_midi_events",
            description: "Start MIDI capture and return detected events",
            input_schema: json!({
                "type": "object",
                "properties": {
                    "duration_secs": { "type": "integer", "default": 30 }
                }
            }),
        },
    ]
}
```
#### 7. Agent Skills (SKILL.md Standard)

**Important Distinction**: Agent Skills and MCP serve complementary roles:

- **Skills** = Knowledge packages ("the brain and playbook") - teach *how* to do things
- **MCP Tools** = Capability interfaces ("the arms and legs") - enable *what* can be done

> "Skills without MCP are well-written instructions. MCP without Skills is raw power with no guidance." — Agent Skills vs MCP comparison

Conductor provides Agent Skills following the agentskills.io specification for cross-platform compatibility with Claude Code, VS Code Copilot, Cursor, Codex CLI, and other MCP-compatible clients.

**Location**: `~/.conductor/skills/` (user-installable) or bundled with Conductor
**Skill Directory Structure**

```
~/.conductor/skills/
├── conductor-midi-mapping/
│   ├── SKILL.md                  # Required: instructions + metadata
│   ├── references/
│   │   ├── TRIGGERS.md           # Detailed trigger type reference
│   │   ├── ACTIONS.md            # Detailed action type reference
│   │   └── EXAMPLES.md           # Common mapping patterns
│   └── scripts/
│       └── validate_mapping.py
├── conductor-midi-learn/
│   ├── SKILL.md
│   └── references/
│       └── PATTERNS.md           # Pattern detection guide
├── conductor-device-setup/
│   ├── SKILL.md
│   └── references/
│       └── DEVICES.md            # Supported devices
└── conductor-profile-manager/
    └── SKILL.md
```
**Primary Skill**: `conductor-midi-mapping/SKILL.md`

**LLM Council Guidance**: Skills should teach judgment and decision-making, not just procedures. Focus on mental models, decision frameworks, and failure modes.
```markdown
---
name: conductor-midi-mapping
description: >
  Create and manage MIDI controller mappings for Conductor. Use when the user
  wants to configure what happens when they press pads, turn knobs, or move
  faders on their MIDI controller. Handles triggers (Note, VelocityRange,
  LongPress, DoubleTap, Chord, Encoder, CC) and actions (Keystroke, Launch,
  Shell, SendMIDI, ModeChange, Sequence).
license: Apache-2.0
compatibility: Requires Conductor daemon running with MCP server enabled
metadata:
  author: amiable
  version: "4.11.0"
  category: midi
allowed-tools: Bash(conductor:*) Read Write
---

# MIDI Mapping Configuration

You help users create MIDI controller mappings for Conductor, translating natural
language descriptions into precise trigger/action configurations.

## Scope & Non-Goals

**This skill covers:**
- Creating, modifying, and deleting MIDI mappings
- Understanding trigger types and when to use each
- Selecting appropriate actions for user intent

**This skill does NOT cover:**
- OS-level MIDI driver configuration (direct user to system preferences)
- DAW-specific scripting (Ableton Live, Logic Pro internal scripting)
- Hardware firmware updates (NEVER attempt this)
- Raw SysEx message construction (requires explicit HardwareIO tier approval)

## Safety Policy

**CRITICAL RULES - Never violate these:**

1. **Never modify configuration without Plan/Apply** - Always generate a plan first
2. **Never assume MIDI note/CC numbers** - Ask or use MIDI Learn to discover
3. **Never send raw SysEx without explicit user confirmation** - Can brick hardware
4. **Never execute shell commands from user input without sanitization**

## Core Mental Model

MIDI mappings are **routing rules** with optional **transforms**. Think of them as:

Source (what hardware sends) → Transform (how to interpret) → Target (what to control)

### Decision Framework

When creating mappings, resolve these questions **in order**:

**1. Source Identification**
- What device? (May require `conductor_list_devices` if user is vague)
- What message type? (CC for continuous controls, Note for triggers/buttons)
- Is this a **relative encoder** or **absolute fader**? (Critical for transform choice)
  - Clue: "knob" often means encoder; "fader/slider" means absolute
  - When unsure: ASK the user or use MIDI Learn

**2. Trigger Selection**
- Simple press → `Note`
- Velocity-sensitive → `VelocityRange` (ask about soft/medium/hard thresholds)
- Hold behavior → `LongPress` (ask about duration if not specified)
- Quick double-press → `DoubleTap`
- Multiple simultaneous → `NoteChord`
- Continuous control → `CC` or `EncoderTurn`

**3. Action Selection**
- Keyboard shortcut → `Keystroke`
- Launch app → `Launch`
- Script execution → `Shell` (validate path exists!)
- Mode switching → `ModeChange`
- MIDI output → `SendMIDI`

## Common Pitfalls

| Pitfall | Why It Happens | How to Avoid |
|---------|----------------|--------------|
| Wrong CC number | Assumed instead of discovered | Use MIDI Learn or ask user |
| Encoder vs fader confusion | "Knob" is ambiguous | Ask: "Does it spin forever or have endpoints?" |
| Value range mismatch | 0-127 vs 0.0-1.0 | Check target application's expected range |
| Mapping conflict | Same trigger, different actions | Check existing mappings first |

## Execution Rules

**To implement changes, you MUST:**

1. Use `conductor_get_config` to read current state
2. Use `conductor_create_mapping` (returns Plan, not immediate change)
3. Present the Plan diff to user
4. Only proceed when user explicitly approves

**DO NOT:**
- Generate Python scripts to edit config files directly
- Output raw JSON and tell user to paste it
- Bypass Plan/Apply for "simple" changes

## Quick Reference Tables

### Trigger Types

| User Says | Trigger Type | Example |
|-----------|--------------|---------|
| "when I press pad 36" | Note | `{ type: "Note", note: 36 }` |
| "when I hit it hard" | VelocityRange | `{ type: "VelocityRange", note: 36, min_velocity: 100, max_velocity: 127 }` |
| "when I hold the button" | LongPress | `{ type: "LongPress", note: 36, duration_ms: 2000 }` |
| "when I double-tap" | DoubleTap | `{ type: "DoubleTap", note: 36, timeout_ms: 300 }` |
| "when I press multiple pads" | NoteChord | `{ type: "NoteChord", notes: [36, 37, 38] }` |
| "when I turn the knob" | EncoderTurn | `{ type: "EncoderTurn", cc: 16, direction: "any" }` |

### Action Types

| User Says | Action Type | Example |
|-----------|-------------|---------|
| "copy" / "Cmd+C" | Keystroke | `{ type: "Keystroke", keys: ["cmd", "c"] }` |
| "open Safari" | Launch | `{ type: "Launch", app: "Safari" }` |
| "run a script" | Shell | `{ type: "Shell", command: "~/scripts/foo.sh" }` |
| "switch to DJ mode" | ModeChange | `{ type: "ModeChange", mode: "DJ" }` |
| "send MIDI note" | SendMIDI | `{ type: "SendMIDI", message_type: "NoteOn", channel: 1, note: 60 }` |

## Error Recovery

**"No MIDI devices found"**
- Check: Is Conductor daemon running? (`conductor status`)
- Check: OS permissions for MIDI access
- Guide user to system MIDI preferences

**"Mapping conflict detected"**
- Present both mappings to user
- Ask which takes priority
- NEVER auto-resolve conflicts

**"Unknown note/CC number"**
- Suggest using MIDI Learn mode
- Guide user: "Press the control you want to map"

See [TRIGGERS.md](https://github.com/amiable-dev/conductor/blob/741b613135a07d04ba7f17c310c076329f7eef36/docs/adrs/references/TRIGGERS.md) for complete trigger documentation.
See [ACTIONS.md](https://github.com/amiable-dev/conductor/blob/741b613135a07d04ba7f17c310c076329f7eef36/docs/adrs/references/ACTIONS.md) for complete action documentation.
```
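The trigger table above implies a matching discipline the engine applies to incoming events: a plain `Note` fires on any velocity, while `VelocityRange` gates on a velocity band. A reduced sketch covering just those two trigger types (a local `Trigger` enum and a hypothetical `matches` helper; the real engine handles all variants):

```rust
/// Two of the trigger types from the quick-reference table.
#[derive(Debug)]
enum Trigger {
    Note { note: u8 },
    VelocityRange { note: u8, min_velocity: u8, max_velocity: u8 },
}

/// Does an incoming NoteOn (note, velocity) fire this trigger?
fn matches(trigger: &Trigger, note: u8, velocity: u8) -> bool {
    match trigger {
        // Plain Note ignores velocity entirely.
        Trigger::Note { note: n } => *n == note,
        // VelocityRange additionally gates on the inclusive velocity band.
        Trigger::VelocityRange { note: n, min_velocity, max_velocity } => {
            *n == note && (*min_velocity..=*max_velocity).contains(&velocity)
        }
    }
}
```

This is why "when I hit it hard" maps to `VelocityRange` with `min_velocity: 100`: the same pad can carry several velocity-layered mappings without conflict, since their bands are disjoint.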
**MIDI Learn Skill**: `conductor-midi-learn/SKILL.md`
---
name: conductor-midi-learn
description: >
Guide users through MIDI Learn mode to capture controller inputs and create
mappings. Use when the user wants to "learn" or "capture" what their controller
does, or when they don't know the MIDI note numbers for their pads/knobs.
Detects patterns like LongPress, DoubleTap, and Chords automatically.
license: Apache-2.0
compatibility: Requires Conductor daemon with MIDI device connected
metadata:
author: amiable
version: "4.11.0"
category: midi
---
# MIDI Learn Mode
Help users discover their controller's MIDI messages and create mappings from
captured events.
## Workflow
1. **Start Capture**: Use `conductor_start_midi_learn` tool
2. **Guide the User**: Ask them to press/turn the controls they want to map
3. **Analyze Events**: Use `conductor_get_captured_events` to see what was captured
4. **Detect Patterns**: Identify if events suggest LongPress, DoubleTap, or Chord triggers
5. **Suggest Mappings**: Propose mappings based on captured events
6. **Apply with Approval**: Use Plan/Apply pattern for user confirmation
## Pattern Detection
When analyzing captured events, look for:
| Pattern | Detection Criteria | Suggested Trigger |
|---------|-------------------|-------------------|
| LongPress | NoteOn duration > 500ms before NoteOff | `LongPress` with detected duration |
| DoubleTap | Same note twice within 400ms | `DoubleTap` with detected interval |
| Chord | Multiple notes within 50ms window | `NoteChord` with detected notes |
| Velocity Layers | Same note at different velocities | Multiple `VelocityRange` mappings |
## Example Session
User: "I want to set up my Launchpad"

Agent: Let me start MIDI Learn mode. Press the pads you want to configure.
[Uses `conductor_start_midi_learn`]

User: [Presses pads]

Agent: [Uses `conductor_get_captured_events`] I detected:
- Pad at note 36 (bottom-left)
- Pad at note 37 (next to it)

What would you like these to do?
See [PATTERNS.md](https://github.com/amiable-dev/conductor/blob/741b613135a07d04ba7f17c310c076329f7eef36/docs/adrs/references/PATTERNS.md) for advanced pattern detection.
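The detection criteria in the skill's pattern table can be sketched as simple window scans over captured events. A hypothetical TypeScript sketch, assuming a simplified `CapturedEvent` shape (the daemon's actual capture API may differ):

```typescript
// Simplified captured-event shape; names are assumptions for illustration.
interface CapturedEvent { kind: "NoteOn" | "NoteOff"; note: number; timeMs: number; velocity: number; }

// Chord: multiple distinct NoteOns inside a 50ms window.
function detectChord(events: CapturedEvent[], windowMs = 50): number[] | null {
  const ons = events.filter(e => e.kind === "NoteOn").sort((a, b) => a.timeMs - b.timeMs);
  for (const anchor of ons) {
    const group = ons.filter(e => e.timeMs >= anchor.timeMs && e.timeMs - anchor.timeMs <= windowMs);
    const notes = [...new Set(group.map(e => e.note))];
    if (notes.length > 1) return notes.sort((a, b) => a - b);
  }
  return null;
}

// DoubleTap: the same note struck twice within timeoutMs.
function detectDoubleTap(events: CapturedEvent[], timeoutMs = 400): number | null {
  const ons = events.filter(e => e.kind === "NoteOn").sort((a, b) => a.timeMs - b.timeMs);
  for (let i = 1; i < ons.length; i++) {
    if (ons[i].note === ons[i - 1].note && ons[i].timeMs - ons[i - 1].timeMs <= timeoutMs) {
      return ons[i].note;
    }
  }
  return null;
}
```

LongPress detection follows the same shape, pairing each NoteOn with its NoteOff and checking the gap against the 500ms threshold.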
### Progressive Disclosure Model

Following the Agent Skills specification, Conductor skills use three-level progressive disclosure:

```
Level 1: Metadata (~100 tokens)
├── name: "conductor-midi-mapping"
├── description: "Create and manage MIDI controller mappings..."
└── Loaded at startup for ALL skills

Level 2: Instructions (~2000-5000 tokens)
├── Full SKILL.md body
├── Workflow steps, quick reference tables
└── Loaded only when skill is activated

Level 3: Resources (as needed)
├── references/TRIGGERS.md - Complete trigger documentation
├── references/ACTIONS.md - Complete action documentation
└── Loaded only when agent needs specific details
```

This approach lets Conductor provide comprehensive documentation without overwhelming the context window. An agent can have access to all Conductor skills (~400 tokens of metadata total) while loading full instructions only when needed.
### Skills + MCP Integration

Skills and MCP work together in Conductor:

```
┌─────────────────────────────────────────────────────────────┐
│ User Request                                                │
│ "Set up pad 36 to copy when I tap it, paste when I hold"    │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│ Agent Skills (Knowledge Layer)                              │
│ conductor-midi-mapping SKILL.md provides:                   │
│ - Understanding that this needs two triggers                │
│   (Note, LongPress)                                         │
│ - Knowledge of Keystroke action format                      │
│ - Workflow: parse → plan → apply                            │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│ MCP Tools (Capability Layer)                                │
│ conductor_create_mapping executes:                          │
│ - Validates trigger/action schemas                          │
│ - Generates ConfigPlan for review                           │
│ - Applies changes after user approval                       │
└─────────────────────────────────────────────────────────────┘
```
### Skill Installation

Users can install Conductor skills via:

```shell
# From Conductor's bundled skills
conductor skills install conductor-midi-mapping

# From GitHub
conductor skills install github:amiable/conductor-skills/midi-mapping

# From local directory
conductor skills install ./my-custom-skill
```

Skills are installed to `~/.conductor/skills/` and automatically discovered by compatible agents.
### 8. GUI Chat Interface

**Location**: `conductor-gui/ui/src/lib/views/ChatView.svelte`

```svelte
<script>
  import { llmStore, chatStore } from '../stores/llm.js';
  import { configStore } from '../stores.js';
  // UI components used below (paths illustrative)
  import MessageBubble from '../components/MessageBubble.svelte';
  import ToolResultCard from '../components/ToolResultCard.svelte';
  import StreamingIndicator from '../components/StreamingIndicator.svelte';

  let inputMessage = '';
  let isStreaming = false;

  async function sendMessage() {
    if (!inputMessage.trim()) return;

    // Add user message
    chatStore.addMessage({ role: 'user', content: inputMessage });
    const userInput = inputMessage;
    inputMessage = '';

    // Stream LLM response
    isStreaming = true;
    try {
      await llmStore.chat(userInput, {
        onToken: (token) => chatStore.appendToLastMessage(token),
        onToolCall: (tool, result) => chatStore.addToolResult(tool, result),
        onComplete: () => (isStreaming = false),
      });
    } catch (error) {
      chatStore.addMessage({ role: 'system', content: `Error: ${error.message}` });
      isStreaming = false;
    }
  }

  function handleKeydown(e) {
    // Enter sends; Shift+Enter inserts a newline
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      sendMessage();
    }
  }
</script>

<div class="chat-container">
  <div class="messages">
    {#each $chatStore.messages as message}
      <div class="message {message.role}">
        {#if message.role === 'tool'}
          <ToolResultCard tool={message.tool} result={message.result} />
        {:else}
          <MessageBubble {message} />
        {/if}
      </div>
    {/each}
    {#if isStreaming}
      <StreamingIndicator />
    {/if}
  </div>
  <div class="input-area">
    <textarea
      bind:value={inputMessage}
      placeholder="Describe what you want to configure..."
      on:keydown={handleKeydown}
    />
    <button on:click={sendMessage} disabled={isStreaming}>
      Send
    </button>
  </div>
</div>
```
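Behind `llmStore.chat`, the streaming path normalizes provider events into the per-token callbacks above. A minimal, hypothetical sketch of SSE chunk handling (the `data: {"token": ...}` / `data: [DONE]` wire format is an assumption for illustration; real provider formats differ per vendor):

```typescript
// Parse one SSE chunk, invoking onToken for each token line.
// Returns true once the stream terminator is seen.
function parseSseChunk(chunk: string, onToken: (t: string) => void): boolean {
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue; // ignore comments/keepalives
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") return true;
    const parsed = JSON.parse(payload);
    if (typeof parsed.token === "string") onToken(parsed.token);
  }
  return false;
}
```

Graceful degradation (mentioned under Phase 3) would fall back to a single non-streaming completion when a provider reports no streaming capability.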
## Security Considerations
| Concern | Mitigation |
|---|---|
| API Key Storage | Encrypted in OS keychain (keytar), never in plain config |
| Shell Command Injection | LLM cannot execute arbitrary shell commands; only predefined actions |
| MCP Authentication | Optional token-based auth for remote access |
| Tool Allowlisting | Admin can restrict which tools are available to LLM |
| Rate Limiting | Per-provider rate limits to prevent API abuse |
| Config Validation | All LLM-generated configs validated before application |
| Audit Logging | All LLM actions logged with timestamps for review |
## Configuration Schema

```toml
# ~/.conductor/config.toml
[llm]
enabled = true
default_provider = "anthropic"

[llm.providers.anthropic]
api_key_env = "ANTHROPIC_API_KEY"  # Reference env var
model = "claude-3-5-sonnet-20241022"
max_tokens = 4096
temperature = 0.7

[llm.providers.openai]
api_key_env = "OPENAI_API_KEY"
model = "gpt-4o"
max_tokens = 4096

[llm.providers.openrouter]
api_key_env = "OPENROUTER_API_KEY"
base_url = "https://openrouter.ai/api/v1"
model = "anthropic/claude-3.5-sonnet"

[llm.providers.litellm]
base_url = "http://localhost:8000"
model = "gpt-4"

[llm.mcp]
enabled = true
socket_path = "~/.conductor/mcp.sock"
http_enabled = false
# http_port = 8080
# auth_token_env = "CONDUCTOR_MCP_TOKEN"

[llm.a2a]
enabled = true
agent_name = "Conductor MIDI Mapper"
```
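A consequence of the `api_key_env` convention above is that keys are resolved from the environment at runtime and never persisted in the config file. A hypothetical TypeScript sketch of that resolution (the `ProviderConfig` shape and function name are illustrative, not Conductor's actual code):

```typescript
// Subset of a provider entry from config.toml, as illustrated above.
interface ProviderConfig { api_key_env?: string; base_url?: string; model: string; }

// Resolve the API key named by api_key_env, or null when no key is
// required (e.g. a local LiteLLM proxy).
function resolveApiKey(cfg: ProviderConfig, env: Record<string, string | undefined>): string | null {
  if (!cfg.api_key_env) return null;
  const key = env[cfg.api_key_env];
  if (!key) throw new Error(`environment variable ${cfg.api_key_env} is not set`);
  return key;
}
```

Failing loudly on a missing variable surfaces misconfiguration at startup instead of as an opaque 401 from the provider.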
## Consequences

### Positive
- Natural Language Configuration: Users can describe desired behavior in plain English
- Reduced Learning Curve: No need to understand trigger/action schemas upfront
- External Automation: Claude Code, Cursor, and other tools can configure Conductor
- Provider Flexibility: Users choose their preferred LLM (cost, privacy, capability)
- Extensibility: MCP/A2A enable future integrations without code changes
- Interactive Learning: LLM can guide users through MIDI Learn process
### Negative
- API Costs: LLM API calls incur costs (mitigated by local LiteLLM option)
- Latency: LLM responses add latency vs. direct configuration
- Complexity: Additional infrastructure (MCP server, provider abstraction)
- Security Surface: New attack vectors through LLM tool execution
### Neutral
- Optional Feature: Users can disable LLM integration entirely
- Backward Compatible: Existing config editing remains fully functional
## Alternatives Considered

### Alternative 1: Single Provider (Anthropic Only)

**Rejected**: Limits user choice, creates vendor lock-in, no offline option

### Alternative 2: LLM in Daemon (Rust)

**Rejected**: Rust LLM libraries are less mature than the JS/TS ecosystem; the Tauri GUI already has a web runtime

### Alternative 3: External Chat Application

**Rejected**: Poor UX: users would need to context-switch between apps

### Alternative 4: REST API Instead of MCP

**Rejected**: MCP is the emerging standard for LLM tool integration; REST would require custom client code
## Implementation Plan

### Phase 1A: Skills Foundation (v4.11.0) ✅ COMPLETE

- Agent Skills: Bundled SKILL.md files for core workflows
  - `conductor-midi-mapping` - Mapping creation skill (judgment-based)
  - `conductor-midi-learn` - MIDI Learn guidance skill
  - `conductor-device-setup` - Device connection skill
- Skills validation tooling (`conductor skills validate`)
- Cross-platform testing (Claude Code, Cursor, VS Code Copilot)
- Chat UI component (skills-only mode)

### Phase 1B: ReadOnly MCP Tools (v4.11.1) ✅ COMPLETE

**Rationale**: "Skills without Tools are knowledge without hands." LLMs need to read current state for skill instructions to be useful and grounded.

- MCP Server with ReadOnly tools only:
  - `conductor_get_status` - Daemon status
  - `conductor_list_devices` - Available devices
  - `conductor_get_config` - Current configuration
  - `conductor_list_mappings` - Mappings by mode
  - `conductor_get_mapping` - Single mapping details
- LLM provider abstraction layer with capability negotiation
- OpenAI and Anthropic provider implementations
- API key management (keychain storage)
### Phase 2: Mutations & Plan/Apply (v4.12.0) ✅ COMPLETE

- Transport-agnostic ToolExecutor with risk tiers
- Plan/Apply pattern with TOCTOU protection (`base_state_hash`)
- Plan expiration (5-minute TTL)
- ConfigChange tools:
  - `conductor_create_mapping` (returns Plan)
  - `conductor_update_mapping` (returns Plan)
  - `conductor_delete_mapping` (returns Plan)
- Stateful tools:
  - `conductor_start_midi_learn`
  - `conductor_stop_midi_learn`
- Tauri event system for state synchronization
- Chat UI with plan review modal
- Agentic tool loop in chat store
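The Plan/Apply TOCTOU guard reduces to a hash-plus-TTL check: a plan records a hash of the config state it was generated against, and apply refuses if the live state has drifted or the plan has expired. A hypothetical sketch mirroring `base_state_hash` and the 5-minute TTL (shapes and field names are illustrative, not Conductor's actual Rust types):

```typescript
import { createHash } from "node:crypto";

// Illustrative plan shape: hash of the base config plus expiry bookkeeping.
interface ConfigPlan { id: string; baseStateHash: string; createdAtMs: number; ttlMs: number; }

function hashState(configJson: string): string {
  return createHash("sha256").update(configJson).digest("hex");
}

// Apply-time gate: reject expired plans and plans whose base state has drifted.
function canApply(plan: ConfigPlan, currentConfigJson: string, nowMs: number): { ok: boolean; reason?: string } {
  if (nowMs - plan.createdAtMs > plan.ttlMs) return { ok: false, reason: "plan expired" };
  if (hashState(currentConfigJson) !== plan.baseStateHash) {
    return { ok: false, reason: "stale plan: config changed since planning" };
  }
  return { ok: true };
}
```

Because the hash covers the whole serialized config, any concurrent edit (GUI, TOML file, another MCP client) invalidates outstanding plans rather than silently clobbering it.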
### Phase 3: Enhanced Features (v4.13.0-v4.15.0) ✅ COMPLETE

- OpenRouter and LiteLLM support (v4.13.0)
- Streaming responses with graceful degradation (v4.13.0)
- Tool result visualization in chat (v4.13.0)
- Google Gemini support (v4.13.0)
- Batch operations support (v4.13.0)
- Conversation history persistence (SQLite) (v4.15.0 - P3-05)
  - Conversations persist across app restarts
  - History sidebar with conversation list
  - Load, delete, and create new conversations
  - Messages persisted with tool calls
- Cost tracking display (v4.15.0 - P3-06)
  - Total usage cost summary
  - Per-conversation cost display
  - Breakdown by provider and model
  - Token usage statistics
### Phase 4: Advanced MCP & Security (v4.14.0) ✅ COMPLETE

- HardwareIO tier tools (SysEx with multi-step confirmation)
- Audit logging for all tool executions (SQLite `conductor.db`)
- Rate limiting per provider/client (token bucket)
- Undo/redo support for config changes (history stack)
- Atomic rollback on failed applies (transaction wrapper)
- Remote MCP access with authentication (OAuth2) - deferred
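The per-provider token-bucket rate limiting above can be sketched as follows (capacity and refill rate are placeholder numbers, not Conductor's defaults):

```typescript
// Classic token bucket: refill continuously at refillPerSec up to
// capacity; each request consumes one token or is rejected.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(private capacity: number, private refillPerSec: number, nowMs: number) {
    this.tokens = capacity;
    this.lastRefillMs = nowMs;
  }

  tryAcquire(nowMs: number): boolean {
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

One bucket per provider/client pair lets bursts through up to capacity while bounding sustained request rate; a rejected acquire maps to a "rate limited, retry later" tool error rather than a silent drop.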
### Phase 5: Polish & Future (v4.16.0+)
- Multi-turn conversation context optimization
- Suggestion chips for common actions
- A2A protocol support (deferred)
- User-provided skills with sandboxing
- Voice input (optional)
- Offline mode with local models
## TDD Test Plan

### Unit Tests

- `test_provider_trait_implementation` - Each provider implements trait correctly
- `test_tool_parameter_validation` - Invalid tool params rejected
- `test_tool_execution_success` - Tools modify config correctly
- `test_tool_execution_rollback` - Failed tools don't corrupt state
- `test_api_key_encryption` - Keys stored encrypted
- `test_mcp_tool_manifest` - MCP tools match internal tools
- `test_tool_risk_tier_classification` - All tools have correct risk tier
- `test_plan_expiry` - Plans auto-expire after timeout

### Agent Skills Tests

- `test_skill_md_frontmatter_valid` - All SKILL.md files have valid YAML frontmatter
- `test_skill_md_name_matches_directory` - Skill name matches parent directory
- `test_skill_references_exist` - All referenced files in skills exist
- `test_skill_progressive_disclosure` - Metadata < 200 tokens, full < 5000 tokens
- `test_skill_cross_platform_compat` - Skills validate against agentskills.io schema

### Integration Tests

- `test_chat_roundtrip` - User message → LLM → tool → response
- `test_midi_learn_with_llm` - LLM coordinates capture and mapping creation
- `test_mcp_external_client` - External tool calls execute correctly
- `test_provider_failover` - Graceful handling of provider errors
- `test_skill_activation_loads_context` - Activating skill loads SKILL.md body
- `test_skill_mcp_tool_coordination` - Skills guide MCP tool usage correctly

### E2E Tests

- `test_full_mapping_workflow` - "Create a mapping for pad 36 to copy" → working mapping
- `test_multi_turn_conversation` - Context maintained across messages
- `test_skill_guided_midi_learn` - Skill guides user through MIDI Learn workflow
## Requirements Traceability
| Requirement | Component | Test Coverage |
|---|---|---|
| R1: Chat Interface | ChatView.svelte, llm/chat.rs | test_chat_roundtrip |
| R2: LLM Config Management | llm/tools.rs, ConductorTool enum | test_tool_execution_* |
| R3: Multi-Provider Support | llm/providers/*.rs | test_provider_trait_* |
| R4: MCP/Skills | mcp/server.rs, skills/*.md | test_mcp_*, test_skill_* |
| R5: Security Boundaries | tool allowlisting, Plan/Apply | test_tool_parameter_validation, test_plan_* |
| R6: Cross-Platform Skills | ~/.conductor/skills/ | test_skill_cross_platform_compat |
## Open Questions

- **Conversation Persistence**: Should chat history persist across sessions? If so, where?
  - RESOLVED (v4.15.0): Yes, a SQLite database stores conversations with messages and tool calls
- **Multi-User**: If MCP is exposed over HTTP, how to handle multiple concurrent users?
- **Model Selection UI**: Should users be able to switch models mid-conversation?
- **Cost Tracking**: Should we display estimated API costs to users?
  - RESOLVED (v4.15.0): Yes, CostSummaryPanel displays total, per-conversation, and breakdown by provider/model
## References

### Agent Skills
- Agent Skills Specification - Official SKILL.md format specification
- Anthropic Skills Repository - Official example skills
- Skills vs MCP Comparison - When to use which
- Claude Agent Skills Deep Dive - Technical analysis
- VS Code Agent Skills Guide - Cross-platform usage
### MCP (Model Context Protocol)
- MCP Specification - Official protocol specification
- MCP Best Practices - Architecture & implementation guide
- MCP Security Considerations - Security risks and controls
### Other
- A2A Protocol Draft - Agent-to-Agent protocol (deferred)
- OpenRouter API - Multi-model router
- LiteLLM Proxy - Self-hosted LLM proxy
- Tauri Security - GUI security model
## LLM Council Review

### Review 1: Initial Architecture (2026-01-31)

**Verdict**: APPROVED with Critical Architectural Modifications
**Consensus**: High (CSS: 0.92)

The LLM Council approved ADR-007 as a comprehensive and well-structured architecture for LLM integration. Critical modifications applied:
- Plan/Apply Pattern - Config changes require user approval via diff preview
- Tool Risk Tiers - Four-tier classification (ReadOnly, Stateful, ConfigChange, Privileged)
- Transport-Agnostic Tool Executor - Unified logic for GUI and MCP paths
- State Synchronization - Tauri Events for real-time propagation
- Capability Negotiation - Runtime detection with graceful degradation
- A2A Deferral - Focus on MCP first
### Review 2: Agent Skills Integration (2026-02-01)

**Verdict**: APPROVED with Critical Modifications
**Consensus**: High

After adding Agent Skills (SKILL.md per the agentskills.io spec), the council provided additional critical feedback:

#### Critical Modifications Applied
1. **TOCTOU Vulnerability Fix** (Critical)
   - Issue: Plans could become stale if config changes between plan and apply
   - Resolution: Added `base_state_hash` to ConfigPlan; apply verifies the hash matches current state
2. **Hardware I/O Risk Tier** (Critical)
   - Issue: SysEx messages can brick MIDI hardware but weren't specially protected
   - Resolution: Added `HardwareIO` tier with multi-step confirmation for dangerous operations
3. **SKILL.md Content Style** (Critical)
   - Issue: Skills were procedural ("Step 1, 2, 3") instead of judgment-based
   - Resolution: Rewrote skills to focus on mental models, decision frameworks, and failure modes
4. **Skills Must Include Safety Policy** (Important)
   - Issue: No explicit rules about what skills cannot do
   - Resolution: Added Scope/Non-Goals and Safety Policy sections to all skills
5. **Implementation Phasing Correction** (Important)
   - Issue: Skills in Phase 1 without Tools means "knowledge without hands"
   - Resolution: Added Phase 1B with ReadOnly MCP tools so the LLM can see current state
6. **Skills-to-Tools Execution Guidance** (Important)
   - Issue: LLM might try to edit files directly instead of using MCP tools
   - Resolution: Added explicit "Execution Rules" section: MUST use MCP tools, DO NOT generate scripts
7. **Treat Skills as Untrusted** (Important)
   - Issue: Malicious SKILL.md could try to bypass safety controls
   - Resolution: Runtime (MCP layer) enforces all boundaries; skills are guidance only
### Council Quality Metrics (Review 2)
| Metric | Score | Notes |
|---|---|---|
| Consensus Strength (CSS) | 0.88 | Strong agreement on security fixes |
| Deliberation Depth (DDI) | 0.91 | Thorough analysis of Skills+MCP interaction |
| Synthesis Attribution (SAS) | 0.93 | All modifications traceable |
### Model Rankings
| Model | Score | Key Contribution |
|---|---|---|
| anthropic/claude-opus-4.5 | 0.833 | SKILL.md rewrite guidance, trust boundaries |
| openai/gpt-5.2 | 0.333 | TOCTOU vulnerability identification |
| x-ai/grok-4.1-fast | 0.333 | Skill-MCP contract validation |
| google/gemini-3-pro-preview | 0.333 | Hardware I/O risk tier |
### Dissenting Views (Review 2)
Minor dissent on skill content style: One council member suggested keeping some procedural content for "quick reference" alongside judgment-based guidance. The compromise was to include Quick Reference Tables at the end of skills after the decision frameworks.
### Verification Requirements (Combined)
- Security Review: Plan/Apply with TOCTOU protection must be penetration tested
- UX Testing: Plan review modal clarity with non-technical users
- Integration Tests: ToolExecutor 100% coverage including HardwareIO tier
- Skills Validation: All SKILL.md files must pass agentskills.io schema validation
- Platform Test Matrix: Validate skills across Claude Code, Cursor, VS Code Copilot, Codex CLI
- Token Budget Testing: Verify skills stay under recommended token limits